英文:
insert huge data via select statement in SQL
问题
我正在通过选择语句将数据插入到一个表中,但是那个选择语句有大约2500万条数据,所以花费了5-6个小时的时间并且失败了。所以我们能分批次插入相同的数据吗?
请提出建议。
我不能更改查询,我需要插入整个数据。
英文:
I am inserting the data into one table via select statement , But that select statement has huge data like 25milion , so its taking long time 5-6 hours and failing , So Can we insert the same data like batch wise ?
Kindly suggest.
I can not change the query, I need to insert entire data.
答案1
得分: 1
-
首先,您必须确定它花费了多少时间。如果您正在进行任何表连接,它很可能是在“SELECT”部分。使用“v$active_session_history”数据或某个使用ASH的性能可视化工具来确定哪个计划步骤花费了最多的时间,然后通过修改或暗示您的查询来诊断和解决问题。这是一个大课题,不是在这个论坛中可以进一步解释的事情。您可能需要向您的数据库管理员寻求帮助。
-
如果时间全部用于插入加载步骤本身,请检查索引的数量。如果有很多索引,您可能希望在加载时禁用它们,然后在加载完成后重建它们。
-
确保您正在使用并行处理(假设您有足够的CPU,并且已经询问了您的DBA哪种并行度适合您的系统),并且对于插入部分,您已经启用了并行DML:
alter session enable parallel dml;
insert /*+ append parallel(8) */ into tbl SELECT ...
- 确保您的进程实际上没有被阻塞并且没有在做无用的事情。再次,"v$active_session_history"将告诉您主要的等待事件,如果它是一个并发类等待(enqueue、pin、由于表空间已满而挂起等等),那么您需要解决阻塞问题。
我相信有一个简单的解决方案。我经常看到Oracle世界中的2500万行的INSERT SELECT只需大约十几秒的时间。在Oracle世界中,这实际上不是一个非常大的数字。
英文:
Tuning is a huge subject, and requires a lot of data and knowledge of your environment, so we can only give guidelines here.
-
You must first determine where it is spending it's time. It could very well be in the
SELECT
portion, if you are doing any table joins. Usev$active_session_history
data or some performance visualization tool that uses ASH to determine what plan step is taking the most amount of time, and diagnose and address the issue surgically by modified or hinting your query. This is a big topic and not something that can be explained further in this forum. You may need to ask your DBA for help with this. -
If the time is all on the insert load step itself, check to see how many indexes there are. If there are a lot of indexes, you may want to disable them for the load, then rebuild them when the load is complete.
-
Make sure you are using parallelism (assuming you have enough CPUs and you've asked your DBA what degree of parallelism is appropriate for your system), and for the insert piece, that you have enabled parallel DML:
alter session enable parallel dml; insert /*+ append parallel(8) */ into tbl SELECT ...
-
Make sure your process isn't actually being blocked and doing nothing. Again,
v$active_session_history
will tell you the predominate wait event and if it's a concurrency class wait (enqueue, pin, suspended due to tablespace full, etc..) then you'll need to resolve the blocking issue.
I'm sure there's a simple solution. I see INSERT SELECT
s of 25 million rows all the time that take only a dozen seconds or so. That's really not a very large number in the Oracle world.
答案2
得分: 0
这是如何并行批处理处理某些内容的指示:https://oracle-base.com/articles/11g/dbms_parallel_execute_11gR2
https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_PARALLEL_EXECUTE.html
英文:
Here's an indication of how to process something in batches in parallel: https://oracle-base.com/articles/11g/dbms_parallel_execute_11gR2
https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_PARALLEL_EXECUTE.html
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论