在SQL中通过SELECT语句插入大量数据。

huangapple go评论64阅读模式
英文:

insert huge data via select statement in SQL

问题

我正在通过选择语句将数据插入到一个表中,但是那个选择语句有大约2500万条数据,所以花费了5-6个小时的时间并且失败了。所以我们能分批次插入相同的数据吗?

请提出建议。

我不能更改查询,我需要插入整个数据。

英文:

I am inserting the data into one table via select statement , But that select statement has huge data like 25milion , so its taking long time 5-6 hours and failing , So Can we insert the same data like batch wise ?

Kindly suggest.

I can not change the query, I need to insert entire data.

答案1

得分: 1

  1. 首先,您必须确定它花费了多少时间。如果您正在进行任何表连接,它很可能是在“SELECT”部分。使用“v$active_session_history”数据或某个使用ASH的性能可视化工具来确定哪个计划步骤花费了最多的时间,然后通过修改或暗示您的查询来诊断和解决问题。这是一个大课题,不是在这个论坛中可以进一步解释的事情。您可能需要向您的数据库管理员寻求帮助。

  2. 如果时间全部用于插入加载步骤本身,请检查索引的数量。如果有很多索引,您可能希望在加载时禁用它们,然后在加载完成后重建它们。

  3. 确保您正在使用并行处理(假设您有足够的CPU,并且已经询问了您的DBA哪种并行度适合您的系统),并且对于插入部分,您已经启用了并行DML:

alter session enable parallel dml;
insert /*+ append parallel(8) */ into tbl SELECT ...
  1. 确保您的进程实际上没有被阻塞并且没有在做无用的事情。再次,"v$active_session_history"将告诉您主要的等待事件,如果它是一个并发类等待(enqueue、pin、由于表空间已满而挂起等等),那么您需要解决阻塞问题。

我相信有一个简单的解决方案。我经常看到Oracle世界中的2500万行的INSERT SELECT只需大约十几秒的时间。在Oracle世界中,这实际上不是一个非常大的数字。

英文:

Tuning is a huge subject, and requires a lot of data and knowledge of your environment, so we can only give guidelines here.

  1. You must first determine where it is spending it's time. It could very well be in the SELECT portion, if you are doing any table joins. Use v$active_session_history data or some performance visualization tool that uses ASH to determine what plan step is taking the most amount of time, and diagnose and address the issue surgically by modified or hinting your query. This is a big topic and not something that can be explained further in this forum. You may need to ask your DBA for help with this.

  2. If the time is all on the insert load step itself, check to see how many indexes there are. If there are a lot of indexes, you may want to disable them for the load, then rebuild them when the load is complete.

  3. Make sure you are using parallelism (assuming you have enough CPUs and you've asked your DBA what degree of parallelism is appropriate for your system), and for the insert piece, that you have enabled parallel DML:

    alter session enable parallel dml;
    insert /*+ append parallel(8) */ into tbl SELECT ...
    
  4. Make sure your process isn't actually being blocked and doing nothing. Again, v$active_session_history will tell you the predominate wait event and if it's a concurrency class wait (enqueue, pin, suspended due to tablespace full, etc..) then you'll need to resolve the blocking issue.

I'm sure there's a simple solution. I see INSERT SELECTs of 25 million rows all the time that take only a dozen seconds or so. That's really not a very large number in the Oracle world.

答案2

得分: 0

这是如何并行批处理处理某些内容的指示:https://oracle-base.com/articles/11g/dbms_parallel_execute_11gR2

https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_PARALLEL_EXECUTE.html

英文:

Here's an indication of how to process something in batches in parallel: https://oracle-base.com/articles/11g/dbms_parallel_execute_11gR2

https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_PARALLEL_EXECUTE.html

huangapple
  • 本文由 发表于 2023年5月22日 17:49:30
  • 转载请务必保留本文链接:https://go.coder-hub.com/76304935.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定