英文:
Bulk Update - Oracle Sequence
问题
我有一个包含1.5亿条记录的表格,我想要从一个序列更新序列列。
更新的最快方式是什么?
我已经使用了并行,但花了好几个小时还没有结束。
UPDATE /+ parallel (c, 50)/ rpm_future_retail_tmp c
SET future_retail_id = rpm_future_retail_seq.NEXTVAL;
英文:
I have a table with 150 million records
I want to update the sequence column from a sequence.
What is the fastest way to update?
I have used parallel, but it is taking hours and not ending
UPDATE /+ parallel (c, 50)/ rpm_future_retail_tmp c
SET future_retail_id = rpm_future_retail_seq.NEXTVAL;
what is the faster way?
答案1
得分: 1
创建新表通常比应用 DML 更快,尤其比 UPDATE 语句更快。您可以使用以下作为替代方法:
CREATE TABLE rpm_future_retail_tmp_ parallel 8 nologging AS
SELECT rpm_future_retail_seq.NEXTVAL AS future_retail_id,
<以及除 future_retail_id 之外的逗号分隔列>
FROM rpm_future_retail_tmp;
DROP TABLE rpm_future_retail_tmp;
ALTER TABLE rpm_future_retail_tmp_ RENAME TO rpm_future_retail_tmp;
其中:
- 并行度的程度可能根据您的 DBMS 源而异。
- 用于以后重现表(
rpm_future_retail_tmp
)的权限授予和索引的语句应该在删除表之前保存到一个地方。
英文:
Mostly creating a new table is faster than applying a DML, especially faster than an UPDATE statement. You can use as an alternative :
CREATE TABLE rpm_future_retail_tmp_ parallel 8 nologging AS
SELECT rpm_future_retail_seq.NEXTVAL AS future_retail_id,
<and the comma-separated columns other than future_retail_id>
FROM rpm_future_retail_tmp;
DROP TABLE rpm_future_retail_tmp;
ALTER TABLE rpm_future_retail_tmp_ RENAME TO rpm_future_retail_tmp;
where
- the degree of parallelism might vary depending on your DBMS's source
- the statements to reproduce later the privileges(grants) and indexes should
be saved to a place for the table(rpm_future_retail_tmp
) before
dropping it
答案2
得分: 1
如果序列从 1 开始,那么您可能还要考虑使用 ROWNUM
值来更新列。
这是我的 21cXE 数据库,运行在 MS Windows 10 上,Intel i5,8GB RAM。我有一个大约有100万行的表(不想创建一个比这大150倍的表):
SQL> select count(*) from rpm;
COUNT(*)
----------
1033616
Elapsed: 00:00:00.02
使用序列更新它需要约12秒:
SQL> update rpm set id = seq.nextval;
1033616 行已更新。
Elapsed: 00:00:12.98
SQL> update rpm set id = seq.nextval;
1033616 行已更新。
Elapsed: 00:00:12.39
SQL> update rpm set id = seq.nextval;
1033616 行已更新。
Elapsed: 00:00:10.56
让我们尝试 rownum
;它需要更少的时间(平均 3 次运行约为4秒):
SQL> update rpm set id = rownum;
1033616 行已更新。
Elapsed: 00:00:07.51
SQL> update rpm set id = rownum;
1033616 行已更新。
Elapsed: 00:00:02.89
SQL> update rpm set id = rownum;
1033616 行已更新。
Elapsed: 00:00:02.87
SQL>
我理解您的系统与我的系统不同,并且时间 取决于 各种因素,但我想尝试另一种方法应该不会有害。
对于将来通过数据库触发器插入到 ID 列中,只需(重新)创建序列:
SQL> select max(id) from rpm;
MAX(ID)
----------
1033616
SQL> drop sequence seq;
序列已删除。
SQL> create sequence seq start with 1033617;
序列已创建。
SQL>
英文:
If sequence starts from 1, then you might also consider updating column with a ROWNUM
value instead.
This is my 21cXE database, running on MS Windows 10, Intel i5, 8GB RAM. I have a table with ~1 million rows (don't feel like creating one 150 times larger):
SQL> select count(*) from rpm;
COUNT(*)
----------
1033616
Elapsed: 00:00:00.02
Updating it with a sequence takes ~12 seconds:
SQL> update rpm set id = seq.nextval;
1033616 rows updated.
Elapsed: 00:00:12.98
SQL> update rpm set id = seq.nextval;
1033616 rows updated.
Elapsed: 00:00:12.39
SQL> update rpm set id = seq.nextval;
1033616 rows updated.
Elapsed: 00:00:10.56
Let's try rownum
; it takes less time (average of 3 runs is ~4 seconds):
SQL> update rpm set id = rownum;
1033616 rows updated.
Elapsed: 00:00:07.51
SQL> update rpm set id = rownum;
1033616 rows updated.
Elapsed: 00:00:02.89
SQL> update rpm set id = rownum;
1033616 rows updated.
Elapsed: 00:00:02.87
SQL>
I understand that your system is different from mine and timings depend on various things, but I guess it can't harm if you try another approach.
For future inserts into the ID column (via database trigger?), just (re)create the sequence:
SQL> select max(id) from rpm;
MAX(ID)
----------
1033616
SQL> drop sequence seq;
Sequence dropped.
SQL> create sequence seq start with 1033617;
Sequence created.
SQL>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论