英文:
Updating via cursor is taking a lot of time
问题
我正在执行地址关联操作,通过获取addr_id并将其更新到两个表格中(相同表格和另一个表格),分别用于不同的列。然而,即使我使用了并行提示并将并行度设置为500,它仍然需要很长时间。你能在这里提供帮助吗?以下是该存储过程。
PROCEDURE
ADD_ASOC AS
BEGIN
DECLARE
CURSOR C1 IS
SELECT /*+ PARALLEL(500) */ ADDR_ID FROM TEMP_ADDR WHERE BATCH_RANGE BETWEEN 100 AND 900;
TYPE CURSOR_ATT IS TABLE OF C1%ROWTYPE;
L_CURSOR CURSOR_ATT;
BEGIN
OPEN C1;
LOOP
FETCH C1 BULK COLLECT INTO L_CURSOR LIMIT 5000;
EXIT WHEN L_CURSOR.COUNT = 0;
FOR I IN 1 .. L_CURSOR.COUNT LOOP
UPDATE /*+ PARALLEL(500) */ MAIN_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID;
UPDATE /*+ PARALLEL(500) */ TEMP_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;
COMMIT;
END LOOP;
END LOOP;
COMMIT;
CLOSE C1;
END;
COMMIT;
我尝试使用并行度设置为500来执行,但没有成功。
英文:
I'm performing address association by taking the addr_id and updating it in two tables (same table and another table) for different columns. However it is taking huge time even though I use parallel hint with degree of paralleism as 500. Could you assist here ? Below is the procedure.
PROCEDURE
ADD_ASOC AS
BEGIN
DECLARE
CURSOR C1 IS
SELECT /*+ PARALLEL(500) */ ADDR_ID FROM TEMP_ADDR WHERE BATCH_RANGE BETWEEN 100 AND 900;
TYPE CURSOR_ATT IS TABLE OF C1%ROWTYPE;
L_CURSOR CURSOR_ATT;
BEGIN
OPEN C1;
LOOP
FETCH C1 BULK COLLECT INTO L_CURSOR LIMIT 5000;
EXIT WHEN L_CURSOR.COUNT = 0;
FOR I IN 1 .. L_CURSOR.COUNT LOOP
UPDATE /*+ PARALLEL(500) */ MAIN_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID;
UPDATE /*+ PARALLEL(500) */ TEMP_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;
COMMIT;
END LOOP;
END LOOP;
COMMIT;
CLOSE C1;
END;
COMMIT;
I executed using degree of paralleism as 500 but in vain.
答案1
得分: 3
你似乎不需要游标或循环,可以使用MERGE
语句和UPDATE
语句。
PROCEDURE ADD_ASOC
AS
BEGIN
MERGE INTO MAIN_ADDR m
USING TEMP_ADDR t
ON ( m.addr_id = t.addr_id AND t.BATCH_RANGE BETWEEN 100 AND 900)
WHEN MATCHED THEN;
UPDATE
SET ADR_NAME = t.ADDR_ID||'RESIDENCE';
UPDATE TEMP_ADDR
SET ADR_NAME = ADDR_ID||'RESIDENCE'
WHERE BATCH_RANGE BETWEEN 100 AND 900;
END;
注意:如果在存储过程中使用COMMIT
,则无法使用多个存储过程,如果最后一个存储过程失败,则不能选择将它们全部回滚。相反,您应该从调用存储过程的会话中删除COMMIT
,然后您将能够链接多个存储过程在一起。(此外,反复使用COMMIT
非常慢。)
英文:
You do not appear to need cursors or loops and can use a MERGE
statement and an UPDATE
statement
PROCEDURE ADD_ASOC
AS
BEGIN
MERGE INTO MAIN_ADDR m
USING TEMP_ADDR t
ON ( m.addr_id = t.addr_id AND t.BATCH_RANGE BETWEEN 100 AND 900)
WHEN MATCHED THEN;
UPDATE
SET ADR_NAME = t.ADDR_ID||'RESIDENCE';
UPDATE TEMP_ADDR
SET ADR_NAME = ADDR_ID||'RESIDENCE'
WHERE BATCH_RANGE BETWEEN 100 AND 900;
END;
Note: If you COMMIT
in a procedure then you cannot use multiple procedures and then if the last one fails choose to ROLLBACK
them all. Instead you should remove the COMMIT
from the procedures and call COMMIT
from the session where you invoke the procedure then you will be able to chain multiple procedures together. (Plus, repeatedly using COMMIT
is very slow.)
答案2
得分: 0
- 在单行
update
语句中使用parallel
提示最多也只是毫无意义的。 commit
非常昂贵。在循环的每次迭代中都执行commit
将会极其缓慢。- 执行
forall
比使用for
循环更有效率。在单个SQL语句中执行更新将是最有效的方法。 - 你是如何确定将度量并行性设置为500对于你的
select
语句的?这似乎极其、极其不可能是一个合理的数字。
除此之外,你对追踪代码或查看哪些部分花费了时间做了什么呢?我们当然可以猜测可能会慢的部分,但我们无法查看例如你的表是否缺少addr_id
索引,或者是否存在一堆未索引的外键需要管理,或者是否存在减慢速度的触发器等情况。
forall
方法会更快一些(不如单独的update
语句快,但比在循环中逐行处理要好)。
FORALL I IN 1 .. L_CURSOR.COUNT
UPDATE MAIN_ADDR
SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID;
FORALL I IN 1 .. L_CURSOR.COUNT
UPDATE TEMP_ADDR
SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;
UPDATE /*+ PARALLEL(500) */ TEMP_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;
英文:
- A
parallel
hint on a single-rowupdate
statement is, at best, pointless. commit
is very expensive. Acommit
on every single iteration of a loop would be catastrophically slow.- Doing a
forall
is more efficient than afor
loop. Doing the update in a single SQL statement would be the most efficient approach. - How did you come up with 500 as the degrees of parallelism for your
select
statement? It seems highly, highly unlikely that is a reasonable number.
Beyond that, what have you done to trace your code or to see what's taking time. We can certainly guess at things that might be slow but we have no way to see if, for example, one of your tables is missing an index on addr_id
or if there are a bunch of unindexed foreign keys that need to be managed or if there are triggers that are slowing things down.
The forall
approach that would be faster (not as fast as individual update
statements but better than row-by-row processing in a loop).
FORALL I IN 1 .. L_CURSOR.COUNT
UPDATE MAIN_ADDR
SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID;
FORALL I IN 1 .. L_CURSOR.COUNT
UPDATE TEMP_ADDR
SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;
UPDATE /*+ PARALLEL(500) */ TEMP_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论