通过游标更新需要很长时间。

huangapple go评论59阅读模式
英文:

Updating via cursor is taking a lot of time

问题

我正在执行地址关联操作,通过获取addr_id并将其更新到两个表格中(相同表格和另一个表格),分别用于不同的列。然而,即使我使用了并行提示并将并行度设置为500,它仍然需要很长时间。你能在这里提供帮助吗?以下是该存储过程。

PROCEDURE 
ADD_ASOC AS
BEGIN
DECLARE
CURSOR C1 IS
SELECT /*+ PARALLEL(500) */ ADDR_ID FROM TEMP_ADDR WHERE BATCH_RANGE BETWEEN 100 AND 900;

TYPE CURSOR_ATT IS TABLE OF C1%ROWTYPE;

L_CURSOR CURSOR_ATT;

    BEGIN
    OPEN C1;
    LOOP

    FETCH C1 BULK COLLECT INTO          L_CURSOR LIMIT 5000;
EXIT WHEN L_CURSOR.COUNT = 0;

FOR I IN 1 .. L_CURSOR.COUNT LOOP

UPDATE  /*+ PARALLEL(500) */ MAIN_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID;



UPDATE  /*+ PARALLEL(500) */ TEMP_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'

WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;

COMMIT;
END LOOP;
END LOOP;
COMMIT;
CLOSE C1;
END;
COMMIT;

我尝试使用并行度设置为500来执行,但没有成功。

英文:

I'm performing address association by taking the addr_id and updating it in two tables (same table and another table) for different columns. However it is taking huge time even though I use parallel hint with degree of paralleism as 500. Could you assist here ? Below is the procedure.

PROCEDURE 
ADD_ASOC AS
BEGIN
DECLARE
CURSOR C1 IS
SELECT /*+ PARALLEL(500) */ ADDR_ID FROM TEMP_ADDR WHERE BATCH_RANGE BETWEEN 100 AND 900;

TYPE CURSOR_ATT IS TABLE OF C1%ROWTYPE;

L_CURSOR CURSOR_ATT;

    BEGIN
    OPEN C1;
    LOOP

    FETCH C1 BULK COLLECT INTO          L_CURSOR LIMIT 5000;
EXIT WHEN L_CURSOR.COUNT = 0;

FOR I IN 1 .. L_CURSOR.COUNT LOOP

UPDATE  /*+ PARALLEL(500) */ MAIN_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID;



UPDATE  /*+ PARALLEL(500) */ TEMP_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'

WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;

COMMIT;
END LOOP;
END LOOP;
COMMIT;
CLOSE C1;
END;
COMMIT;

I executed using degree of paralleism as 500 but in vain.

答案1

得分: 3

你似乎不需要游标或循环,可以使用MERGE语句和UPDATE语句。

PROCEDURE ADD_ASOC
AS
BEGIN
  MERGE INTO MAIN_ADDR m
  USING TEMP_ADDR t
  ON ( m.addr_id = t.addr_id AND t.BATCH_RANGE BETWEEN 100 AND 900)
  WHEN MATCHED THEN;
    UPDATE
    SET ADR_NAME = t.ADDR_ID||'RESIDENCE';


  UPDATE TEMP_ADDR
  SET ADR_NAME = ADDR_ID||'RESIDENCE'
  WHERE BATCH_RANGE BETWEEN 100 AND 900;
END;

注意:如果在存储过程中使用COMMIT,则无法使用多个存储过程,如果最后一个存储过程失败,则不能选择将它们全部回滚。相反,您应该从调用存储过程的会话中删除COMMIT,然后您将能够链接多个存储过程在一起。(此外,反复使用COMMIT非常慢。)

英文:

You do not appear to need cursors or loops and can use a MERGE statement and an UPDATE statement

PROCEDURE ADD_ASOC
AS
BEGIN
  MERGE INTO MAIN_ADDR m
  USING TEMP_ADDR t
  ON ( m.addr_id = t.addr_id AND t.BATCH_RANGE BETWEEN 100 AND 900)
  WHEN MATCHED THEN;
    UPDATE
    SET ADR_NAME = t.ADDR_ID||'RESIDENCE';


  UPDATE TEMP_ADDR
  SET ADR_NAME = ADDR_ID||'RESIDENCE'
  WHERE BATCH_RANGE BETWEEN 100 AND 900;
END;

Note: If you COMMIT in a procedure then you cannot use multiple procedures and then if the last one fails choose to ROLLBACK them all. Instead you should remove the COMMIT from the procedures and call COMMIT from the session where you invoke the procedure then you will be able to chain multiple procedures together. (Plus, repeatedly using COMMIT is very slow.)

答案2

得分: 0

  • 在单行update语句中使用parallel提示最多也只是毫无意义的。
  • commit非常昂贵。在循环的每次迭代中都执行commit将会极其缓慢。
  • 执行forall比使用for循环更有效率。在单个SQL语句中执行更新将是最有效的方法。
  • 你是如何确定将度量并行性设置为500对于你的select语句的?这似乎极其、极其不可能是一个合理的数字。

除此之外,你对追踪代码或查看哪些部分花费了时间做了什么呢?我们当然可以猜测可能会慢的部分,但我们无法查看例如你的表是否缺少addr_id索引,或者是否存在一堆未索引的外键需要管理,或者是否存在减慢速度的触发器等情况。

forall方法会更快一些(不如单独的update语句快,但比在循环中逐行处理要好)。

FORALL I IN 1 .. L_CURSOR.COUNT
  UPDATE MAIN_ADDR 
     SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
   WHERE ADDR_ID = L_CURSOR(I).ADDR_ID;

FORALL I IN 1 .. L_CURSOR.COUNT
  UPDATE TEMP_ADDR 
     SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
   WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
     AND BATCH_RANGE BETWEEN 100 AND 900;
UPDATE  /*+ PARALLEL(500) */ TEMP_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;
英文:
  • A parallel hint on a single-row update statement is, at best, pointless.
  • commit is very expensive. A commit on every single iteration of a loop would be catastrophically slow.
  • Doing a forall is more efficient than a for loop. Doing the update in a single SQL statement would be the most efficient approach.
  • How did you come up with 500 as the degrees of parallelism for your select statement? It seems highly, highly unlikely that is a reasonable number.

Beyond that, what have you done to trace your code or to see what's taking time. We can certainly guess at things that might be slow but we have no way to see if, for example, one of your tables is missing an index on addr_id or if there are a bunch of unindexed foreign keys that need to be managed or if there are triggers that are slowing things down.

The forall approach that would be faster (not as fast as individual update statements but better than row-by-row processing in a loop).

FORALL I IN 1 .. L_CURSOR.COUNT
  UPDATE MAIN_ADDR 
     SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
   WHERE ADDR_ID = L_CURSOR(I).ADDR_ID;

FORALL I IN 1 .. L_CURSOR.COUNT
  UPDATE TEMP_ADDR 
     SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'
   WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
     AND BATCH_RANGE BETWEEN 100 AND 900;

UPDATE /*+ PARALLEL(500) */ TEMP_ADDR SET ADR_NAME = L_CURSOR(I).ADDR_ID||'RESIDENCE'

WHERE ADDR_ID = L_CURSOR(I).ADDR_ID
AND BATCH_RANGE BETWEEN 100 AND 900;

huangapple
  • 本文由 发表于 2023年7月14日 02:41:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/76682380.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定