Optimising: why is this SQL query running very slowly in Oracle Database 19?
To optimize your query for better performance, you can consider the following steps:

- Indexing: Ensure that the columns used in your join conditions and filtering criteria have appropriate indexes (a CREATE INDEX sketch follows this list). In particular, consider indexing the columns used in these conditions:
  wfi.wf_instance_id = wft.wf_instance_id
  t.response = t11.SHORTNAME
- Subquery optimization: The subquery (SELECT t.task_id, ...) is complex. You can optimize it by breaking it down into smaller subqueries or views and indexing the necessary columns.
- Avoid functions in the WHERE clause: Avoid applying functions such as CAST, TRIM and REGEXP_SUBSTR to columns in your WHERE clause. They can slow down query performance, especially on large datasets. Consider preprocessing the data or using computed (virtual) columns if possible.
- Table partitioning: If your table is very large, consider partitioning it on a relevant column; this can significantly improve query performance.
- Analyze the query plan: Regularly analyze the query execution plan to identify bottlenecks. Look for full table scans and inefficient joins.
- Hardware tuning: Sometimes improving hardware resources such as CPU, memory and disk speed can also help query performance.

Remember to test these changes in a controlled environment before applying them to a production database, to make sure they have the desired effect without causing any issues.
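As a rough sketch of the indexing suggestion: the index names below are made up, and the column choices are only inferred from the join and filter conditions listed above, so check them against the actual schema, existing indexes and data volumes before creating anything.

-- Illustrative indexes only; names and column orders are assumptions, not existing objects.
CREATE INDEX sta_wf_task_inst_ix
    ON STA.STA_ACL_WF_TASK (wf_instance_id, task_id, start_date);

CREATE INDEX sta_wf_instance_item_ix
    ON STA.STA_ACL_WF_INSTANCE (item, wf_instance_id, item_id);

CREATE INDEX sta_static_data_sname_ix
    ON STA.STA_ACL_STATIC_DATA_TABLE (shortname);

Whether the optimizer actually uses them depends on how selective the predicates are; as discussed in the answers below, full table scans can still be the right choice when a large share of each table is needed.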
Question

I want to optimize a query that takes far too much time; thanks in advance for any help.
I have a query that runs against a pretty large table. The query takes forever and presumably does a full table scan. It is very, very slow!
select
transaction_id as APPL_ID
,cast(reason_desc as VARCHAR2(2000)) as APPL_REJECT_REASON_DESC
,cast(reason_code as VARCHAR2(2000)) as APPL_REJECT_REASON
,cast(user_reject as VARCHAR2(100)) as APPL_REJECT_USER
,cast(step_reject as VARCHAR2(100)) as APPL_REJECT_USER_ROLE
,cast(reason_desc_vn as VARCHAR2(2000)) as APPL_REJECT_REASON_DESC_VN
from(
select wfi.item_id transaction_id
,cq1.name as reason_desc
,cq1.response as reason_code
,COALESCE(wft.executed_by, wft.recipient_shortname) as user_reject
,wft.profile_access_right_sname as step_reject
,row_number() over (partition by wfi.item_id order by wft.start_date desc) stt
,cq1.name_1 as reason_desc_vn
--SELECT 1
from STA.STA_ACL_WF_INSTANCE wfi
inner join STA.STA_ACL_WF_TASK wft
on (wfi.item = 'transaction' and wfi.wf_instance_id = wft.wf_instance_id)
inner join (
SELECT t.task_id
,LISTAGG (t.response, ', ') WITHIN GROUP (ORDER BY t.response) as response
,LISTAGG (t11.name, ', ') WITHIN GROUP (ORDER BY t11.name) as name
,LISTAGG (t11.name_1, ', ') WITHIN GROUP (ORDER BY t11.name_1) as name_1
FROM (
SELECT task_id,
CAST(TRIM(regexp_substr(t.RESPONSE, '[^,]+', 1, levels.column_value)) AS VARCHAR2(100)) AS RESPONSE
FROM sta.sta_acl_wf_task_col_quest t,
table(cast(multiset(select level from dual connect by level <= length (
regexp_replace(t.RESPONSE, '[^,]+')) + 1) as sys.OdciNumberList)) levels
WHERE t.question in ('Reject Reason','Cancel Reason','Reject Reason For Recommendation','Reject Reason For Approval')
AND t.response is not null
) t
LEFT JOIN STA.STA_ACL_STATIC_DATA_TABLE t11 on (t.response= t11.SHORTNAME)
WHERE 1=1
GROUP BY t.task_id
)cq1 on wft.task_id = cq1.task_id
) t8 where stt=1
Explain plan:
PLAN_TABLE_OUTPUT |
----------------------------------------------------------------------------------------------------------------------------------+
Plan hash value: 3937279373 |
|
----------------------------------------------------------------------------------------------------------------------------- |
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | |
----------------------------------------------------------------------------------------------------------------------------- |
| 0 | SELECT STATEMENT | | 9560M| 53T| | 3339M (2)| 72:27:52 | |
|* 1 | VIEW | | 9560M| 53T| | 3339M (2)| 72:27:52 | |
|* 2 | WINDOW NOSORT | | 9560M| 8431G| | 3339M (2)| 72:27:52 | |
| 3 | SORT GROUP BY | | 9560M| 8431G| 8580G| 3339M (2)| 72:27:52 | |
|* 4 | HASH JOIN RIGHT OUTER | | 9560M| 8431G| 3392K| 650M (1)| 14:07:31 | |
| 5 | TABLE ACCESS FULL | STA_ACL_STATIC_DATA_TABLE | 19812 | 3153K| | 399 (3)| 00:00:01 | |
| 6 | NESTED LOOPS | | 7985M| 5830G| | 40M (4)| 00:52:07 | |
|* 7 | HASH JOIN | | 488K| 364M| 446M| 958K (3)| 00:01:15 | |
|* 8 | TABLE ACCESS FULL | STA_ACL_WF_INSTANCE | 1592K| 428M| | 77403 (4)| 00:00:07 | |
|* 9 | HASH JOIN | | 986K| 470M| 57M| 787K (3)| 00:01:02 | |
|* 10 | TABLE ACCESS FULL | STA_ACL_WF_TASK_COL_QUEST | 986K| 46M| | 396K (5)| 00:00:32 | |
| 11 | TABLE ACCESS FULL | STA_ACL_WF_TASK | 5496K| 2364M| | 139K (3)| 00:00:11 | |
| 12 | COLLECTION ITERATOR SUBQUERY FETCH| | 16360 | 32720 | | 80 (4)| 00:00:01 | |
|* 13 | CONNECT BY WITHOUT FILTERING | | | | | | | |
| 14 | FAST DUAL | | 1 | | | 3 (0)| 00:00:01 | |
----------------------------------------------------------------------------------------------------------------------------- |
|
Predicate Information (identified by operation id): |
--------------------------------------------------- |
|
1 - filter("STT"=1) |
2 - filter(ROW_NUMBER() OVER ( PARTITION BY "WFI"."ITEM_ID" ORDER BY INTERNAL_FUNCTION("WFT"."START_DATE") DESC |
)<=1) |
4 - access("T11"."SHORTNAME"(+)=SYS_OP_C2C(CAST(TRIM( REGEXP_SUBSTR ("T"."RESPONSE" /*+ LOB_BY_VALUE */ |
,'[^,]+',1,VALUE(KOKBF$))) AS VARCHAR2(100)))) |
7 - access("WFI"."WF_INSTANCE_ID"="WFT"."WF_INSTANCE_ID") |
8 - filter("WFI"."ITEM"=U'transaction') |
9 - access("WFT"."TASK_ID"="TASK_ID") |
10 - filter(("T"."QUESTION"=U'Cancel Reason' OR "T"."QUESTION"=U'Reject Reason' OR "T"."QUESTION"=U'Reject Reason |
For Approval' OR "T"."QUESTION"=U'Reject Reason For Recommendation') AND "T"."RESPONSE" /*+ LOB_BY_VALUE */ IS NOT |
NULL) |
13 - filter(LEVEL<=LENGTH( REGEXP_REPLACE (:B1,'[^,]+'))+1) |
Would you please guide me on how I can change the query to get better performance? Any help would be appreciated.
Answer 1 (Score: 1)
Regular expressions are slow and hierarchical queries may be slower than recursive queries.
You can split strings using simple string functions (which is more to type but more efficient):
-- Recursive CTE: walk along each response string, producing one comma-separated element per row.
WITH responses (task_id, response, spos, epos) AS (
    -- Anchor member: start at position 1; epos = position of the first comma (0 if none).
    SELECT task_id,
           response,
           1,
           INSTR(response, ',', 1)
    FROM   sta.sta_acl_wf_task_col_quest
    WHERE  question in (
               'Reject Reason',
               'Cancel Reason',
               'Reject Reason For Recommendation',
               'Reject Reason For Approval'
           )
    AND    response is not null
    UNION ALL
    -- Recursive member: step past the comma just found and locate the next one.
    SELECT task_id,
           response,
           epos + 1,
           INSTR(response, ',', epos + 1)
    FROM   responses
    WHERE  epos > 0
)
SELECT task_id,
       -- epos = 0 means no further comma: take the rest of the string.
       CASE epos
           WHEN 0
           THEN SUBSTR(response, spos)
           ELSE SUBSTR(response, spos, epos - spos)
       END AS response
FROM   responses
However, it would be even better not to store values as comma-separated strings in the first place, and instead to keep each value in its own row of a table (sketched below).
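A minimal sketch of that normalised layout; the table, column and constraint names below are invented for illustration and are not part of the existing schema:

-- Hypothetical child table: one reject/cancel reason per row instead of a comma-separated string.
CREATE TABLE sta_acl_wf_task_response (
    task_id   NUMBER        NOT NULL,
    response  VARCHAR2(100) NOT NULL,
    CONSTRAINT sta_wf_task_resp_pk PRIMARY KEY (task_id, response)
);

-- The split-string subquery then collapses to a plain select:
-- SELECT task_id, response FROM sta_acl_wf_task_response;

With rows like these, the LISTAGG in the outer query can aggregate directly, with no REGEXP_SUBSTR or recursion at all.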
Answer 2 (Score: 0)
Try removing the unnecessary TABLE(CAST(MULTISET(...))). The simpler expression is more likely to lead to a better cardinality estimate, which is more likely to lead to a better execution plan.
For example, below is the explain plan for the original subquery. (Although I had to add literals to make the code work on my system.) The query returns 4 rows but Oracle thinks it will return 8168. This is because Oracle just gives up on trying to estimate unpredictable table functions and returns a number close to the block size. This might explain your explain plan's 16360 - I'm guessing your system has a 16K block size?
explain plan for
select * from
table(cast(multiset(select level from dual connect by level <= length (
regexp_replace(/*It.RESPONSE*/ 'A,B,C,D', '[^,]+')) + 1) as sys.OdciNumberList)) levels
WHERE /*t.question*/ 'Cancel Reason' in ('Reject Reason','Cancel Reason','Reject Reason For Recommendation','Reject Reason For Approval');
select * from table(dbms_xplan.display);
Plan hash value: 3985296316
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 29 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR SUBQUERY FETCH| | 8168 | 16336 | 29 (0)| 00:00:01 |
|* 2 | CONNECT BY WITHOUT FILTERING | | | | | |
| 3 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=2)
Removing the TABLE(CAST(MULTISET(...))) helps Oracle make a much better estimate of 1 row, which is pretty close to the actual 4. (But if you use MTO's suggestion to avoid storing comma-separated lists, you could simplify your code and further improve cardinality estimates.)
explain plan for
select * from
(select level from dual connect by level <= length (
regexp_replace(/*It.RESPONSE*/ 'A,B,C,D', '[^,]+')) + 1) levels
WHERE /*t.question*/ 'Cancel Reason' in ('Reject Reason','Cancel Reason','Reject Reason For Recommendation','Reject Reason For Approval');
select * from table(dbms_xplan.display);
Plan hash value: 2403765415
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 |
| 1 | VIEW | | 1 | 13 | 2 (0)| 00:00:01 |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | | |
| 3 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(LEVEL<=4)
But this is just a guess! We're dealing with explain plan guesses instead of execution plan actual values. Simplifying queries to get closer to actual values is almost always helpful, but sometimes it can paradoxically make things worse.
If you really want to optimize your query, try these steps and modify your post to get additional feedback:
- How long does the query take? How long do you expect it to take?
- How large are your tables (check DBA_SEGMENTS.BYTES; a size lookup is sketched after this list), and what percentage of rows is returned from each table? Don't worry too much about full table scans. Indexes are better for retrieving a small percentage of rows from a table, but full table scans are better for retrieving a large percentage. If part of your query must inevitably return 50% of a large table, then the limiting factor will be how fast your hardware can read the X gigabytes of data in that table.
- Get actual numbers instead of guesses. Find the SQL_ID of your query by querying GV$SQL, and then run select dbms_sqltune.report_sql_monitor('<sql_id>') from dual; (both are sketched after this list). The results will give you the actual amount of time and the actual number of rows for each operation. They will tell you which operations to worry about, and give you clues as to why Oracle chose a bad plan. Explain plans are a decent start, but they are pathetic compared to the actual values. This step may take you hours to understand.
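A minimal sketch of the size check from the second bullet, assuming you can query DBA_SEGMENTS and that the tables live in the STA schema:

-- Rough size in GB of each table involved in the query.
SELECT segment_name,
       ROUND(SUM(bytes) / 1024 / 1024 / 1024, 2) AS gb
FROM   dba_segments
WHERE  owner = 'STA'
AND    segment_name IN ('STA_ACL_WF_INSTANCE',
                        'STA_ACL_WF_TASK',
                        'STA_ACL_WF_TASK_COL_QUEST',
                        'STA_ACL_STATIC_DATA_TABLE')
GROUP BY segment_name;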
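And a sketch of the SQL_ID lookup and monitoring report from the last bullet. The LIKE filter is only an example fragment, so substitute any distinctive piece of your statement text; note that SQL Monitor reports normally require the Tuning Pack licence.

-- Find the SQL_ID of the slow statement.
SELECT sql_id,
       child_number,
       ROUND(elapsed_time / 1e6) AS elapsed_seconds,
       sql_text
FROM   gv$sql
WHERE  UPPER(sql_text) LIKE '%STA_ACL_WF_TASK_COL_QUEST%'
AND    sql_text NOT LIKE '%gv$sql%'   -- exclude this lookup itself
ORDER BY last_active_time DESC;

-- Then pull the SQL Monitor report, which shows actual rows and timings per plan operation.
SELECT dbms_sqltune.report_sql_monitor('<sql_id>') FROM dual;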