Oracle query optimizer sometimes chooses a bad plan for the same query
Question
I have a Jetty application that connects to an Oracle database to persist application metadata.
The application fires the following SQL query:
select * from my_table
where id = 'some_id' --> not unique, i can have repeated Ids
and created > sysdate - 160/24
and started <= to_timestamp('07-06-2023 06:12:45', 'dd-mm-yyyy hh24:mi:ss')
and status in (3, 13, 14);
All table columns are indexed except the started column.
I have observed that this query sometimes takes very long to respond, around 5 minutes. Generally it is served within a few seconds.
After digging deeper, we found that the optimizer chooses a bad plan when the query is slow. In those cases it filters on the id column. It works well when it scans via status.
I tried partitioning this table by 7-day intervals, as the volume is quite high (30-40 million rows per day). It helped, but the issue did not go away permanently.
How can I make sure that the query optimizer always uses a good plan?
Answer 1
Score: 1
Without changing your indexing, you can get stability by hinting your query. If the index on status is called idx_status, for example, then:
select /*+ INDEX(my_table idx_status) */ * from my_table
where id = 'some_id' --> not unique, i can have repeated Ids
and created > sysdate - 160/24
and started <= to_timestamp('07-06-2023 06:12:45', 'dd-mm-yyyy hh24:mi:ss')
and status in (3, 13, 14);
Though, if you have a significant number of records with those statuses that are older than a week, you will want to weed those out at the index level as well. And if your id field reduces your results in any significant way, that too should be in the index. In that case, you need what is called a concatenated, composite, or multi-column index. From left to right, start with the columns you apply an equality predicate to, then the columns you use IN on, and lastly the columns you apply a range or inequality predicate to. So the index would be:
CREATE INDEX idx_composite ON my_table (id, status, created);
That will probably entice the optimizer so much that no hint is needed for it to be used. To make sure, though, you should drop the existing index on ID and let this new composite one take its place. Most likely this will give you the best performance.
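A cautious way to try this could look like the sketch below. It assumes the existing single-column index on id is named idx_id, which is a hypothetical name since the question does not give the real one. Instead of dropping the old index right away, it makes the index invisible so the change is easy to undo, and then checks which plan the optimizer picks without a hint.

```sql
-- Composite index: equality column first, then the IN column, then the range column.
CREATE INDEX idx_composite ON my_table (id, status, created);

-- idx_id is a hypothetical name for the existing index on id.
-- An invisible index is still maintained but ignored by the optimizer,
-- so it can be restored with: ALTER INDEX idx_id VISIBLE;
ALTER INDEX idx_id INVISIBLE;

-- Ask the optimizer for its plan for the unhinted query.
EXPLAIN PLAN FOR
select * from my_table
where id = 'some_id'
and created > sysdate - 160/24
and started <= to_timestamp('07-06-2023 06:12:45', 'dd-mm-yyyy hh24:mi:ss')
and status in (3, 13, 14);

-- Show the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan output shows an index range scan on idx_composite and response times stay stable, the old index can then be dropped for good.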
Answer 2
Score: 0
You can use a SQL plan baseline to pin the good plan, like this:
SET SERVEROUTPUT ON
DECLARE
  l_plans_loaded PLS_INTEGER;
BEGIN
  l_plans_loaded := DBMS_SPM.load_plans_from_cursor_cache(
    sql_id          => '&sql_id',
    plan_hash_value => '&plan_hash_value',
    sql_handle      => '&handle');
  DBMS_OUTPUT.put_line('Plans Loaded: ' || l_plans_loaded);
END;
/
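To fill in the substitution variables you first need the sql_id and the plan_hash_value of the good plan. The queries below are one possible way to find them. They assume the statement is still in the cursor cache and that you have read access to V$SQL and DBA_SQL_PLAN_BASELINES; the LIKE pattern is only an illustration and should be adapted to the real statement text.

```sql
-- List cached cursors for the statement. A statement with an unstable plan
-- typically shows more than one plan_hash_value here.
SELECT sql_id,
       child_number,
       plan_hash_value,
       executions,
       ROUND(elapsed_time / 1000000 / NULLIF(executions, 0), 2) AS avg_elapsed_sec
FROM   v$sql
WHERE  sql_text LIKE 'select * from my_table%'
ORDER  BY sql_id, child_number;

-- Inspect one child cursor's plan to decide which plan is the good one.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', &child_number));

-- After loading the plan, confirm that the baseline exists and is accepted.
SELECT sql_handle, plan_name, enabled, accepted, fixed
FROM   dba_sql_plan_baselines
ORDER  BY created DESC;
```

If you are loading the statement's own good plan, sql_id and plan_hash_value should be enough. The sql_handle variant is meant for attaching a plan from a different cursor, for example a hinted version of the query, to the baseline of the original statement.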