2023-06-12 18:09:12

Oracle query optimizer chooses bad plan at times with same query

Question

My Jetty application connects to an Oracle database to persist application metadata.
The application issues the following SQL query:

select * from my_table
 where id = 'some_id' --> not unique, I can have repeated ids
 and created > sysdate - 160/24
 and started <= to_timestamp('07-06-2023 06:12:45', 'dd-mm-yyyy hh24:mi:ss')
 and status in (3, 13, 14);

All table columns are indexed except the started column.

I have observed that this query sometimes takes very long to respond, around 5 minutes. Normally it completes within a few seconds.
After digging deeper, we found that the optimizer chooses a bad plan when the query is slow: in those cases it filters on the id column. The query performs well when it scans via status.
I tried partitioning the table into 7-day partitions, because the volume is quite high (30-40 million rows per day). That helped, but the issue has not gone away completely.
How can I make sure that the query optimizer always uses the good plan?

Answer 1

Score: 1

Without changing your indexing, you can get a stable plan by hinting the query. For example, if the index on status is called idx_status:

select /*+ INDEX(my_table idx_status) */ * from my_table
 where id = 'some_id' --> not unique, I can have repeated ids
 and created > sysdate - 160/24
 and started <= to_timestamp('07-06-2023 06:12:45', 'dd-mm-yyyy hh24:mi:ss')
 and status in (3, 13, 14);

However, if a significant number of rows with those statuses are older than a week, you will want to filter those out at the index level as well. And if the id column narrows the result in any significant way, it should be part of the index too. In that case you need what is called a concatenated (also composite or multi-column) index. Order the columns from left to right: first the columns used in equality predicates, then the columns used in IN lists, and lastly the columns used in range or inequality predicates. So the index would be:

CREATE INDEX idx_composite ON my_table (id, status, created);

That will probably be attractive enough to the optimizer that no hint is needed. To make sure, though, you should drop the existing index on id and let the new composite index take its place. Most likely this will give you the best performance.
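
The index change described above could be rolled out and checked roughly as follows. This is only a sketch under assumptions: idx_id is a hypothetical name for the existing single-column index on id, and the LOCAL keyword assumes the table is partitioned as the question describes (leave it out for a non-partitioned table). Making the old index invisible first, instead of dropping it right away, is a cautious variation that keeps the change easy to revert.

```sql
-- Composite index: equality column first, then the IN-list column, then the range column.
-- LOCAL assumes my_table is partitioned; omit it otherwise.
CREATE INDEX idx_composite ON my_table (id, status, created) LOCAL;

-- idx_id is a hypothetical name for the existing index on id.
-- An invisible index is still maintained but ignored by the optimizer,
-- so the effect of dropping it can be tested before actually dropping it.
ALTER INDEX idx_id INVISIBLE;

-- Check which access path the optimizer picks now, without any hint.
EXPLAIN PLAN FOR
select * from my_table
 where id = 'some_id'
 and created > sysdate - 160/24
 and started <= to_timestamp('07-06-2023 06:12:45', 'dd-mm-yyyy hh24:mi:ss')
 and status in (3, 13, 14);

SELECT * FROM TABLE(DBMS_XPLAN.display);

-- Once the new plan has proven itself, the old index can be removed:
-- DROP INDEX idx_id;
```

If the plan output shows an index range scan on idx_composite, the hint is probably no longer needed. If something regresses, ALTER INDEX idx_id VISIBLE restores the previous state.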

Answer 2

Score: 0

You can use a SQL plan baseline, like this:

SET SERVEROUTPUT ON
DECLARE
  l_plans_loaded  PLS_INTEGER;
BEGIN
  l_plans_loaded := DBMS_SPM.load_plans_from_cursor_cache(
    sql_id => '&sql_id',
    plan_hash_value => '&plan_hash_value',
    sql_handle => '&handle');
  DBMS_OUTPUT.put_line('Plans Loaded: ' || l_plans_loaded);
END;
/
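
The block above prompts for a sql_id and a plan_hash_value, so it may help to see where those values come from and how to check the result. The queries below are a sketch, not a definitive procedure: the LIKE pattern is an assumption and has to match the text the application actually sends, and the views require the corresponding privileges. One caveat worth keeping in mind is that a baseline is tied to a specific statement text, so if the application embeds literals (such as the id and the timestamp) instead of bind variables, each distinct combination is a separate statement.

```sql
-- Find the cursors for the statement and compare their plans and average run times.
-- The LIKE pattern is an assumption; adjust it to the application's SQL text.
SELECT sql_id, child_number, plan_hash_value, executions,
       ROUND(elapsed_time / 1000000 / NULLIF(executions, 0), 2) AS avg_elapsed_s
FROM   v$sql
WHERE  sql_text LIKE 'select * from my_table%'
ORDER  BY sql_id, child_number;

-- Display the plan of one cursor to confirm it is the good one (the scan via status).
SELECT * FROM TABLE(DBMS_XPLAN.display_cursor('&sql_id', NULL, 'TYPICAL'));

-- After loading the plan, verify that the baseline exists, is enabled, and is accepted.
SELECT sql_handle, plan_name, enabled, accepted, fixed, origin
FROM   dba_sql_plan_baselines
WHERE  sql_text LIKE 'select * from my_table%';
```

The plan_hash_value of the fast execution is the one to pass to load_plans_from_cursor_cache.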
