英文:
Oracle : retrieving ORA_ROWSCN dramatically slow down a query on a big table (with no where clause)
问题
我想要使用伪列ORA_ROWSCN来增量地向数据目标提供数据,但我遇到了一个我不理解的问题。检索这个伪列会显著减慢我的查询... 非常明显。
我正在处理一个庞大的数据库。我在示例中使用的两个表分别有13,528,885行和12,701,489行。
让我们看看这两个不同的查询:
案例A:
select
ta.id,
tb.id
from table_a ta
left join table_b tb
on ta.tb_id = tb.id
fetch first 1 row only;
这个查询已经相当长了(持续时间只是作为参考)
- 第一次运行:10秒
- 第二次运行:2秒
案例B:
在选择中添加2个ora_rowscn字段
select
ta.id,
tb.id,
ta.ora_rowscn versa,
tb.ora_rowscn versb
from table_a ta
left join table_b tb
on ta.tb_id = tb.id
fetch first 1 row only;
- 第一次运行:20秒
- 第二次运行:20秒
所以第二个查询比第一个查询要长得多。但在我的实际情况下,包含许多连接的查询结果更糟糕,包含ora_rowscn检索和不包含它的查询之间的比率高达50。
这种策略变得不可用
欢迎提供有关这种行为的任何信息或提示。
我要强调的是,我在网上搜索,没有找到与这个限制有关的任何信息。我认为在这里提出这个问题会很有趣。
谢谢!
英文:
I would like to use the pseudocolumn ORA_ROWSCN in order to incrementally feed a data target but I'm facing something that I don't understand. Retrieving this pseudocolumn slow down my query.. A lot.
I'm working on a big database. The 2 tables I use in example are respectively 13 528 885 and 12 701 489 rows.
Let's take these 2 different queries :
CASE A:
select
ta.id,
tb.id
from table_a ta
left join table_b tb
on ta.tb_id = tb.id
fetch first 1 row only;
This query is already quite long (the durations are just here as an indication)
- 1st run : 10s
- 2nd run : 2s
CASE B:
Add 2 ora_rowscn fields to the select
select
ta.id,
tb.id,
ta.ora_rowscn versa,
tb.ora_rowscn versb
from table_a ta
left join table_b tb
on ta.tb_id = tb.id
fetch first 1 row only;
- 1st run : 20s
- 2nd run : 20s
So the second query is really longer than the first. But in my real case query containing many join the result is even worst, the ratio between the query with and without the ora_rowscn retrieval is up to 50.
Making this strategy unusable
Any information or tips about this behaviour is welcome.
I would precise that I searched online and didn't find anything related to this constraint. I thought that it would be interesting to ask the question here.
Thanks !
答案1
得分: 3
第一个查询的版本可以使用索引来完成大部分工作,而第二个版本必须访问表以获取系统更改编号。
我假设ID
列是主键并已建立索引。TB_ID
可能是外键,通常也会建立索引。这意味着在第一个查询中使用的每一列都是索引的一部分。Oracle可以从索引中检索所有必要的数据,甚至不需要访问任何表格。
ORA_ROWSCN
未建立索引,需要进行表格查找。我假设索引比表格小得多,因此一旦需要访问表格,就会有更多的数据要读取。ORA_ROWSCN
没有什么特殊之处,对于任何未建立索引的值,您都会看到相同的问题。
创建示例模式
--删除表格table_a;
--删除表格table_b;
创建表格table_b(id number primary key, b number);
创建表格table_a(id number primary key, tb_id number not null references table_b(id));
创建表格table_a_tb_id索引在table_a(tb_id);
收集执行计划
解释以下计划,用于选择以下内容:
ta.id,
tb.id
从表格table_a ta
左连接表格table_b tb
在ta.tb_id = tb.id
仅获取第1行;
选择*从表格(table(dbms_xplan.display(format => 'basic')));
解释以下计划,用于选择以下内容:
ta.id,
tb.id,
ta.ora_rowscn versa,
tb.ora_rowscn versb
从表格table_a ta
左连接表格table_b tb
在ta.tb_id = tb.id
仅获取第1行;
选择*从表格(table(dbms_xplan.display(format => 'basic')));
比较执行计划
CASE A只使用TABLE_B的索引扫描。
计划哈希值: 218395200
| Id | Operation | Name |
| 0 | SELECT STATEMENT | |
| 1 | VIEW | |
| 2 | WINDOW NOSORT STOPKEY| |
| 3 | NESTED LOOPS OUTER | |
| 4 | TABLE ACCESS FULL | TABLE_A |
| 5 | INDEX UNIQUE SCAN | SYS_C0017639 |
CASE B仍然可以使用TABLE_B上的索引,但现在还必须从表格本身读取。
计划哈希值: 259330422
| Id | Operation | Name |
| 0 | SELECT STATEMENT | |
| 1 | VIEW | |
| 2 | WINDOW NOSORT STOPKEY | |
| 3 | NESTED LOOPS OUTER | |
| 4 | TABLE ACCESS FULL | TABLE_A |
| 5 | TABLE ACCESS BY INDEX ROWID| TABLE_B |
| 6 | INDEX UNIQUE SCAN | SYS_C0017639 |
英文:
The first version of the query can use indexes for most of the work, the second version must access the table to get the System Change Number.
I assume that the columns ID
are primary keys and indexed. TB_ID
is probably a foreign key, which are also usually indexed. This means that literally every column used in the first query is part of an index. Oracle can retrieve all the necessary data from indexes and doesn't even need to access any of the tables.
ORA_ROWSCN
isn't indexed, and requires a table lookup. I assume the indexes are much smaller than the table, so once a table access is necessary there's a lot more data to read. There's nothing special about ORA_ROWSCN
, you would see the same issue with any un-indexed value.
Create sample schema
--drop table table_a;
--drop table table_b;
create table table_b(id number primary key, b number);
create table table_a(id number primary key, tb_id number not null references table_b(id));
create index table_a_tb_id on table_a(tb_id);
Gather execution plans
explain plan for
select
ta.id,
tb.id
from table_a ta
left join table_b tb
on ta.tb_id = tb.id
fetch first 1 row only;
select * from table(dbms_xplan.display(format => 'basic'));
explain plan for
select
ta.id,
tb.id,
ta.ora_rowscn versa,
tb.ora_rowscn versb
from table_a ta
left join table_b tb
on ta.tb_id = tb.id
fetch first 1 row only;
select * from table(dbms_xplan.display(format => 'basic'));
Compare Execution Plans
CASE A only uses an index scan for TABLE_B.
Plan hash value: 218395200
-----------------------------------------------
| Id | Operation | Name |
-----------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | |
| 2 | WINDOW NOSORT STOPKEY| |
| 3 | NESTED LOOPS OUTER | |
| 4 | TABLE ACCESS FULL | TABLE_A |
| 5 | INDEX UNIQUE SCAN | SYS_C0017639 |
-----------------------------------------------
CASE B can still use the index on TABLE_B, but now must also read from the table itself.
Plan hash value: 259330422
-------------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | |
| 2 | WINDOW NOSORT STOPKEY | |
| 3 | NESTED LOOPS OUTER | |
| 4 | TABLE ACCESS FULL | TABLE_A |
| 5 | TABLE ACCESS BY INDEX ROWID| TABLE_B |
| 6 | INDEX UNIQUE SCAN | SYS_C0017639 |
-------------------------------------------------------
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论