英文:
What exactly does Oracle NOCYCLE parameter do in hierarchical queries?
问题
Oracle在多年前引入了一个名为NOCYCLE的可选参数用于层次查询。然而,对于它的确切作用,除了在检测到循环时避免抛出错误之外,很少有明确的说明。我在网上看到的大部分资料都是错误或误导性的。例如,权威的Oracle-Base说:
CONNECT BY NOCYCLE子句告诉数据库不要遍历循环的层次结构。(错误!)
官方文档可能故意保持中立:
NOCYCLE参数指示Oracle数据库返回查询的行,即使在数据中存在CONNECT BY循环。
这倒是没错,但它并没有说明查询何时会停止遵循循环。
事实上,在某些情况下,使用CONNECT BY NOCYCLE遍历循环图时,Oracle将进入循环,但会在完成之前停止,返回一个终止响应。然而,在其他情况下,Oracle将愉快地永远循环下去。(事实上,上面的Oracle文章提到了可能发生无限循环的情况,尽管上下文不同。)
对于带有NOCYCLE子句的层次查询,是否存在一种规定的行为以处理循环?
英文:
Years ago Oracle introduced an optional NOCYCLE parameter to hierarchical queries. However, there is little clarity as to what exactly it does, besides avoiding throwing an error when a cycle is detected. Most of the material that I have seen on the web is wrong or misleading. For example, the authoritative Oracle-Base says:
> CONNECT BY NOCYCLE clause tells the database not to traverse cyclical hierarchies. (Wrong!)
The official documentation is, perhaps intentionally, noncommittal:
> The NOCYCLE parameter instructs Oracle Database to return rows from a query even if a CONNECT BY loop exists in the data.
True enough, but this doesn't say when or even whether the query will stop following a loop when it runs into one.
The truth is that in some cases, when traversing a cyclical graph with CONNECT BY NOCYCLE Oracle will enter loops, but stop just short of completing them, returning a terminating response. In other cases, however, Oracle will happily run in circles forever. (Indeed, the above Oracle article mentions cases when an infinite loop may occur, albeit not in the same context.)
Is there a prescribed way for a hierarchical query with NOCYCLE clause to behave with respect to cycles?
ETA: Here is a working example of an apparently infinite loop that you can try in http://sqlfiddle.com, borrowed from this post. (I am not providing a permalink, because the script somehow becomes corrupted after it is run once, so you have to start a new one each time.)
Setup:
CREATE TABLE edges (tail CHAR(1), head CHAR(1));
INSERT INTO edges values('A','B');
INSERT INTO edges values('A','C');
INSERT INTO edges values('A','D');
INSERT INTO edges values('A','E');
INSERT INTO edges values('A','F');
INSERT INTO edges values('A','G');
INSERT INTO edges values('A','H');
INSERT INTO edges values('A','I');
INSERT INTO edges values('A','J');
INSERT INTO edges values('B','K');
INSERT INTO edges values('B','L');
INSERT INTO edges values('B','C');
INSERT INTO edges values('B','E');
INSERT INTO edges values('B','M');
INSERT INTO edges values('B','N');
INSERT INTO edges values('B','O');
INSERT INTO edges values('B','J');
INSERT INTO edges values('C','K');
INSERT INTO edges values('C','N');
INSERT INTO edges values('C','J');
INSERT INTO edges values('D','K');
INSERT INTO edges values('D','B');
INSERT INTO edges values('D','L');
INSERT INTO edges values('D','C');
INSERT INTO edges values('D','E');
INSERT INTO edges values('D','F');
INSERT INTO edges values('D','G');
INSERT INTO edges values('D','M');
INSERT INTO edges values('D','P');
INSERT INTO edges values('D','Q');
INSERT INTO edges values('D','N');
INSERT INTO edges values('D','R');
INSERT INTO edges values('D','O');
INSERT INTO edges values('D','J');
INSERT INTO edges values('E','K');
INSERT INTO edges values('E','L');
INSERT INTO edges values('E','S');
INSERT INTO edges values('E','M');
INSERT INTO edges values('E','N');
INSERT INTO edges values('E','O');
INSERT INTO edges values('E','J');
INSERT INTO edges values('F','K');
INSERT INTO edges values('F','B');
INSERT INTO edges values('F','L');
INSERT INTO edges values('F','C');
INSERT INTO edges values('F','G');
INSERT INTO edges values('F','S');
INSERT INTO edges values('F','M');
INSERT INTO edges values('F','N');
INSERT INTO edges values('E','O');
INSERT INTO edges values('E','J');
INSERT INTO edges values('G','K');
INSERT INTO edges values('G','B');
INSERT INTO edges values('G','L');
INSERT INTO edges values('G','E');
INSERT INTO edges values('G','M');
INSERT INTO edges values('G','N');
INSERT INTO edges values('G','O');
INSERT INTO edges values('G','J');
INSERT INTO edges values('H','K');
INSERT INTO edges values('H','B');
INSERT INTO edges values('H','L');
INSERT INTO edges values('H','C');
INSERT INTO edges values('H','D');
INSERT INTO edges values('H','E');
INSERT INTO edges values('H','F');
INSERT INTO edges values('H','G');
INSERT INTO edges values('H','M');
INSERT INTO edges values('H','P');
INSERT INTO edges values('H','Q');
INSERT INTO edges values('H','R');
INSERT INTO edges values('H','O');
INSERT INTO edges values('H','J');
INSERT INTO edges values('I','K');
INSERT INTO edges values('I','B');
INSERT INTO edges values('I','L');
INSERT INTO edges values('I','C');
INSERT INTO edges values('I','D');
INSERT INTO edges values('I','E');
INSERT INTO edges values('I','F');
INSERT INTO edges values('I','G');
INSERT INTO edges values('I','H');
INSERT INTO edges values('I','M');
INSERT INTO edges values('I','P');
INSERT INTO edges values('I','Q');
INSERT INTO edges values('I','N');
INSERT INTO edges values('I','R');
INSERT INTO edges values('I','O');
INSERT INTO edges values('I','J');
INSERT INTO edges values('J','K');
INSERT INTO edges values('J','L');
INSERT INTO edges values('J','N');
INSERT INTO edges values('J','O');
INSERT INTO edges values('K','N');
INSERT INTO edges values('K','J');
INSERT INTO edges values('L','K');
INSERT INTO edges values('L','N');
INSERT INTO edges values('L','J');
INSERT INTO edges values('M','K');
INSERT INTO edges values('M','B');
INSERT INTO edges values('M','L');
INSERT INTO edges values('M','C');
INSERT INTO edges values('M','E');
INSERT INTO edges values('M','F');
INSERT INTO edges values('M','G');
INSERT INTO edges values('M','N');
INSERT INTO edges values('M','O');
INSERT INTO edges values('M','J');
INSERT INTO edges values('M','F');
INSERT INTO edges values('N','J');
INSERT INTO edges values('O','K');
INSERT INTO edges values('O','L');
INSERT INTO edges values('O','N');
INSERT INTO edges values('O','J');
INSERT INTO edges values('P','K');
INSERT INTO edges values('P','A');
INSERT INTO edges values('P','B');
INSERT INTO edges values('P','L');
INSERT INTO edges values('P','T');
INSERT INTO edges values('P','C');
INSERT INTO edges values('P','D');
INSERT INTO edges values('P','E');
INSERT INTO edges values('P','F');
INSERT INTO edges values('P','G');
INSERT INTO edges values('P','H');
INSERT INTO edges values('P','I');
INSERT INTO edges values('P','N');
INSERT INTO edges values('P','R');
INSERT INTO edges values('P','O');
INSERT INTO edges values('P','J');
INSERT INTO edges values('Q','H');
INSERT INTO edges values('Q','I');
INSERT INTO edges values('Q','D');
INSERT INTO edges values('Q','E');
INSERT INTO edges values('Q','F');
INSERT INTO edges values('Q','G');
INSERT INTO edges values('Q','N');
INSERT INTO edges values('Q','L');
INSERT INTO edges values('Q','K');
INSERT INTO edges values('Q','R');
INSERT INTO edges values('Q','T');
INSERT INTO edges values('Q','A');
INSERT INTO edges values('Q','C');
INSERT INTO edges values('Q','B');
INSERT INTO edges values('Q','O');
INSERT INTO edges values('Q','J');
INSERT INTO edges values('R','K');
INSERT INTO edges values('R','B');
INSERT INTO edges values('R','L');
INSERT INTO edges values('R','C');
INSERT INTO edges values('R','G');
INSERT INTO edges values('R','M');
INSERT INTO edges values('R','N');
INSERT INTO edges values('R','O');
INSERT INTO edges values('R','J');
INSERT INTO edges values('S','K');
INSERT INTO edges values('S','B');
INSERT INTO edges values('S','L');
INSERT INTO edges values('S','C');
INSERT INTO edges values('S','E');
INSERT INTO edges values('S','F');
INSERT INTO edges values('S','G');
INSERT INTO edges values('S','N');
INSERT INTO edges values('S','O');
INSERT INTO edges values('S','J');
INSERT INTO edges values('T','K');
INSERT INTO edges values('T','A');
INSERT INTO edges values('T','B');
INSERT INTO edges values('T','L');
INSERT INTO edges values('T','C');
INSERT INTO edges values('T','D');
INSERT INTO edges values('T','E');
INSERT INTO edges values('T','F');
INSERT INTO edges values('T','G');
INSERT INTO edges values('T','H');
INSERT INTO edges values('T','I');
INSERT INTO edges values('T','M');
INSERT INTO edges values('T','P');
INSERT INTO edges values('T','Q');
INSERT INTO edges values('T','R');
INSERT INTO edges values('T','O');
INSERT INTO edges values('T','J');
Script:
select tail, head
from edges
connect by nocycle
prior head = tail
start with head = 'A';
答案1
得分: 3
从CONNECT_BY_ISCYCLE
文档中提到:
CONNECT_BY_ISCYCLE
伪列
CONNECT_BY_ISCYCLE
伪列返回1,如果当前行有一个子行,该子行也是其祖先。否则返回0。只有在
CONNECT BY
子句的NOCYCLE
参数已指定时才能指定CONNECT_BY_ISCYCLE
。NOCYCLE
使Oracle能够返回一个查询的结果,否则该查询由于数据中的CONNECT BY
循环而失败。
因此,根据文档:
NOCYCLE
允许Oracle返回一个查询的结果,否则该查询由于数据中的CONNECT BY
循环而失败。
而循环发生在:
当前行有一个子行,该子行也是其祖先。
因此,带有NOCYCLE
子句的分层查询将返回层次结构的所有行,并且在到达循环时(即已经访问的层次结构中的祖先)停止向下遍历。
如果有以下示例数据:
CREATE TABLE table_name (parent_id, id) AS
SELECT NULL, 'A' FROM DUAL UNION ALL
SELECT 'A', 'B' FROM DUAL UNION ALL
SELECT 'B', 'C' FROM DUAL UNION ALL
SELECT 'C', 'A' FROM DUAL UNION ALL
SELECT 'C', 'D' FROM DUAL UNION ALL
SELECT 'D', 'B' FROM DUAL UNION ALL
SELECT 'C', 'E' FROM DUAL UNION ALL
SELECT 'E', 'F' FROM DUAL;
那么以下查询:
SELECT parent_id,
id,
SYS_CONNECT_BY_PATH(id, '-') AS path,
CONNECT_BY_ISCYCLE
FROM table_name
START WITH
parent_id IS NULL
CONNECT BY NOCYCLE
PRIOR id = parent_id
ORDER SIBLINGS BY id;
将输出:
PARENT_ID | ID | PATH | CONNECT_BY_ISCYCLE |
---|---|---|---|
null | A | -A | 0 |
A | B | -A-B | 0 |
B | C | -A-B-C | 1 |
C | D | -A-B-C-D | 1 |
C | E | -A-B-C-E | 0 |
E | F | -A-B-C-E-F | 0 |
NOCYCLE
子句防止显示行C, A
和D, B
,因为它们连接到已经在层次结构中访问过的祖先(由CONNECT BY
子句中使用的id
列定义),并将CONNECT_BY_IS_CYCLE
设置为1
,分别在父行B, C
和C, D
上(分别)因为它们具有循环后代。
如果您想显示这些循环行,可以包括自连接以确保获取层次结构中的下一行:
SELECT p.parent_id,
p.id AS current_id,
n.id AS next_id,
SYS_CONNECT_BY_PATH(p.id, '-') || '-' || n.id AS path,
CONNECT_BY_ISCYCLE
FROM table_name p
INNER JOIN table_name n
ON (p.id = n.parent_id)
START WITH
p.parent_id IS NULL
CONNECT BY NOCYCLE
PRIOR n.ROWID = p.ROWID
ORDER SIBLINGS BY p.id, n.id;
这将输出:
PARENT_ID | CURRENT_ID | NEXT_ID | PATH | CONNECT_BY_ISCYCLE |
---|---|---|---|---|
null | A | B | -A-B | 0 |
A | B | C | -A-B-C | 0 |
B | C | A | -A-B-C-A | 1 |
B | C | D | -A-B-C-D | 0 |
C | D | B | -A-B-C-D-B | 1 |
B | C | E | -A-B-C-E | 0 |
C | E | F | -A-B-C-E-F | 0 |
这正确地识别了层次结构中的A, B, C, A
和A, B, C, D, B
循环,并且NOCYCLE
子句再次允许SQL引擎处理数据而不会陷入无限循环。在这种情况下,使用p.ROWID
伪列(实际上是指向数据文件中行位置的指针)检测到循环,但INNER JOIN
允许获取层次结构中的下一行,并将包括具有循环的行以及之前的行,并会在层次结构的下一次迭代中检测到该循环。
英文:
From the CONNECT_BY_ISCYCLE
documentation (which is linked from the documentation linked in your question):
> ### CONNECT_BY_ISCYCLE
Pseudocolumn
>
> The CONNECT_BY_ISCYCLE
pseudocolumn returns 1 if the current row has a child which is also its ancestor. Otherwise it returns 0.
>
> You can specify CONNECT_BY_ISCYCLE
only if you have specified the NOCYCLE
parameter of the CONNECT BY
clause. NOCYCLE
enables Oracle to return the results of a query that would otherwise fail because of a CONNECT BY
loop in the data.
Therefore, as documented:
> NOCYCLE
enables Oracle to return the results of a query that would otherwise fail because of a CONNECT BY
loop in the data.
and a cycle occurs when:
> the current row has a child which is also its ancestor.
Therefore, a hierarchical query with the NOCYCLE
clause will return all the rows of the hierarchy and will stop descending the hierachy when it would reach a cycle (i.e. an ancestor in the hierarchy that has already been visited).
If you have the sample data:
CREATE TABLE table_name (parent_id, id) AS
SELECT NULL, 'A' FROM DUAL UNION ALL
SELECT 'A', 'B' FROM DUAL UNION ALL
SELECT 'B', 'C' FROM DUAL UNION ALL
SELECT 'C', 'A' FROM DUAL UNION ALL
SELECT 'C', 'D' FROM DUAL UNION ALL
SELECT 'D', 'B' FROM DUAL UNION ALL
SELECT 'C', 'E' FROM DUAL UNION ALL
SELECT 'E', 'F' FROM DUAL;
Then the query:
SELECT parent_id,
id,
SYS_CONNECT_BY_PATH(id, '-') AS path,
CONNECT_BY_ISCYCLE
FROM table_name
START WITH
parent_id IS NULL
CONNECT BY NOCYCLE
PRIOR id = parent_id
ORDER SIBLINGS BY id;
Outputs:
PARENT_ID | ID | PATH | CONNECT_BY_ISCYCLE |
---|---|---|---|
null | A | -A | 0 |
A | B | -A-B | 0 |
B | C | -A-B-C | 1 |
C | D | -A-B-C-D | 1 |
C | E | -A-B-C-E | 0 |
E | F | -A-B-C-E-F | 0 |
The NOCYCLE
clause prevents the rows C, A
and D, B
from being displayed as they connect to an ancestor (as defined by the id
column used in the CONNECT BY
clause) that has already been visited in the hierarchy and sets CONNECT_BY_IS_CYCLE
to 1
on the parent rows B, C
and C, D
(respectively) as they have cyclic descendants.
If you want to display those cyclic rows then you can include a self-join to ensure that you get the next row in the hierarchy:
SELECT p.parent_id,
p.id AS current_id,
n.id AS next_id,
SYS_CONNECT_BY_PATH(p.id, '-') || '-' || n.id AS path,
CONNECT_BY_ISCYCLE
FROM table_name p
INNER JOIN table_name n
ON (p.id = n.parent_id)
START WITH
p.parent_id IS NULL
CONNECT BY NOCYCLE
PRIOR n.ROWID = p.ROWID
ORDER SIBLINGS BY p.id, n.id;
Which outputs:
PARENT_ID | CURRENT_ID | NEXT_ID | PATH | CONNECT_BY_ISCYCLE |
---|---|---|---|---|
null | A | B | -A-B | 0 |
A | B | C | -A-B-C | 0 |
B | C | A | -A-B-C-A | 1 |
B | C | D | -A-B-C-D | 0 |
C | D | B | -A-B-C-D-B | 1 |
B | C | E | -A-B-C-E | 0 |
C | E | F | -A-B-C-E-F | 0 |
Which accurately identifies the A, B, C, A
and A, B, C, D, B
cycles in the hierarchy and the NOCYCLE
clause, again, allows the SQL engine to process the data without getting into an infinite loop. In this case, the cycle is detected using the p.ROWID
pseudo-column (which is effectively a pointer to the row's location in the data file) but the INNER JOIN
allows the next row in the hierarchy to be fetched and will include the row with the cycle alongside the previous row and would detect that the cycle in the next iteration of the hierarchy.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论