英文:
Write a query to return parent location base on level (p2)
问题
在我的之前的问题中,我可以解决如何获取位置层次结构的问题。但是,当尝试与系统集成时,似乎出现了另一个问题。
所以我有两个表格,如下:
位置(约20,000条记录)
id | 名称 | 是否激活 | level1 | level2 | level3 | level4 | |
---|---|---|---|---|---|---|---|
1 | A | true | A | null | null | null | |
2 | A > B | true | A | B | null | null | |
3 | A > B > C | true | A | B | C | null | |
4 | A > B > C > D | true | A | B | C | D | |
5 | X | true | X | null | null | null |
日志(约5百万条记录)
id | 名称 | location_id | |
---|---|---|---|
1 | 日志001 | 3 | |
2 | 日志002 | 5 |
所以目标是我想获取只有日志的位置层次结构。
问题是当我尝试与日志表连接时,性能存在问题。查询似乎需要超过10分钟。
以下是我的查询样式demo
我尝试了这个查询
SELECT *
FROM location
START WITH id IN (SELECT DISTINCT location_id FROM log_response)
CONNECT BY PRIOR name
= name || '>' || COALESCE(PRIOR level4, PRIOR level3, PRIOR level2)
ORDER BY NAME
或者
WITH cte(location_id) AS (
SELECT DISTINCT location_id FROM log_response
)
SELECT *
FROM location
START WITH id IN (SELECT location_id FROM cte)
CONNECT BY PRIOR name
= name || '>' || COALESCE(PRIOR level4, PRIOR level3, PRIOR level2)
ORDER BY NAME
附注:在上面的示例中,查询的期望结果应该返回具有id在(1、2、3、5)的位置。
id | 名称 | 是否激活 | level1 | level2 | level3 | level4 | |
---|---|---|---|---|---|---|---|
1 | A | true | A | null | null | null | |
2 | A > B | true | A | B | null | null | |
3 | A > B > C | true | A | B | C | null | |
5 | X | true | X | null | null | null |
非常感谢!
英文:
At the my previous question, I can solve how to fetch the location hierarchy. But It seems I have another issue when try to integrate with the system
so I have 2 tables like
location (~20k records)
id | Name | is_active | level1 | level2 | level3 | level4 | |
---|---|---|---|---|---|---|---|
1 | A | true | A | null | null | null | |
2 | A>B | true | A | B | null | null | |
3 | A>B>C | true | A | B | C | null | |
4 | A>B>C>D | true | A | B | C | D | |
5 | X | true | X | null | null | null |
log (~5m records)
id | Name | location_id | |
---|---|---|---|
1 | log 001 | 3 | |
2 | log 002 | 5 |
So the goal is I would like to fetch the location hierarchy which has log only.
The problem is when I try to join with the log table, It has a performance issue. The query seems to take 10+ minutes
Here is how my query looks like demo
I'm trying this query
SELECT *
FROM location
START WITH id IN (SELECT DISTINCT location_id FROM log_response)
CONNECT BY PRIOR name
= name || '>' || COALESCE(PRIOR level4, PRIOR level3, PRIOR level2)
ORDER BY NAME
or
WITH cte(location_id) AS (
SELECT DISTINCT location_id FROM log_response
)
SELECT *
FROM location
START WITH id IN (SELECT location_id FROM cte)
CONNECT BY PRIOR name
= name || '>' || COALESCE(PRIOR level4, PRIOR level3, PRIOR level2)
ORDER BY NAME
p/s The expectation of the query on the above example should return the location have id in (1,2,3,5)
id | Name | is_active | level1 | level2 | level3 | level4 | |
---|---|---|---|---|---|---|---|
1 | A | true | A | null | null | null | |
2 | A>B | true | A | B | null | null | |
3 | A>B>C | true | A | B | C | null | |
5 | X | true | X | null | null | null |
Thank you very much!
答案1
得分: 0
你可以使用 EXISTS
(而不是 IN
和 DISTINCT
):
SELECT *
FROM location l
START WITH EXISTS(SELECT 1 FROM log WHERE location_id = l.id)
CONNECT BY name = SUBSTR(PRIOR name, 1, INSTR(PRIOR name, '>', -1) - 1)
对于示例数据:
CREATE TABLE location (id, Name, is_active, level1, level2, level3, level4) AS
SELECT 1, 'A', 'true', 'A', null, null, null FROM DUAL UNION ALL
SELECT 2, 'A>B', 'true', 'A', 'B', null, null FROM DUAL UNION ALL
SELECT 3, 'A>B>C', 'true', 'A', 'B', 'C', null FROM DUAL UNION ALL
SELECT 4, 'A>B>C>D', 'true', 'A', 'B', 'C', 'D' FROM DUAL UNION ALL
SELECT 5, 'X', 'true', 'X', null, null, null FROM DUAL;
CREATE TABLE log (id, Name, location_id) AS
SELECT 1, 'log 001', 3 FROM DUAL UNION ALL
SELECT 2, 'log 002', 5 FROM DUAL;
输出:
ID | NAME | IS_ACTIVE | LEVEL1 | LEVEL2 | LEVEL3 | LEVEL4 |
---|---|---|---|---|---|---|
3 | A>B>C | true | A | B | C | null |
2 | A>B | true | A | B | null | null |
1 | A | true | A | null | null | null |
5 | X | true | X | null | null | null |
或:
SELECT DISTINCT *
FROM location l
START WITH EXISTS(SELECT 1 FROM log WHERE location_id = l.id)
CONNECT BY
PRIOR level1 = level1
AND ( ( level2 IS NULL AND level3 IS NULL AND level4 IS NULL
AND PRIOR level2 IS NOT NULL AND PRIOR level3 IS NULL AND PRIOR level4 IS NULL
)
OR ( PRIOR level2 = level2
AND ( ( level3 IS NULL AND level4 IS NULL
AND PRIOR level3 IS NOT NULL AND PRIOR level4 IS NULL
)
OR ( PRIOR level3 = level3
AND level4 IS NULL AND PRIOR level4 IS NOT NULL
)
)
)
)
英文:
You can use EXISTS
(rather than IN
and DISTINCT
):
SELECT *
FROM location l
START WITH EXISTS(SELECT 1 FROM log WHERE location_id = l.id)
CONNECT BY name = SUBSTR(PRIOR name, 1, INSTR(PRIOR name, '>', -1) - 1)
Which, for the sample data:
CREATE TABLE location (id, Name, is_active, level1, level2, level3, level4) AS
SELECT 1, 'A', 'true', 'A', null, null, null FROM DUAL UNION ALL
SELECT 2, 'A>B', 'true', 'A', 'B', null, null FROM DUAL UNION ALL
SELECT 3, 'A>B>C', 'true', 'A', 'B', 'C', null FROM DUAL UNION ALL
SELECT 4, 'A>B>C>D', 'true', 'A', 'B', 'C', 'D' FROM DUAL UNION ALL
SELECT 5, 'X', 'true', 'X', null, null, null FROM DUAL;
CREATE TABLE log (id, Name, location_id) AS
SELECT 1, 'log 001', 3 FROM DUAL UNION ALL
SELECT 2, 'log 002', 5 FROM DUAL;
Outputs:
ID | NAME | IS_ACTIVE | LEVEL1 | LEVEL2 | LEVEL3 | LEVEL4 |
---|---|---|---|---|---|---|
3 | A>B>C | true | A | B | C | null |
2 | A>B | true | A | B | null | null |
1 | A | true | A | null | null | null |
5 | X | true | X | null | null | null |
Or:
SELECT DISTINCT *
FROM location l
START WITH EXISTS(SELECT 1 FROM log WHERE location_id = l.id)
CONNECT BY
PRIOR level1 = level1
AND ( ( level2 IS NULL AND level3 IS NULL AND level4 IS NULL
AND PRIOR level2 IS NOT NULL AND PRIOR level3 IS NULL AND PRIOR level4 IS NULL
)
OR ( PRIOR level2 = level2
AND ( ( level3 IS NULL AND level4 IS NULL
AND PRIOR level3 IS NOT NULL AND PRIOR level4 IS NULL
)
OR ( PRIOR level3 = level3
AND level4 IS NULL AND PRIOR level4 IS NOT NULL
)
)
)
)
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论