英文:
Extract distinct graph-linked values from two columns
问题
output
1
5
2
4
8
英文:
ColumnA | CoumnB |
---|---|
1 | 5 |
2 | 1 |
3 | 10 |
4 | 8 |
1 | 4 |
1 | 5 |
2 | 5 |
In this,
- 1 is link with 5
- 5 is link with 2
- 2 is link with 1
- 1 is link with 4
- 4 is link with 8
- 1 is link with 5
- 5 is link with 2
So the output what I want is distinct values that are linked with each other
output |
---|
1 |
5 |
2 |
4 |
8 |
答案1
得分: 1
以下是翻译好的内容:
第一段代码:
一种试图解决这个问题的方法是以下步骤:
- 提取不同的配对组合
- 计算第一列数值的频率
- 提取每个第一列出现超过一次或在第二列中至少出现一次的配对
- 将数据线性化
每个步骤都在一个单独的公共表达式(CTE)中完成。
WITH cte AS (
SELECT DISTINCT LEAST(ColumnA, ColumnB) AS col1,
GREATEST(ColumnA, ColumnB) AS col2
FROM tab
), cte2 AS (
SELECT col1, col2,
COUNT(col1) OVER(PARTITION BY col1) AS cnt_col1
FROM cte t1
), cte3 AS (
SELECT * FROM cte2 WHERE cnt_col1 > 1
OR EXISTS(SELECT 1 FROM cte2 t2 WHERE cte2.col1 = t2.col2)
)
SELECT col1 FROM cte3 UNION SELECT col2 FROM cte3
第二段代码:
如果您需要从特定索引开始,并且正在寻找图中所有链接节点,您需要使用递归查询:
- 基本步骤:提取特定id的<ColumnA, ColumnB>的唯一记录
- 递归步骤:提取cte.ColumnB = tab.ColumnA上连续的记录。
然后只需要在输出中收集ColumnB的不同值。在以下情况中,ColumnA = 1,如果您想更改它,需要更改基本步骤的筛选条件中的值 WHERE LEAST(ColumnA, ColumnB) = <your_value>
。
WITH cte AS (
SELECT DISTINCT LEAST(ColumnA, ColumnB) AS ColumnA,
GREATEST(ColumnA, ColumnB) AS ColumnB
FROM tab
WHERE LEAST(ColumnA, ColumnB) = 1
UNION ALL
SELECT tab.ColumnA, tab.ColumnB
FROM cte
INNER JOIN tab
ON cte.ColumnB = tab.ColumnA
)
SELECT DISTINCT ColumnB FROM cte
输出(对于id = 1):
col1 |
---|
1 |
2 |
4 |
5 |
8 |
在此处查看演示链接。
英文:
One approach that attempts to solve this problem is the following:
- extract distinct combinations of your pairs
- count the frequency of first column values
- extract each pairs whose first column appears more than once, or that appears in the second column at least once
- linearize your data
Each of these steps is done in a separate cte.
WITH cte AS (
SELECT DISTINCT LEAST(ColumnA, ColumnB) AS col1,
GREATEST(ColumnA, ColumnB) AS col2
FROM tab
), cte2 AS (
SELECT col1, col2,
COUNT(col1) OVER(PARTITION BY col1) AS cnt_col1
FROM cte t1
), cte3 AS (
SELECT * FROM cte2 WHERE cnt_col1 > 1
OR EXISTS(SELECT 1 FROM cte2 t2 WHERE cte2.col1 = t2.col2)
)
SELECT col1 FROM cte3 UNION SELECT col2 FROM cte3
If you instead need to begin from a specific index and are looking for all linked nodes in the graph, you need a recursive query:
- Base step: extracts unique records of <ColumnA, ColumnB> for a specific id
- Recursive step: extracts consecutive records on cte.ColumnB = tab.ColumnA.
Then it's just sufficient to gather distinct values of ColumnB in the output. In the following case ColumnA = 1, if you want to change it, you need to change the value in the filtering condition of the base step WHERE LEAST(ColumnA, ColumnB) = <your_value>
WITH cte AS (
SELECT DISTINCT LEAST(ColumnA, ColumnB) AS ColumnA,
GREATEST(ColumnA, ColumnB) AS ColumnB
FROM tab
WHERE LEAST(ColumnA, ColumnB) = 1
UNION ALL
SELECT tab.ColumnA, tab.ColumnB
FROM cte
INNER JOIN tab
ON cte.ColumnB = tab.ColumnA
)
SELECT DISTINCT ColumnB FROM cte
Output (for id = 1):
col1 |
---|
1 |
2 |
4 |
5 |
8 |
Check the demo here.
答案2
得分: 0
以下是您要翻译的代码部分:
我猜以下是您要查找的内容,但我认为您的示例数据在期望的输出方面不正确:
DROP TABLE IF EXISTS #TempTable
CREATE TABLE #TempTable (
ColumnA INT,
ColumnB INT
)
INSERT INTO #TempTable (ColumnA, ColumnB)
VALUES (1, 5), (2, 1), (3, 10), (4, 8), (1, 4), (1, 5), (2, 5)
SELECT group_id
,ColumnA
,ColumnB
,COUNT(*)
FROM
(
SELECT 1 as group_id,*
FROM #TempTable
UNION ALL
SELECT 2,ColumnB, ColumnA
FROM #TempTable
) DS
GROUP BY group_id
,ColumnA
,ColumnB
HAVING COUNT(*) = 2
该想法是将数据合并为两个组,并仅获取具有计数2的记录 - 意味着我们有x-y和y-x的情况。
上述查询的输出如下:
[![enter image description here][1]][1]
然后,通过另一个嵌套查询或CTE,您可以获取所需的列。
注意:由于您的请求,我仅提供了代码的中文翻译,不包括其他内容。
英文:
I guess the following is what you are looking for, but I believe your sample data is not correct in respect of the desired output:
DROP TABLE IF EXISTS #TempTable
CREATE TABLE #TempTable (
ColumnA INT,
ColumnB INT
)
INSERT INTO #TempTable (ColumnA, ColumnB)
VALUES (1, 5), (2, 1), (3, 10), (4, 8), (1, 4), (1, 5), (2, 5)
SELECT group_id
,ColumnA
,ColumnB
,COUNT(*)
FROM
(
SELECT 1 as group_id,*
FROM #TempTable
UNION ALL
SELECT 2,ColumnB, ColumnA
FROM #TempTable
) DS
GROUP BY group_id
,ColumnA
,ColumnB
HAVING COUNT(*) = 2
The idea is to unite the data in two groups and get only this records with count 2 - meaning we have x-y and y-x scenario.
The output of the above query is:
From there with another nested query or CTE you can get only the columns you want.
答案3
得分: 0
根据简单的数据和预期输出,我认为这是您要找的:
with cte as (
select ColumnA, ColumnB
from mytable
UNION ALL
SELECT t.ColumnA, t.ColumnB
from mytable t
inner join cte c on c.ColumnB = t.ColumnA
)
select ColumnA AS linked_values
from cte
group by ColumnA
having count(*) > 1
union
select ColumnB
from cte
group by ColumnB
having count(*) > 1
结果:
linked_values
1
2
4
5
8
英文:
Bsed on the simple data and the expected output, I think this is what you are looking for :
with cte as (
select ColumnA, ColumnB
from mytable
UNION ALL
SELECT t.ColumnA, t.ColumnB
from mytable t
inner join cte c on c.ColumnB = t.ColumnA
)
select ColumnA AS linked_values
from cte
group by ColumnA
having count(*) > 1
union
select ColumnB
from cte
group by ColumnB
having count(*) > 1
Result :
linked_values
1
2
4
5
8
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论