从两列中提取不同的与图链接的值。

huangapple go评论68阅读模式
英文:

Extract distinct graph-linked values from two columns

问题

output

1
5
2
4
8

英文:
ColumnA CoumnB
1 5
2 1
3 10
4 8
1 4
1 5
2 5

In this,

  • 1 is link with 5
  • 5 is link with 2
  • 2 is link with 1
  • 1 is link with 4
  • 4 is link with 8
  • 1 is link with 5
  • 5 is link with 2

So the output what I want is distinct values that are linked with each other

output
1
5
2
4
8

答案1

得分: 1

以下是翻译好的内容:

第一段代码:

一种试图解决这个问题的方法是以下步骤:

  • 提取不同的配对组合
  • 计算第一列数值的频率
  • 提取每个第一列出现超过一次或在第二列中至少出现一次的配对
  • 将数据线性化

每个步骤都在一个单独的公共表达式(CTE)中完成。

WITH cte AS (
    SELECT DISTINCT LEAST(ColumnA, ColumnB)    AS col1, 
                    GREATEST(ColumnA, ColumnB) AS col2
    FROM tab
), cte2 AS (
    SELECT col1, col2, 
           COUNT(col1) OVER(PARTITION BY col1) AS cnt_col1
    FROM cte t1
), cte3 AS (
    SELECT * FROM cte2 WHERE cnt_col1 > 1 
                          OR EXISTS(SELECT 1 FROM cte2 t2 WHERE cte2.col1 = t2.col2)
)
SELECT col1 FROM cte3 UNION SELECT col2 FROM cte3

第二段代码:

如果您需要从特定索引开始,并且正在寻找图中所有链接节点,您需要使用递归查询:

  • 基本步骤:提取特定id的<ColumnA, ColumnB>的唯一记录
  • 递归步骤:提取cte.ColumnB = tab.ColumnA上连续的记录。

然后只需要在输出中收集ColumnB的不同值。在以下情况中,ColumnA = 1,如果您想更改它,需要更改基本步骤的筛选条件中的值 WHERE LEAST(ColumnA, ColumnB) = &lt;your_value&gt;

WITH cte AS (
    SELECT DISTINCT LEAST(ColumnA, ColumnB)    AS ColumnA,
                    GREATEST(ColumnA, ColumnB) AS ColumnB
    FROM tab 
    WHERE LEAST(ColumnA, ColumnB) = 1

    UNION ALL

    SELECT tab.ColumnA, tab.ColumnB 
    FROM cte 
    INNER JOIN tab
            ON cte.ColumnB = tab.ColumnA
)
SELECT DISTINCT ColumnB FROM cte

输出(对于id = 1):

col1
1
2
4
5
8

在此处查看演示链接

英文:

One approach that attempts to solve this problem is the following:

  • extract distinct combinations of your pairs
  • count the frequency of first column values
  • extract each pairs whose first column appears more than once, or that appears in the second column at least once
  • linearize your data

Each of these steps is done in a separate cte.

WITH cte AS (
    SELECT DISTINCT LEAST(ColumnA, ColumnB)    AS col1, 
                    GREATEST(ColumnA, ColumnB) AS col2
    FROM tab
), cte2 AS (
    SELECT col1, col2, 
           COUNT(col1) OVER(PARTITION BY col1) AS cnt_col1
    FROM cte t1
), cte3 AS (
    SELECT * FROM cte2 WHERE cnt_col1 &gt; 1 
                          OR EXISTS(SELECT 1 FROM cte2 t2 WHERE cte2.col1 = t2.col2)
)
SELECT col1 FROM cte3 UNION SELECT col2 FROM cte3

If you instead need to begin from a specific index and are looking for all linked nodes in the graph, you need a recursive query:

  • Base step: extracts unique records of <ColumnA, ColumnB> for a specific id
  • Recursive step: extracts consecutive records on cte.ColumnB = tab.ColumnA.

Then it's just sufficient to gather distinct values of ColumnB in the output. In the following case ColumnA = 1, if you want to change it, you need to change the value in the filtering condition of the base step WHERE LEAST(ColumnA, ColumnB) = &lt;your_value&gt;

WITH cte AS (
    SELECT DISTINCT LEAST(ColumnA, ColumnB)    AS ColumnA,
                    GREATEST(ColumnA, ColumnB) AS ColumnB
    FROM tab 
    WHERE LEAST(ColumnA, ColumnB) = 1

    UNION ALL

    SELECT tab.ColumnA, tab.ColumnB 
    FROM cte 
    INNER JOIN tab
            ON cte.ColumnB = tab.ColumnA
)
SELECT DISTINCT ColumnB FROM cte

Output (for id = 1):

col1
1
2
4
5
8

Check the demo here.

答案2

得分: 0

以下是您要翻译的代码部分:

我猜以下是您要查找的内容,但我认为您的示例数据在期望的输出方面不正确:

    DROP TABLE IF EXISTS #TempTable
    
    CREATE TABLE #TempTable (
      ColumnA INT,
      ColumnB INT
    )
    
    
    INSERT INTO #TempTable (ColumnA, ColumnB)
    VALUES (1, 5), (2, 1), (3, 10), (4, 8), (1, 4), (1, 5), (2, 5)
    
    SELECT group_id
    	  ,ColumnA
    	  ,ColumnB
    	  ,COUNT(*)
    FROM
    (
    	SELECT 1 as group_id,*
    	FROM #TempTable
    	UNION ALL
    	SELECT 2,ColumnB, ColumnA
    	FROM #TempTable
    ) DS
    GROUP BY group_id
    	    ,ColumnA
    	    ,ColumnB
    HAVING COUNT(*) = 2

该想法是将数据合并为两个组,并仅获取具有计数2的记录 - 意味着我们有x-y和y-x的情况

上述查询的输出如下:

[![enter image description here][1]][1]

然后,通过另一个嵌套查询或CTE,您可以获取所需的列。

注意:由于您的请求,我仅提供了代码的中文翻译,不包括其他内容。

英文:

I guess the following is what you are looking for, but I believe your sample data is not correct in respect of the desired output:

DROP TABLE IF EXISTS #TempTable

CREATE TABLE #TempTable (
  ColumnA INT,
  ColumnB INT
)


INSERT INTO #TempTable (ColumnA, ColumnB)
VALUES (1, 5), (2, 1), (3, 10), (4, 8), (1, 4), (1, 5), (2, 5)

SELECT group_id
	  ,ColumnA
	  ,ColumnB
	  ,COUNT(*)
FROM
(
	SELECT 1 as group_id,*
	FROM #TempTable
	UNION ALL
	SELECT 2,ColumnB, ColumnA
	FROM #TempTable
) DS
GROUP BY group_id
	    ,ColumnA
	    ,ColumnB
HAVING COUNT(*) = 2

The idea is to unite the data in two groups and get only this records with count 2 - meaning we have x-y and y-x scenario.

The output of the above query is:

从两列中提取不同的与图链接的值。

From there with another nested query or CTE you can get only the columns you want.

答案3

得分: 0

根据简单的数据和预期输出,我认为这是您要找的:

with cte as (
  select  ColumnA, ColumnB
  from mytable
  UNION ALL
  SELECT t.ColumnA, t.ColumnB
  from mytable t
  inner join cte c on c.ColumnB = t.ColumnA
)
select ColumnA AS linked_values
from cte
group by ColumnA
having count(*) > 1
union 
select ColumnB
from cte
group by ColumnB
having count(*) > 1

结果:

linked_values
1
2
4
5
8
英文:

Bsed on the simple data and the expected output, I think this is what you are looking for :

with cte as (
  select  ColumnA, ColumnB
  from mytable
  UNION ALL
  SELECT t.ColumnA, t.ColumnB
  from mytable t
  inner join cte c on c.ColumnB = t.ColumnA
)
select ColumnA AS linked_values
from cte
group by ColumnA
having count(*) &gt; 1
union 
select ColumnB
from cte
group by ColumnB
having count(*) &gt; 1

Result :

linked_values
1
2
4
5
8

Demo here

huangapple
  • 本文由 发表于 2023年6月1日 15:50:39
  • 转载请务必保留本文链接:https://go.coder-hub.com/76379745.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定