如何将组标识符合并为单个组?

huangapple go评论62阅读模式
英文:

How can I combine Group Identifiers into Single Group?

问题

I have a dataset as follows:

;WITH CTE AS
( SELECT *
  FROM (VALUES
(1, 10, 20, 30)
(2, 10, 21, 31)
(3, 11, 21, 31)
(4, 12, 22, 32)
(5, 13, 23, 33)
(6, 14, 24, 33)
(7, 14, 25, 34)
(8, 15, 26, 36)
) AS MyValues(ID, GroupID1, GroupID2, GroupID3)
)
SELECT *
FROM CTE

How can I generate the following collapsing the individual groups into a single group?

ID|SingleGroupID
--|--
1| 1
2| 1
3| 1
4| 2
5| 3
6| 3
7| 3
8| 4
英文:

I have a dataset as follows:

;WITH CTE AS
( SELECT *
  FROM (VALUES
(1, 10, 20, 30)
(2, 10, 21, 31)
(3, 11, 21, 31)
(4, 12, 22, 32)
(5, 13, 23, 33)
(6, 14, 24, 33)
(7, 14, 25, 34)
(8, 15, 26, 36)
) AS MyValues(ID, GroupID1, GroupID2, GroupID3)
)
SELECT *
FROM CTE

How can I generate the following collapsing the individual groups into a single group?

ID SingleGroupID
1 1
2 1
3 1
4 2
5 3
6 3
7 3
8 4

答案1

得分: 5

这是一个典型的图遍历问题,你需要构建具有至少一个共同组的id的岛屿。

我们可以通过将组展开为节点列表,并使用自连接生成边(连接具有相同组的id)。然后,我们可以递归遍历这些边,同时跟踪我们所走过的路径。最后一步是按id进行聚合。

所以:

with 
    nodes as (
        select t.id, v.grp
        from mytable t
        cross apply ( values (t.GroupID1), (t.GroupID2), (t.GroupID3) ) v(grp)
    ),
    edges as (
        select distinct n1.id as id1, n2.id as id2
        from nodes n1
        inner join nodes n2 on n1.grp = n2.grp
    ),
    rec as (
        select id1, id2, cast(id1 as nvarchar(max)) as visited from edges
        union all
        select r.id1, e.id2, concat(r.visited, ',', e.id2)
        from rec r
        inner join edges e on e.id1 = r.id2
        where concat(',', r.visited, ',', e.id2, ',') not like concat('%,', e.id2, ',%')
    ),
    fin as (
        select id1, min(value) min_id
        from rec r
        cross apply string_split(r.visited, ',')
        group by id1
    )
select id1 as id, dense_rank() over(order by min_id) grp
from fin f

演示链接

英文:

This is a typical graph-walking problem, where you want to build islands of ids that have at least one group in common.

We can start by unpivoting the groups to build a list of nodes, and generate edges with a self-join (edges connect ids that share the same group). We can then recursively traverse the edges, while keeping track of the path we followed until there. The last step is to aggregate by id.

So:

with 
    nodes as (
        select t.id, v.grp
        from mytable t
        cross apply ( values (t.GroupID1), (t.GroupID2), (t.GroupID3) ) v(grp)
    ),
    edges as (
        select distinct n1.id as id1, n2.id as id2
        from nodes n1
        inner join nodes n2 on n1.grp = n2.grp
    ),
    rec as (
        select id1, id2, cast(id1 as nvarchar(max)) as visited from edges
        union all
        select r.id1, e.id2, concat(r.visited, ',', e.id2)
        from rec r
        inner join edges e on e.id1 = r.id2
        where concat(',', r.visited, ',') not like concat('%,', e.id2, ',%')
    ),
    fin as (
        select id1, min(value) min_id
        from rec r
        cross apply string_split(r.visited, ',')
        group by id1
    )
select id1 as id, dense_rank() over(order by min_id) grp
from fin f
id grp
1 1
2 1
3 1
4 2
5 3
6 3
7 3
8 4

fiddle

huangapple
  • 本文由 发表于 2023年5月17日 21:26:38
  • 转载请务必保留本文链接:https://go.coder-hub.com/76272634.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定