Merge arrays in a specific order without duplicates with group_by.

huangapple go评论84阅读模式
英文:

Merge arrays in a specific order without duplicates with group_by

问题

我有一个名为user_groups的表,其中包含以下列:

  • usr_id - 这是用户ID。
  • groups - 这是一个包含给定组系列中组标识符的数组。
  • priority - 优先级。

我需要一个查询,该查询将返回一个包含所有用户组合并到一个用户行的数组。
例如,具有id = 1的用户在"user_groups"中有两个条目,一个包含组系列:[1, 2, 3, 5],优先级为1,另一个包含组系列:[2, 5, 10, 12],优先级为2。

对于这个用户,结果应该是一个数组,其中包含较低优先级的组列表首先出现,即[1, 2, 3, 5],然后应添加具有较高优先级的每个组系列,需要考虑到这些组的标识符不能重复,这意味着对于优先级为2的条目,不再添加组25,因为这些组已经在较低优先级组的系列中包含在内。希望这不会令人困惑。

为此,可以创建以下视图:

CREATE VIEW merged_user_groups AS
SELECT usr_id, ARRAY_AGG(DISTINCT grp) AS groups
FROM (
    SELECT DISTINCT ON (usr_id, grp) usr_id, grp
    FROM (
        SELECT unnest(groups) AS grp, usr_id, priority
        FROM user_groups
    ) AS subquery
    ORDER BY usr_id, priority
) AS distinct_groups
GROUP BY usr_id;

这将创建一个名为merged_user_groups的视图,其中包含合并后的用户组信息。

如果您在第一行中更改优先级以使其最高,然后再次将其更改为第二行,则合并将按照第二行,然后第一行的顺序进行。结果应该是:

usr_id groups
1 {2, 5, 10, 12, 1, 3}

这应该满足您的需求。请注意,这个解决方案基于PostgreSQL 13.11。

英文:

I have a table user_groups containing the following columns:

  • usr_id - this is the user ID.
  • groups - this is an array with group identifiers in a given group family.
  • priority - priority.

I need a query that will return an array containing all user groups merged into one row per user.
For example, a user with id = 1 has two entries in "user_groups" one containing a family of groups: [1, 2, 3, 5] with priority = 1, the other containing a family of groups: [2, 5, 10, 12], where priority = 2.

The result for this user should be an array with a list of groups with a lower priority first, i.e. [1, 2, 3, 5] and then each family with a higher priority should be added, taking into account that the identifiers of these groups cannot be repeated, means is that for a priority 2 entry, I no longer add groups 2 and 5, because these were already included earlier in the family of lower priority groups. I hope it's not confusing.

View for that:

usr_id groups priority
1 {1, 2, 3, 5} 1
1 {2, 5, 10, 12} 2

Result should be:

usr_id groups
1 {1, 2, 3, 5, 10, 12}

If we changed priority in 1 row to highest than in second row then the merge should happen in the order first second row, then first row. The result should be:

usr_id groups
1 {2, 5, 10, 12, 1, 3}

I've already tried unpacking the array with unnest and then using array_agg with ORDER BY to combine it back together, but that merges the duplicate elements:

SELECT 
    usr_id, ARRAY_AGG(grp ORDER BY prior_) 
FROM (SELECT unnest("groups") AS grp, * FROM user_groups ) AS "groups" 
GROUP BY usr_id 

So I came up with the idea to combine these arrays with the DISTINCT clause in array_agg, but I cannot use it with ORDER BY (Get an error: SQL Error [42P10]: ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list), and removing ORDER BY causes me to lose the order that matters.

Version of PostgreSQL: 13.11

Any ideas?

答案1

得分: 1

Rewriting your query will result in your desired outcome.

基本问题是我们需要筛选出重复的数字并保持优先级不变,因为所有重复的数字应该出现在较低的优先级中,我们需要通过 GROUP BY usr_id, grp 来消除重复的数字,然后为每个数字选择最小的优先级。

CREATE TABLE user_groups
    ("usr_id" int, "groups" integer[], "priority" int)
;

INSERT INTO user_groups
    ("usr_id", "groups", "priority")
VALUES
    (1, '{1, 2, 3, 5}', 3),
    (1, '{2, 5, 10, 12}', 2),
    (2, '{1, 2, 3, 5}', 1),
    (2, '{2, 5, 10, 12}', 2)
;
WITH CTE AS
  (SELECT usr_id, grp, MIN(priority) priority FROM user_groups, unnest("groups") AS grp
   GROUP BY usr_id, grp)
SELECT usr_id, ARRAY_AGG(grp ORDER BY priority, grp) AS grps
FROM CTE
GROUP BY usr_id

结果如下:

usr_id grps
1 {1,2,3,5,10,12}
2 {1,2,3,5,10,12}

fiddle

英文:

Rewriting ypur query will result in your wanted result

the basic problem is we need to filter out double numbers and keep still the priority, as all doubled numbers should appear in the lower priority, we need to eliminate the double numbers with GROUP By usr_id,grp for all users, and the choose for every number the smallest priority

CREATE TABLE user_groups
    ("usr_id" int, "groups" integer[], "priority" int)
;
    
INSERT INTO user_groups
    ("usr_id", "groups", "priority")
VALUES
    (1, '{1, 2, 3, 5}', 3),
    (1, '{2, 5, 10, 12}', 2),
      (2, '{1, 2, 3, 5}', 1),
    (2, '{2, 5, 10, 12}', 2)
;

> status
> CREATE TABLE
>

> status
> INSERT 0 4
>

WITH CTE As
  (SELECT  usr_id,grp, MIN(priority) priority FROM user_groups,unnest("groups") AS grp
GROUP By usr_id,grp)
SELECT usr_id, ARRAY_AGG( grp ORDER BY "priority",grp) as grps
  FROM CTE
GROUP BY usr_id
usr_id grps
1 {2,5,10,12,1,3}
2 {1,2,3,5,10,12}
> ``` status
> SELECT 2
> ```

fiddle

huangapple
  • 本文由 发表于 2023年6月13日 03:46:09
  • 转载请务必保留本文链接:https://go.coder-hub.com/76459850.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定