英文:
Merge arrays in a specific order without duplicates with group_by
问题
我有一个名为user_groups
的表,其中包含以下列:
- usr_id - 这是用户ID。
- groups - 这是一个包含给定组系列中组标识符的数组。
- priority - 优先级。
我需要一个查询,该查询将返回一个包含所有用户组合并到一个用户行的数组。
例如,具有id = 1的用户在"user_groups"中有两个条目,一个包含组系列:[1, 2, 3, 5]
,优先级为1,另一个包含组系列:[2, 5, 10, 12]
,优先级为2。
对于这个用户,结果应该是一个数组,其中包含较低优先级的组列表首先出现,即[1, 2, 3, 5]
,然后应添加具有较高优先级的每个组系列,需要考虑到这些组的标识符不能重复,这意味着对于优先级为2的条目,不再添加组2
和5
,因为这些组已经在较低优先级组的系列中包含在内。希望这不会令人困惑。
为此,可以创建以下视图:
CREATE VIEW merged_user_groups AS
SELECT usr_id, ARRAY_AGG(DISTINCT grp) AS groups
FROM (
SELECT DISTINCT ON (usr_id, grp) usr_id, grp
FROM (
SELECT unnest(groups) AS grp, usr_id, priority
FROM user_groups
) AS subquery
ORDER BY usr_id, priority
) AS distinct_groups
GROUP BY usr_id;
这将创建一个名为merged_user_groups
的视图,其中包含合并后的用户组信息。
如果您在第一行中更改优先级以使其最高,然后再次将其更改为第二行,则合并将按照第二行,然后第一行的顺序进行。结果应该是:
usr_id | groups |
---|---|
1 | {2, 5, 10, 12, 1, 3} |
这应该满足您的需求。请注意,这个解决方案基于PostgreSQL 13.11。
英文:
I have a table user_groups
containing the following columns:
- usr_id - this is the user ID.
- groups - this is an array with group identifiers in a given group family.
- priority - priority.
I need a query that will return an array containing all user groups merged into one row per user.
For example, a user with id = 1 has two entries in "user_groups" one containing a family of groups: [1, 2, 3, 5]
with priority = 1, the other containing a family of groups: [2, 5, 10, 12]
, where priority = 2.
The result for this user should be an array with a list of groups with a lower priority first, i.e. [1, 2, 3, 5]
and then each family with a higher priority should be added, taking into account that the identifiers of these groups cannot be repeated, means is that for a priority 2 entry, I no longer add groups 2
and 5
, because these were already included earlier in the family of lower priority groups. I hope it's not confusing.
View for that:
usr_id | groups | priority |
---|---|---|
1 | {1, 2, 3, 5} | 1 |
1 | {2, 5, 10, 12} | 2 |
Result should be:
usr_id | groups |
---|---|
1 | {1, 2, 3, 5, 10, 12} |
If we changed priority in 1 row to highest than in second row then the merge should happen in the order first second row, then first row. The result should be:
usr_id | groups |
---|---|
1 | {2, 5, 10, 12, 1, 3} |
I've already tried unpacking the array with unnest
and then using array_agg
with ORDER BY
to combine it back together, but that merges the duplicate elements:
SELECT
usr_id, ARRAY_AGG(grp ORDER BY prior_)
FROM (SELECT unnest("groups") AS grp, * FROM user_groups ) AS "groups"
GROUP BY usr_id
So I came up with the idea to combine these arrays with the DISTINCT clause in array_agg, but I cannot use it with ORDER BY (Get an error: SQL Error [42P10]: ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
), and removing ORDER BY
causes me to lose the order that matters.
Version of PostgreSQL: 13.11
Any ideas?
答案1
得分: 1
Rewriting your query will result in your desired outcome.
基本问题是我们需要筛选出重复的数字并保持优先级不变,因为所有重复的数字应该出现在较低的优先级中,我们需要通过 GROUP BY usr_id, grp 来消除重复的数字,然后为每个数字选择最小的优先级。
CREATE TABLE user_groups
("usr_id" int, "groups" integer[], "priority" int)
;
INSERT INTO user_groups
("usr_id", "groups", "priority")
VALUES
(1, '{1, 2, 3, 5}', 3),
(1, '{2, 5, 10, 12}', 2),
(2, '{1, 2, 3, 5}', 1),
(2, '{2, 5, 10, 12}', 2)
;
WITH CTE AS
(SELECT usr_id, grp, MIN(priority) priority FROM user_groups, unnest("groups") AS grp
GROUP BY usr_id, grp)
SELECT usr_id, ARRAY_AGG(grp ORDER BY priority, grp) AS grps
FROM CTE
GROUP BY usr_id
结果如下:
usr_id | grps |
---|---|
1 | {1,2,3,5,10,12} |
2 | {1,2,3,5,10,12} |
英文:
Rewriting ypur query will result in your wanted result
the basic problem is we need to filter out double numbers and keep still the priority, as all doubled numbers should appear in the lower priority, we need to eliminate the double numbers with GROUP By usr_id,grp for all users, and the choose for every number the smallest priority
CREATE TABLE user_groups
("usr_id" int, "groups" integer[], "priority" int)
;
INSERT INTO user_groups
("usr_id", "groups", "priority")
VALUES
(1, '{1, 2, 3, 5}', 3),
(1, '{2, 5, 10, 12}', 2),
(2, '{1, 2, 3, 5}', 1),
(2, '{2, 5, 10, 12}', 2)
;
> status
> CREATE TABLE
>
> status
> INSERT 0 4
>
WITH CTE As
(SELECT usr_id,grp, MIN(priority) priority FROM user_groups,unnest("groups") AS grp
GROUP By usr_id,grp)
SELECT usr_id, ARRAY_AGG( grp ORDER BY "priority",grp) as grps
FROM CTE
GROUP BY usr_id
usr_id | grps |
---|---|
1 | {2,5,10,12,1,3} |
2 | {1,2,3,5,10,12} |
> ``` status | |
> SELECT 2 | |
> ``` |
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论