June 13, 2023 03:46:09


Merge arrays in a specific order without duplicates with group_by

Question


I have a table user_groups containing the following columns:

usr_id - the user ID.
groups - an array of group identifiers belonging to one group family.
priority - the priority of that family.

I need a query that returns, for each user, a single row with an array containing all of that user's groups merged together.
For example, the user with id = 1 has two entries in "user_groups": one containing the group family [1, 2, 3, 5] with priority = 1, the other containing the group family [2, 5, 10, 12] with priority = 2.

The result for this user should be an array that starts with the groups of the family with the lowest priority value, i.e. [1, 2, 3, 5], followed by each family with a higher priority value, without repeating any group identifier. This means that for the priority 2 entry I no longer add groups 2 and 5, because they were already included by the lower-priority family. I hope it's not confusing.

Sample data:

usr_id	groups	priority
1	{1, 2, 3, 5}	1
1	{2, 5, 10, 12}	2

Result should be:

usr_id	groups
1	{1, 2, 3, 5, 10, 12}

If we changed the priority of the first row so that it is higher than that of the second row, the merge should take the second row first and then the first row. The result should be:

usr_id	groups
1	{2, 5, 10, 12, 1, 3}

I've already tried unpacking the array with unnest and then using array_agg with ORDER BY to combine it back together, but the result still contains the duplicate elements:

SELECT
    usr_id, ARRAY_AGG(grp ORDER BY priority)
FROM (SELECT unnest("groups") AS grp, * FROM user_groups) AS "groups"
GROUP BY usr_id

So I came up with the idea of using the DISTINCT clause in array_agg, but I cannot combine it with ORDER BY on a different column (I get the error: SQL Error [42P10]: ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list), and removing ORDER BY loses the order, which matters here.

Version of PostgreSQL: 13.11

Any ideas?

Answer 1

Score: 1


Rewriting your query as follows gives the result you want.

The basic problem is that we need to filter out duplicate numbers while still respecting the priority. Since every duplicated number should appear at its lowest priority, we eliminate the duplicates with GROUP BY usr_id, grp and choose the smallest priority for every number.

CREATE TABLE user_groups
    ("usr_id" int, "groups" integer[], "priority" int)
;

INSERT INTO user_groups
    ("usr_id", "groups", "priority")
VALUES
    (1, '{1, 2, 3, 5}', 3),
    (1, '{2, 5, 10, 12}', 2),
    (2, '{1, 2, 3, 5}', 1),
    (2, '{2, 5, 10, 12}', 2)
;

Status: CREATE TABLE

Status: INSERT 0 4

WITH CTE AS
  (SELECT usr_id, grp, MIN(priority) AS priority
   FROM user_groups, unnest("groups") AS grp
   GROUP BY usr_id, grp)
SELECT usr_id, ARRAY_AGG(grp ORDER BY "priority", grp) AS grps
FROM CTE
GROUP BY usr_id

usr_id	grps
1	{2,5,10,12,1,3}
2	{1,2,3,5,10,12}

Status: SELECT 2

fiddle
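
One thing to note about the query above: within a single priority it orders by the group value (ORDER BY "priority", grp), which matches the sample data only because each sample array is already sorted. If the position inside the original array matters, a possible variant is sketched below. It is not part of the original answer. It assumes priorities are unique per user, and the names first_seen, g and ord are made up for illustration.

```sql
-- Sketch: keep each element's position in its source array (ord),
-- so that ties within one priority follow array order, not numeric order.
WITH first_seen AS (
    -- For every (usr_id, grp) keep only the occurrence with the
    -- smallest priority; DISTINCT ON takes the first row per key
    -- according to the ORDER BY below.
    SELECT DISTINCT ON (ug.usr_id, g.grp)
           ug.usr_id, g.grp, ug.priority, g.ord
    FROM user_groups AS ug
    CROSS JOIN LATERAL unnest(ug."groups") WITH ORDINALITY AS g(grp, ord)
    ORDER BY ug.usr_id, g.grp, ug.priority, g.ord
)
SELECT usr_id,
       array_agg(grp ORDER BY priority, ord) AS grps
FROM first_seen
GROUP BY usr_id
ORDER BY usr_id;
```

For the sample rows above this should give the same arrays as the original query. The two only differ when an array is not sorted, e.g. '{5, 1, 3}' would stay in that order here instead of becoming 1, 3, 5.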
