SQL: Select Count Of Instances Up To Each Current Instance For Each Category In Column Ordered By Date

Question

I have a table that includes a date column and a discrete category column (username).
I'd like to get the number of times each user has shown up in the table, added as a new column.
For each row, the result should indicate how many times that user has shown up in the table up to the date in the date column.

For example:

Id      Date      Username
----    ------    --------
6age    1/3/23    User1
8s9a    1/3/23    User1
d29j    1/3/23    User2
fj48    1/2/23    User1
39k9    1/2/23    User3
a8j3    1/1/23    User1
0ao2    1/1/23    User1
ajd6    1/1/23    User1
am50    1/1/23    User2
amv8    1/1/23    User3

Desired output:

Id      Date      Username    Instance
----    ------    --------    --------
6age    1/3/23    User1       6
8s9a    1/3/23    User1       5
d29j    1/3/23    User2       2
fj48    1/2/23    User1       4
39k9    1/2/23    User3       2
a8j3    1/1/23    User1       3
0ao2    1/1/23    User1       2
ajd6    1/1/23    User1       1
am50    1/1/23    User2       1
amv8    1/1/23    User3       1

I am working with SQLite. The database has 22 million rows, so I am looking for an efficient approach. Unfortunately, the Id is not numeric, though I know I can work around that with row_number().

Answer 1

Score: 2


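A working solution takes two steps:

  • add the new column to the table's schema
  • update each row by matching on its unique id, setting the new column to a computed row number (partitioned by user, ordered by date and then by id)

Note that UPDATE ... FROM requires SQLite 3.33.0 or later, and window functions require SQLite 3.25.0 or later.

-- Step 1: add the column that will hold the running count.
ALTER TABLE tab ADD COLUMN Instance INT;

-- Step 2: number each user's rows by date (ties broken by id, descending)
-- and copy that number onto the row with the matching id.
UPDATE tab
SET Instance = cte.rn
FROM (SELECT id,
             ROW_NUMBER() OVER(PARTITION BY Username ORDER BY Date, id DESC) AS rn
      FROM tab) cte
WHERE tab.id = cte.id;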

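Output:

Id      Date      Username    Instance
----    ------    --------    --------
6age    1/3/23    User1       6
8s9a    1/3/23    User1       5
d29j    1/3/23    User2       2
fj48    1/2/23    User1       4
39k9    1/2/23    User3       2
a8j3    1/1/23    User1       2
0ao2    1/1/23    User1       3
ajd6    1/1/23    User1       1
am50    1/1/23    User2       1
amv8    1/1/23    User3       1

Rows that share a user and a date are numbered by id in descending order, so a8j3 and 0ao2 receive 2 and 3 here rather than the 3 and 2 shown in the question's desired output.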

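If altering a 22-million-row table is undesirable, the same numbering can be computed on the fly with a plain SELECT. The following is only a minimal sketch using Python's built-in sqlite3 module: the sample rows come from the question and the table name `tab` from the answer, while the function name `running_instances`, the index name `idx_tab_user_date`, and the final ORDER BY are assumptions made for illustration. Whether the index actually speeds up the window function depends on the query planner and should be checked with EXPLAIN QUERY PLAN on the real data.

```python
import sqlite3

# Sample rows from the question (Id, Date, Username).
ROWS = [
    ("6age", "1/3/23", "User1"),
    ("8s9a", "1/3/23", "User1"),
    ("d29j", "1/3/23", "User2"),
    ("fj48", "1/2/23", "User1"),
    ("39k9", "1/2/23", "User3"),
    ("a8j3", "1/1/23", "User1"),
    ("0ao2", "1/1/23", "User1"),
    ("ajd6", "1/1/23", "User1"),
    ("am50", "1/1/23", "User2"),
    ("amv8", "1/1/23", "User3"),
]

# Same window definition as the answer's UPDATE, but read-only.
QUERY = """
SELECT Id, Date, Username,
       ROW_NUMBER() OVER (PARTITION BY Username
                          ORDER BY Date, Id DESC) AS Instance
FROM tab
ORDER BY Date DESC, Username, Instance DESC
"""


def running_instances(rows):
    """Load rows into an in-memory table and return them with Instance."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (Id TEXT, Date TEXT, Username TEXT)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
    # Hypothetical index matching the window's partition and ordering;
    # it may let SQLite avoid a separate sort step on a large table.
    conn.execute(
        "CREATE INDEX idx_tab_user_date ON tab (Username, Date, Id DESC)"
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_instances(ROWS):
        print(row)
```

One caveat applies to both this sketch and the UPDATE above: ORDER BY Date compares the values as stored. If the dates are text in M/D/YY form, they sort as strings, which happens to work for this sample but would place 1/10/23 before 1/2/23; storing dates in an ISO format such as YYYY-MM-DD avoids that.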

huangapple
  • Posted on May 18, 2023, 00:28:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76274274.html