How to get the first (top) row of each group in SQL
Question
I am using the SQL query below and want to get the first record of each group. I have used the `ROW_NUMBER()` function, but the result still shows all records of each group. Can anyone point me in the right direction to get the output I want?
```sql
WITH CTE_SUM AS
(
    SELECT
        FA.AuditIntID, FA.FileIntID,
        MAX(FA.AuditDate) AS Auditdaa,
        ROW_NUMBER() OVER (PARTITION BY FA.AuditIntID ORDER BY FA.AuditIntID DESC) AS rn
    FROM
        DBO.Audit FA WITH (NOLOCK)
    WHERE
        FA.FirmIntID = 1
    GROUP BY
        FA.AuditIntID, FA.FileIntID
)
SELECT
    CS.AuditIntID, MDC.EntityIntID, CS.Auditdaa,
    CS.fileintid, cs.rn
FROM
    CTE_SUM CS WITH (NOLOCK)
INNER JOIN
    [DBO].[Meta] MDC ON MDC.FileIntID = CS.FileIntID
WHERE
    rn = 1
ORDER BY
    MDC.EntityIntID,CS.Auditdaa DESC
```
Below is my sample data. I want to retrieve only the first row of each group. For example, I want the row with fileintid = 160051 for the first group and the row with fileintid = 320072 for the second, and so on; these are the first rows of their groups. The problem I can see is that rn shows the same value on every row of the output.
Output results
EntityIntID | Auditdaa | fileintid | rn |
---|---|---|---|
1 | 7/28/23 12:53 | 160051 | 1 |
1 | 7/27/23 9:49 | 380075 | 1 |
1 | 6/27/23 10:06 | 310073 | 1 |
1 | 6/27/23 9:48 | 310073 | 1 |
1 | 6/27/23 9:48 | 310073 | 1 |
1 | 6/27/23 9:46 | 310073 | 1 |
2 | 7/4/23 5:42 | 320072 | 1 |
2 | 6/27/23 11:25 | 310074 | 1 |
2 | 6/27/23 11:24 | 310074 | 1 |
2 | 6/27/23 11:23 | 140050 | 1 |
2 | 6/27/23 10:43 | 310074 | 1 |
2 | 6/27/23 10:43 | 310074 | 1 |
2 | 6/27/23 10:43 | 310074 | 1 |
2 | 6/27/23 9:44 | 310072 | 1 |
2 | 6/26/23 19:15 | 300073 | 1 |
2 | 6/26/23 19:13 | 300073 | 1 |
2 | 6/26/23 19:12 | 300073 | 1 |
2 | 6/26/23 19:09 | 120036 | 1 |
2 | 6/26/23 19:09 | 300073 | 1 |
2 | 6/26/23 19:09 | 300073 | 1 |
2 | 6/26/23 19:08 | 300073 | 1 |
2 | 6/26/23 19:08 | 120036 | 1 |
I expect the output below from the query above: only the top (first) record of each group.
EntityIntID | Auditdaa | fileintid | rn |
---|---|---|---|
1 | 7/28/23 12:53 | 160051 | 1 |
2 | 7/4/23 5:42 | 320072 | 1 |
Answer 1
Score: 2
It is unclear why you are using aggregation (`GROUP BY` / `MAX`) in your query. Your task description doesn't mention aggregating anything; you just want to pick certain rows, and that is what `ROW_NUMBER` in your query is for.
You want one result row per entityintid, so partition by entityintid when numbering your rows.
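```sql
-- cte_sum is the CTE from the question; prefix this statement with that WITH clause.
select entityintid, auditdaa, fileintid
from
(
  select
    m.entityintid,
    cs.auditdaa,  -- the CTE exposes max(AuditDate) under the alias Auditdaa
    m.fileintid,
    -- numbering restarts for every entityintid
    row_number() over (partition by m.entityintid order by cs.auditintid desc) as rn
  from [DBO].[Meta] m
  join cte_sum cs on cs.fileintid = m.fileintid
) with_rn
where rn = 1
order by entityintid;
```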
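To see why the partition column matters, here is a minimal, self-contained sketch. It assumes that `auditintid` is unique per audit row, which the question's output suggests but does not state. The `auditintid` values (101 to 104) are invented for illustration, and the `values` constructor stands in for the joined `Audit` and `Meta` tables.

```sql
-- Hypothetical sample rows; the auditintid values are invented for this sketch.
with audit_rows as
(
  select * from (values
    (101, 1, cast('2023-07-28T12:53:00' as datetime), 160051),
    (102, 1, cast('2023-07-27T09:49:00' as datetime), 380075),
    (103, 2, cast('2023-07-04T05:42:00' as datetime), 320072),
    (104, 2, cast('2023-06-27T11:25:00' as datetime), 310074)
  ) as v (auditintid, entityintid, auditdate, fileintid)
)
select
  entityintid,
  auditdate,
  fileintid,
  -- auditintid is unique here, so every partition holds one row and this is always 1.
  row_number() over (partition by auditintid order by auditintid desc) as rn_per_audit,
  -- Numbering restarts per entity, so only the newest row of each entity gets 1.
  row_number() over (partition by entityintid order by auditdate desc) as rn_per_entity
from audit_rows
order by entityintid, auditdate desc;
```

With these rows, `rn_per_audit` is 1 on every row, so filtering on it removes nothing, while `rn_per_entity` is 1 only for fileintid 160051 and 320072.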
Answer 2
Score: 0
A pattern for achieving what you want looks like this:
```sql
select <columns>
from yourtable firstrow
left join yourtable priorrow
  on <join condition>
where priorrow.someidentifier is null
group by <group criterias>
order by <order criterias>
```
`columns`: the actual columns you are interested in.
`join condition`: these conditions first ensure that `firstrow` and `priorrow` belong to the same group, and then apply the counter-intuitive but useful check that `priorrow` comes earlier than `firstrow` in your order. The check is counter-intuitive because you look, for every row, for rows that come earlier; the actual first row of a group is recognized by the fact that no `priorrow` exists that sorts earlier than `firstrow`. That is why the `where` clause tests that `priorrow` is nonexistent: only one `firstrow` per group, the one with no earlier `priorrow`, is allowed into the results.
`group criterias`: whatever criteria you need to group by.
`order criterias`: whatever criteria you want to sort by.
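As an illustration, the pattern could be filled in for the question's data roughly as follows. This is only a sketch: the `audit_rows` rows are hypothetical stand-ins for the joined `Audit` and `Meta` tables, and "earlier in the order" is taken to mean a later audit date, because the asker wants the newest row first.

```sql
-- Hypothetical sample rows standing in for the question's joined Audit/Meta data.
with audit_rows as
(
  select * from (values
    (1, cast('2023-07-28T12:53:00' as datetime), 160051),
    (1, cast('2023-07-27T09:49:00' as datetime), 380075),
    (2, cast('2023-07-04T05:42:00' as datetime), 320072),
    (2, cast('2023-06-27T11:25:00' as datetime), 310074)
  ) as v (entityintid, auditdate, fileintid)
)
select firstrow.entityintid, firstrow.auditdate, firstrow.fileintid
from audit_rows firstrow
left join audit_rows priorrow
  on  priorrow.entityintid = firstrow.entityintid  -- same group
  and priorrow.auditdate   > firstrow.auditdate    -- sorts before firstrow (newest first)
where priorrow.entityintid is null                 -- no such row: firstrow leads its group
order by firstrow.entityintid;
```

The `group by` from the template is left out here, because the anti-join already leaves one row per group as long as the ordering column has no ties within a group. If two rows share the newest `auditdate`, both are returned.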
Answer 3
Score: 0
I would partition by `EntityIntID` and order by `Auditdaa`:
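```sql
WITH cte_sum AS (
    SELECT
        MDC.EntityIntID,
        FA.AuditDate AS Auditdaa,
        FA.FileIntID AS fileintid,
        -- Restart the numbering for each entity, newest audit first.
        ROW_NUMBER() OVER (PARTITION BY MDC.EntityIntID ORDER BY FA.AuditDate DESC) AS r
    FROM
        DBO.Audit FA
        -- In the question's schema EntityIntID comes from Meta, so join it in.
        INNER JOIN DBO.Meta MDC ON MDC.FileIntID = FA.FileIntID
    WHERE
        FA.FirmIntID = 1
)
SELECT
    EntityIntID,
    Auditdaa,
    fileintid
FROM
    cte_sum
WHERE
    r = 1;
```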
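One thing to watch with this approach: when two rows in a partition share the same `Auditdaa`, `ROW_NUMBER()` may number them in either order, so the row that gets 1 is not guaranteed to be the same on every run. A minimal sketch of adding a tie-breaker follows. It assumes `fileintid` is an acceptable secondary sort key, which is an assumption and not something the answer specifies; the sample rows are hypothetical.

```sql
-- Hypothetical rows: the two newest rows of entity 1 share the same timestamp.
with audit_rows as
(
  select * from (values
    (1, cast('2023-06-27T09:48:00' as datetime), 310073),
    (1, cast('2023-06-27T09:48:00' as datetime), 310075),
    (1, cast('2023-06-27T09:46:00' as datetime), 310073)
  ) as v (EntityIntID, Auditdaa, fileintid)
),
numbered as
(
  select
    EntityIntID,
    Auditdaa,
    fileintid,
    -- fileintid desc breaks the tie, so the same row gets r = 1 on every run.
    row_number() over (partition by EntityIntID order by Auditdaa desc, fileintid desc) as r
  from audit_rows
)
select EntityIntID, Auditdaa, fileintid
from numbered
where r = 1;
```

With the tie-breaker in place, the row with fileintid 310075 is the one returned for these sample rows.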