Aggregating a column only on distinct values of another column in SQL - with some constraints
Question
Suppose we have a table like the following:
ID | Cost | Color | Date |
---|---|---|---|
1 | 5 | Green | 1 |
1 | 5 | Black | 1 |
1 | 10 | Green | 2 |
1 | 10 | Black | 2 |
2 | 8 | Red | 3 |
3 | 7 | Yellow | 2 |
3 | 6 | Black | 3 |
ID may repeat, while the cost associated with each ID on a given Date is always the same - that is, only Color differs. I would like to sum up the Cost for a given ID, across the Dates.
Ordinarily, you'd write a query similar to:
SELECT inner.ID, SUM(inner.Cost)
FROM (SELECT ID, Cost FROM Table GROUP BY ID, Cost) AS inner
GROUP BY inner.ID;
Which would return a result like this:
ID | SUM(Cost) |
---|---|
1 | 15 |
2 | 8 |
3 | 13 |
However, suppose we want to also get the Color in the same query - in other words, we want a result like:
ID | SUM(Cost) | Color |
---|---|---|
1 | 15 | Green |
1 | 15 | Black |
2 | 8 | Red |
3 | 13 | Yellow |
3 | 13 | Black |
If we select Color from the inner select like:
SELECT inner.ID, SUM(inner.Cost), inner.Color
FROM (SELECT ID, Cost, Color FROM Table GROUP BY ID, Cost, Color) AS inner
GROUP BY inner.ID, inner.Color;
The SUM(Cost) for ID = 1 then becomes 30, as it sums up the rows with both Green and Black. I haven't run the exact SQL listed, so hopefully the intent still comes through despite any errors. To sum up: is there a way to select Color while having SUM ignore the extra Color rows?
I think the SELECT without Color could be joined with the Table on ID to pull in Color separately - unfortunately, I'm constrained in this area, working with a query generation engine and this would take significant rework. I can't really use DISTINCT on the inner query either, because the rows I want aren't duplicates. One might also suggest that the data should be laid out differently in the first place, but that's similarly outside my immediate control. Lastly, splitting it into two separate queries likely isn't an option either. Given these constraints, are there any potential solutions I've overlooked? Maybe some clever WHERE filter I can apply on the SUM?
Answer 1
Score: 1
There are a few possible approaches; one is to use a correlated subquery:
select distinct id, color,
(select Sum(distinct cost) from t t2 where t2.id = t.id) as SumCost
from t;
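To check the correlated-subquery approach against the sample table from the question, here is a minimal sketch that loads those rows into an in-memory SQLite database through Python's sqlite3 module. SQLite and the ORDER BY clause are assumptions added to make the demo reproducible; the answer itself does not name a database engine.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, cost INTEGER, color TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 5, "Green", 1),
        (1, 5, "Black", 1),
        (1, 10, "Green", 2),
        (1, 10, "Black", 2),
        (2, 8, "Red", 3),
        (3, 7, "Yellow", 2),
        (3, 6, "Black", 3),
    ],
)

# The answer's query; ORDER BY is added only so the output order is stable.
query = """
SELECT DISTINCT id, color,
       (SELECT SUM(DISTINCT cost) FROM t AS t2 WHERE t2.id = t.id) AS SumCost
FROM t
ORDER BY id, color
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data, each (id, color) pair comes back with the per-ID totals 15, 8, and 13. Note that SUM(DISTINCT cost) relies on an ID never having the same cost on two different dates; if it did, the repeated cost would be counted only once.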
Answer 2
Score: 0
with data as (
    select distinct Id, Cost, Date
    from T
), agg as (
    select Id, sum(Cost) as AggregateCost
    from data
    group by Id
)
select *
from T t
inner join agg a on a.Id = t.Id;
Because it deduplicates on (Id, Cost, Date) rather than on Cost alone, this doesn't assume that an ID's costs are unique across dates.
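The difference from SUM(DISTINCT Cost) can be seen with a small sketch, again assuming SQLite through Python's sqlite3 module. The rows for ID 4 are hypothetical and not part of the question's data; they were added so that the same cost occurs on two dates. The final SELECT is also narrowed from `select *` to DISTINCT Id, Color, AggregateCost to match the output shape the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Id INTEGER, Cost INTEGER, Color TEXT, Date INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?)",
    [
        # Sample rows from the question.
        (1, 5, "Green", 1), (1, 5, "Black", 1),
        (1, 10, "Green", 2), (1, 10, "Black", 2),
        (2, 8, "Red", 3),
        (3, 7, "Yellow", 2), (3, 6, "Black", 3),
        # Hypothetical rows: ID 4 has the same cost (5) on two dates.
        (4, 5, "Green", 1), (4, 5, "Black", 1),
        (4, 5, "Green", 2), (4, 5, "Black", 2),
    ],
)

# Deduplicate per (Id, Cost, Date), sum per Id, then join the colors back in.
cte_query = """
WITH data AS (
    SELECT DISTINCT Id, Cost, Date FROM T
), agg AS (
    SELECT Id, SUM(Cost) AS AggregateCost FROM data GROUP BY Id
)
SELECT DISTINCT t.Id, t.Color, a.AggregateCost
FROM T AS t
INNER JOIN agg AS a ON a.Id = t.Id
ORDER BY t.Id, t.Color
"""
cte_rows = conn.execute(cte_query).fetchall()
for row in cte_rows:
    print(row)

# For comparison: SUM(DISTINCT Cost) collapses the two dates of ID 4 into one.
distinct_sum = conn.execute(
    "SELECT SUM(DISTINCT Cost) FROM T WHERE Id = 4"
).fetchone()[0]
print("SUM(DISTINCT Cost) for Id 4:", distinct_sum)
```

For the hypothetical ID 4, the CTE version reports 10 (5 on each of two dates), while SUM(DISTINCT Cost) reports 5.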