Get max and min value's unique id while using group by
Question
I have a table with the columns unique_id, product, root_location, and price, holding more than 50 million records.
I want the result to contain product, min_price, min_price_unique_id, max_price, max_price_unique_id.
My query:
select product
     , min(price) as min_price
     , max(price) as max_price
from mytable
group by product
How do I get the unique ids of the min and max prices?
Answer 1
Score: 2
You could try RANK and STRING_AGG with conditional aggregation, as follows:
SELECT product,
       MIN(price) AS min_price,
       STRING_AGG(CASE WHEN rn1 = 1 THEN unique_id END, ',') min_price_unique_id,
       MAX(price) AS max_price,
       STRING_AGG(CASE WHEN rn2 = 1 THEN unique_id END, ',') max_price_unique_id
FROM
(
    SELECT *,
           RANK() OVER (PARTITION BY product ORDER BY price) rn1,
           RANK() OVER (PARTITION BY product ORDER BY price DESC) rn2
    FROM tbl_name
) T
WHERE rn1 = 1 OR rn2 = 1
GROUP BY product
Update:
To return distinct unique_id values when duplicate rows exist, you could add another subquery or CTE that applies ROW_NUMBER partitioned by product and unique_id, then keep only the rows where row_num = 1.
WITH CTE1 AS
(
    SELECT *,
           RANK() OVER (PARTITION BY product ORDER BY price) rn1,
           RANK() OVER (PARTITION BY product ORDER BY price DESC) rn2
    FROM tbl_name
),
CTE2 AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY product, unique_id ORDER BY unique_id) row_num
    FROM CTE1
)
SELECT product,
       MIN(price) AS min_price,
       STRING_AGG(CASE WHEN rn1 = 1 THEN unique_id END, ',') min_price_unique_id,
       MAX(price) AS max_price,
       STRING_AGG(CASE WHEN rn2 = 1 THEN unique_id END, ',') max_price_unique_id
FROM CTE2
WHERE (rn1 = 1 OR rn2 = 1) AND row_num = 1
GROUP BY product
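To see what the RANK plus conditional-aggregation idea returns when prices tie, here is a minimal sketch run against an in-memory SQLite database from Python. The sample rows are made up for illustration, and GROUP_CONCAT stands in for STRING_AGG because that is the aggregate SQLite is known to provide; it also assumes the bundled SQLite is 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical sample rows: (unique_id, product, root_location, price).
rows = [
    (1, "apple", "north", 3.0),
    (2, "apple", "south", 1.0),
    (3, "apple", "east", 1.0),   # ties with id 2 for the minimum price
    (4, "apple", "west", 5.0),
    (5, "pear", "north", 2.0),   # only row for pear: both min and max
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(unique_id INTEGER, product TEXT, root_location TEXT, price REAL)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

# Same shape as the answer's query; GROUP_CONCAT skips the NULLs that the
# CASE expressions produce for rows that are not the min (or max) rank.
query = """
SELECT product,
       MIN(price) AS min_price,
       GROUP_CONCAT(CASE WHEN rn1 = 1 THEN unique_id END) AS min_price_unique_id,
       MAX(price) AS max_price,
       GROUP_CONCAT(CASE WHEN rn2 = 1 THEN unique_id END) AS max_price_unique_id
FROM (
    SELECT *,
           RANK() OVER (PARTITION BY product ORDER BY price) AS rn1,
           RANK() OVER (PARTITION BY product ORDER BY price DESC) AS rn2
    FROM mytable
) t
WHERE rn1 = 1 OR rn2 = 1
GROUP BY product
ORDER BY product
"""

result = {}
for product, min_price, min_ids, max_price, max_ids in conn.execute(query):
    # The concatenation order is not guaranteed, so sort the ids.
    result[product] = {
        "min_price": min_price,
        "min_ids": sorted(int(i) for i in min_ids.split(",")),
        "max_price": max_price,
        "max_ids": sorted(int(i) for i in max_ids.split(",")),
    }

print(result)
```

Because RANK gives tied rows the same rank, both ids at the lowest apple price end up in the min list, which is the behaviour to expect from the answer's query as well.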
Answer 2
Score: 1
select a.*, b.unique_id as min_price_unique_id, c.unique_id as max_price_unique_id
from (
    select product
         , min(price) as min_price
         , max(price) as max_price
    from mytable
    group by product
) a
join mytable b
  on a.product = b.product
 and a.min_price = b.price
join mytable c
  on a.product = c.product
 and a.max_price = c.price
This query joins the aggregated subquery a back to the original table mytable twice to get the unique ids of the minimum and maximum prices. The subquery calculates the max and min price for each product; the first join finds the unique id of the row at the minimum price, and the second finds the unique id of the row at the maximum price. Both joins match on product as well as price, so that a price from one product cannot match a row that belongs to another product. If several rows tie for a product's min or max price, the result contains one row per combination.
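A quick way to check the join conditions is to run the query on a few rows where two products share a price. The sketch below does that with Python's sqlite3 module; the sample data and the build_query helper are hypothetical and exist only for this comparison. It contrasts joining on price alone with joining on product and price.

```python
import sqlite3

# Hypothetical sample rows: (unique_id, product, root_location, price).
# The price 4.0 is apple's maximum and also pear's minimum.
rows = [
    (1, "apple", "north", 1.0),
    (2, "apple", "south", 4.0),
    (3, "pear", "north", 4.0),
    (4, "pear", "south", 9.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(unique_id INTEGER, product TEXT, root_location TEXT, price REAL)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)


def build_query(match_product):
    """Build the double self-join, optionally matching on product too."""
    b_extra = "AND a.product = b.product" if match_product else ""
    c_extra = "AND a.product = c.product" if match_product else ""
    return f"""
    SELECT a.product, a.min_price, b.unique_id, a.max_price, c.unique_id
    FROM (
        SELECT product, MIN(price) AS min_price, MAX(price) AS max_price
        FROM mytable
        GROUP BY product
    ) a
    JOIN mytable b ON a.min_price = b.price {b_extra}
    JOIN mytable c ON a.max_price = c.price {c_extra}
    ORDER BY a.product, b.unique_id, c.unique_id
    """


price_only = conn.execute(build_query(False)).fetchall()
with_product = conn.execute(build_query(True)).fetchall()

print("price only:  ", price_only)
print("with product:", with_product)
```

With the price-only join, each product picks up the other product's row at 4.0, so four rows come back instead of two. Adding the product match returns one row per product here.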
Answer 3
Score: 1
It really depends on what you want from unique_id. When several unique_id values share the min or max price, do you want to return all of them or just one?
Depending on your requirement, use row_number() or dense_rank(). Check the documentation for the details of these two window functions.
The query below returns one unique_id per product for each of the min and max price.
select product,
       min_price           = max(case when min_rn = 1 then price end),
       min_price_unique_id = max(case when min_rn = 1 then unique_id end),
       max_price           = max(case when max_rn = 1 then price end),
       max_price_unique_id = max(case when max_rn = 1 then unique_id end)
from
(
    select *,
           min_rn = row_number() over (partition by product order by price),
           max_rn = row_number() over (partition by product order by price desc)
    from mytable
) t
group by product
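The difference between the two window functions on tied prices can be seen with a small sketch like the one below, run through Python's sqlite3 module on made-up rows. It assumes SQLite 3.25 or newer for window-function support, and it adds unique_id as a tie-breaker in the row_number() ordering (an addition not in the answer) so that the chosen row is deterministic.

```python
import sqlite3

# Hypothetical sample rows: ids 1 and 2 tie for the lowest price.
rows = [
    (1, "apple", "north", 1.0),
    (2, "apple", "south", 1.0),
    (3, "apple", "east", 7.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(unique_id INTEGER, product TEXT, root_location TEXT, price REAL)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

# row_number() numbers tied rows 1, 2, ...; dense_rank() gives ties the same rank.
query = """
SELECT unique_id,
       ROW_NUMBER() OVER (PARTITION BY product ORDER BY price, unique_id) AS rn,
       DENSE_RANK() OVER (PARTITION BY product ORDER BY price) AS dr
FROM mytable
ORDER BY unique_id
"""
ranked = conn.execute(query).fetchall()

# Filtering on rn = 1 keeps a single id; filtering on dr = 1 keeps every tied id.
row_number_ids = [uid for uid, rn, dr in ranked if rn == 1]
dense_rank_ids = [uid for uid, rn, dr in ranked if dr == 1]

print(ranked)
print(row_number_ids, dense_rank_ids)
```

So row_number() suits the "just one id" requirement, while dense_rank() (or rank()) is the one to filter on when every tied id should be returned.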
Edit: if you want to show all the corresponding unique_id values for the min and max price, here is one way. Use two separate queries to identify the min-price and max-price rows, then combine them with a full outer join (as the min and max price may have different numbers of unique_id rows).
with
min_price as
(
    select product, min_price = price, unique_id,
           rn = row_number() over (partition by product order by unique_id)
    from
    (
        select *,
               min_rn = dense_rank() over (partition by product order by price)
        from mytable
    ) m
    where min_rn = 1
),
max_price as
(
    select product, max_price = price, unique_id,
           rn = row_number() over (partition by product order by unique_id)
    from
    (
        select *,
               max_rn = dense_rank() over (partition by product order by price desc)
        from mytable
    ) m
    where max_rn = 1
)
-- coalesce keeps product filled in on rows that exist only on the max side
select product = coalesce(mn.product, mx.product),
       mn.min_price,
       min_price_unique_id = mn.unique_id,
       mx.max_price,
       max_price_unique_id = mx.unique_id
from min_price mn
full outer join max_price mx on mn.product = mx.product
                            and mn.rn = mx.rn
order by product, mn.unique_id, mx.unique_id