英文:
Return all the results for the latest date
问题
我有一个表格,其中包含以下列:
city_id
= 城市IDcity_name
= 城市名称shape
= 多边形 -> 地理字段date
= 形状创建的日期
并且用以下示例数据填充:
city_id | city_name | shape | created_date |
---|---|---|---|
1 | abcd | POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) | 2023-04-24 |
1 | abcd | POLYGON((12.967481 103.314121, 12.947563293673372 103.3669285559756, 12.90735564458858 103.32701845345733, 12.962686563318833 103.30510874799613, 12.967481 103.314121)) | 2023-04-24 |
1 | abcd | POLYGON((14.967481 103.314121, 14.947563293673372 103.3669285559756, 14.90735564458858 103.32701845345733, 14.962686563318833 103.30510874799613, 14.967481 103.314121)) | 2023-04-24 |
2 | efgh | POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) | 2023-04-11 |
2 | efgh | POLYGON((8.967481 103.314121, 8.947563293673372 103.3669285559756, 8.90735564458858 103.32701845345733, 8.962686563318833 103.30510874799613, 8.967481 103.314121)) | 2023-04-11 |
2 | efgh | POLYGON((22.967481 103.314121, 22.947563293673372 103.3669285559756, 22.90735564458858 103.32701845345733, 22.962686563318833 103.30510874799613, 22.967481 103.314121)) | 2023-03-23 |
3 | ijkl | POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) | 2023-04-10 |
3 | ijkl | POLYGON((28.967481 103.314121, 28.947563293673372 103.3669285559756, 28.90735564458858 103.32701845345733, 28.962686563318833 103.30510874799613, 28.967481 103.314121)) | 2023-03-03 |
我已经编写了以下查询:
SELECT city_id,
city_name,
shape,
date
FROM (SELECT city_id,
city_name,
ANY_VALUE(shape) AS shape,
date,
ROW_NUMBER() OVER (PARTITION BY city_id, city_name ORDER BY date DESC) AS rank
FROM test
GROUP BY 1,2,4,5)
WHERE rank = 1
ORDER BY date DESC
当前结果:
city_id | city_name | shape | created_date |
---|---|---|---|
1 | abcd | POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) | 2023-04-24 |
2 | efgh | POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) | 2023-04-11 |
3 | ijkl | POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) | 2023-04-10 |
使用这个查询,我希望使用PARTITION
函数和rank = 1
来返回每个城市在同一天创建了多个形状的最新日期的所有记录。该查询对只有一个形状在最新日期创建的城市运行正常。
期望的输出:
city_id | city_name | shape | created_date |
---|---|---|---|
1 | abcd | POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) | 2023-04-24 |
1 | abcd | POLYGON((12.967481 103.314121, 12.947563293673372 103.3669285559756, 12.90735564458858 103.32701845345733, 12.962686563318833 103.30510874799613, 12.967481 103.314121)) | 2023-04-24 |
1 | abcd | POLYGON(( |
英文:
I have a table with the following columns:
city_id
= city IDcity_name
= city nameshape
= polygon -> geography fielddate
= date when the shape was created
and populated with the following sample data:
city_id | city_name | shape | created_date |
---|---|---|---|
1 | abcd | POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) | 2023-04-24 |
1 | abcd | POLYGON((12.967481 103.314121, 12.947563293673372 103.3669285559756, 12.90735564458858 103.32701845345733, 12.962686563318833 103.30510874799613, 12.967481 103.314121)) | 2023-04-24 |
1 | abcd | POLYGON((14.967481 103.314121, 14.947563293673372 103.3669285559756, 14.90735564458858 103.32701845345733, 14.962686563318833 103.30510874799613, 14.967481 103.314121)) | 2023-04-24 |
2 | efgh | POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) | 2023-04-11 |
2 | efgh | POLYGON((8.967481 103.314121, 8.947563293673372 103.3669285559756, 8.90735564458858 103.32701845345733, 8.962686563318833 103.30510874799613, 8.967481 103.314121)) | 2023-04-11 |
2 | efgh | POLYGON((22.967481 103.314121, 22.947563293673372 103.3669285559756, 22.90735564458858 103.32701845345733, 22.962686563318833 103.30510874799613, 22.967481 103.314121)) | 2023-03-23 |
3 | ijkl | POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) | 2023-04-10 |
3 | ijkl | POLYGON((28.967481 103.314121, 28.947563293673372 103.3669285559756, 28.90735564458858 103.32701845345733, 28.962686563318833 103.30510874799613, 28.967481 103.314121)) | 2023-03-03 |
I've crafted the following query:
SELECT city_id,
city_name,
shape,
date
FROM (SELECT city_id,
city_name,
ANY_VALUE(shape) AS shape,
date,
ROW_NUMBER() OVER (PARTITION BY city_id, city_name ORDER BY date DESC) AS rank
FROM test
GROUP BY 1,2,4,5)
WHERE rank = 1
ORDER BY date DESC
Current result:
city_id | city_name | shape | created_date |
---|---|---|---|
1 | abcd | POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) | 2023-04-24 |
2 | efgh | POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) | 2023-04-11 |
3 | ijkl | POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) | 2023-04-10 |
With this query, I was hoping that using the PARTITION
function and rank = 1
would return all the records for a city where several shapes have been created the same day, for the latest date. The query works fine for cities with only 1 shape created on the latest date.
Expected output:
city_id | city_name | shape | created_date |
---|---|---|---|
1 | abcd | POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) | 2023-04-24 |
1 | abcd | POLYGON((12.967481 103.314121, 12.947563293673372 103.3669285559756, 12.90735564458858 103.32701845345733, 12.962686563318833 103.30510874799613, 12.967481 103.314121)) | 2023-04-24 |
1 | abcd | POLYGON((14.967481 103.314121, 14.947563293673372 103.3669285559756, 14.90735564458858 103.32701845345733, 14.962686563318833 103.30510874799613, 14.967481 103.314121)) | 2023-04-24 |
2 | efgh | POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) | 2023-04-11 |
2 | efgh | POLYGON((8.967481 103.314121, 8.947563293673372 103.3669285559756, 8.90735564458858 103.32701845345733, 8.962686563318833 103.30510874799613, 8.967481 103.314121)) | 2023-04-11 |
3 | ijkl | POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) | 2023-04-10 |
Any idea on how to return all the shapes created on the latest date for each city?
Thank you in advance!
答案1
得分: 1
DENSE_RANK
窗口函数在这里更合适。它的近亲ROW_NUMBER
只允许唯一的排名值,而不考虑并列。
SELECT city_id, city_name, shape, date
FROM (SELECT city_id, city_name, shape, date,
DENSE_RANK() OVER (PARTITION BY city_id ORDER BY date DESC) AS rn
FROM test) cte
WHERE rn = 1
ORDER BY date DESC
此外,您可以删除:
- 子查询中的整个
GROUP BY
子句 PARTITION BY
子句中的"city_name",假设您的"city_id"值唯一标识城市。
英文:
The DENSE_RANK
window function is better used here. It's cousin ROW_NUMBER
allows unique ranking values only, regardless of ties.
SELECT city_id, city_name, shape, date
FROM (SELECT city_id, city_name, shape, date,
DENSE_RANK() OVER (PARTITION BY city_id ORDER BY date DESC) AS rn
FROM test) cte
WHERE rn = 1
ORDER BY date DESC
Also you can remove:
-
the whole
GROUP BY
clause from the subquery -
"city_name" from the
PARTITION BY
clause, assuming that your "city_id" values uniquely identify cities.
答案2
得分: 0
如果你想获取每个城市最后一天的记录,请尝试以下代码:
SELECT test.city_id,
test.city_name,
ANY_VALUE(shape) AS shape,
test.date
FROM test INNER JOIN (SELECT city_id,
city_name,
MAX(date) AS MAX_DATE
FROM test
GROUP BY city_id, city_name) m ON test.city_id = m.city_id AND test.date = m.MAX_DATE
ORDER BY date DESC
英文:
If you want the records of the last date for each city thin try this.
SELECT test.city_id,
test.city_name,
ANY_VALUE(shape) AS shape,
test.date
FROM test inner join (SELECT city_id,
city_name,
MAX(date) AS MAX_DATE
FROM test
GROUP BY city_id, city_name) m ON test.city_id = m.city_id AND test.date = m.MAX_DATE
ORDER BY date DESC
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论