返回最新日期的所有结果。

huangapple go评论60阅读模式
英文:

Return all the results for the latest date

问题

我有一个表格,其中包含以下列:

  • city_id = 城市ID
  • city_name = 城市名称
  • shape = 多边形 -> 地理字段
  • date = 形状创建的日期

并且用以下示例数据填充:

city_id city_name shape created_date
1 abcd POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) 2023-04-24
1 abcd POLYGON((12.967481 103.314121, 12.947563293673372 103.3669285559756, 12.90735564458858 103.32701845345733, 12.962686563318833 103.30510874799613, 12.967481 103.314121)) 2023-04-24
1 abcd POLYGON((14.967481 103.314121, 14.947563293673372 103.3669285559756, 14.90735564458858 103.32701845345733, 14.962686563318833 103.30510874799613, 14.967481 103.314121)) 2023-04-24
2 efgh POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) 2023-04-11
2 efgh POLYGON((8.967481 103.314121, 8.947563293673372 103.3669285559756, 8.90735564458858 103.32701845345733, 8.962686563318833 103.30510874799613, 8.967481 103.314121)) 2023-04-11
2 efgh POLYGON((22.967481 103.314121, 22.947563293673372 103.3669285559756, 22.90735564458858 103.32701845345733, 22.962686563318833 103.30510874799613, 22.967481 103.314121)) 2023-03-23
3 ijkl POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) 2023-04-10
3 ijkl POLYGON((28.967481 103.314121, 28.947563293673372 103.3669285559756, 28.90735564458858 103.32701845345733, 28.962686563318833 103.30510874799613, 28.967481 103.314121)) 2023-03-03

我已经编写了以下查询:

SELECT city_id, 
       city_name, 
       shape,
       date
FROM (SELECT city_id, 
             city_name, 
             ANY_VALUE(shape) AS shape, 
             date, 
             ROW_NUMBER() OVER (PARTITION BY city_id, city_name ORDER BY date DESC) AS rank
      FROM test
      GROUP BY 1,2,4,5)
WHERE rank = 1
ORDER BY date DESC

当前结果:

city_id city_name shape created_date
1 abcd POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) 2023-04-24
2 efgh POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) 2023-04-11
3 ijkl POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) 2023-04-10

使用这个查询,我希望使用PARTITION函数和rank = 1来返回每个城市在同一天创建了多个形状的最新日期的所有记录。该查询对只有一个形状在最新日期创建的城市运行正常。

期望的输出:

city_id city_name shape created_date
1 abcd POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) 2023-04-24
1 abcd POLYGON((12.967481 103.314121, 12.947563293673372 103.3669285559756, 12.90735564458858 103.32701845345733, 12.962686563318833 103.30510874799613, 12.967481 103.314121)) 2023-04-24
1 abcd POLYGON((
英文:

I have a table with the following columns:

  • city_id = city ID
  • city_name = city name
  • shape = polygon -> geography field
  • date = date when the shape was created

and populated with the following sample data:

city_id city_name shape created_date
1 abcd POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) 2023-04-24
1 abcd POLYGON((12.967481 103.314121, 12.947563293673372 103.3669285559756, 12.90735564458858 103.32701845345733, 12.962686563318833 103.30510874799613, 12.967481 103.314121)) 2023-04-24
1 abcd POLYGON((14.967481 103.314121, 14.947563293673372 103.3669285559756, 14.90735564458858 103.32701845345733, 14.962686563318833 103.30510874799613, 14.967481 103.314121)) 2023-04-24
2 efgh POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) 2023-04-11
2 efgh POLYGON((8.967481 103.314121, 8.947563293673372 103.3669285559756, 8.90735564458858 103.32701845345733, 8.962686563318833 103.30510874799613, 8.967481 103.314121)) 2023-04-11
2 efgh POLYGON((22.967481 103.314121, 22.947563293673372 103.3669285559756, 22.90735564458858 103.32701845345733, 22.962686563318833 103.30510874799613, 22.967481 103.314121)) 2023-03-23
3 ijkl POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) 2023-04-10
3 ijkl POLYGON((28.967481 103.314121, 28.947563293673372 103.3669285559756, 28.90735564458858 103.32701845345733, 28.962686563318833 103.30510874799613, 28.967481 103.314121)) 2023-03-03

I've crafted the following query:

SELECT city_id, 
       city_name, 
       shape,
       date
FROM (SELECT city_id, 
             city_name, 
             ANY_VALUE(shape) AS shape, 
             date, 
             ROW_NUMBER() OVER (PARTITION BY city_id, city_name ORDER BY date DESC) AS rank
      FROM test
      GROUP BY 1,2,4,5)
WHERE rank = 1
ORDER BY date DESC

Current result:

city_id city_name shape created_date
1 abcd POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) 2023-04-24
2 efgh POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) 2023-04-11
3 ijkl POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) 2023-04-10

With this query, I was hoping that using the PARTITION function and rank = 1 would return all the records for a city where several shapes have been created the same day, for the latest date. The query works fine for cities with only 1 shape created on the latest date.

Expected output:

city_id city_name shape created_date
1 abcd POLYGON((18.967481 103.314121, 18.947563293673372 103.3669285559756, 18.90735564458858 103.32701845345733, 18.962686563318833 103.30510874799613, 18.967481 103.314121)) 2023-04-24
1 abcd POLYGON((12.967481 103.314121, 12.947563293673372 103.3669285559756, 12.90735564458858 103.32701845345733, 12.962686563318833 103.30510874799613, 12.967481 103.314121)) 2023-04-24
1 abcd POLYGON((14.967481 103.314121, 14.947563293673372 103.3669285559756, 14.90735564458858 103.32701845345733, 14.962686563318833 103.30510874799613, 14.967481 103.314121)) 2023-04-24
2 efgh POLYGON((16.967481 103.314121, 16.947563293673372 103.3669285559756, 16.90735564458858 103.32701845345733, 16.962686563318833 103.30510874799613, 16.967481 103.314121)) 2023-04-11
2 efgh POLYGON((8.967481 103.314121, 8.947563293673372 103.3669285559756, 8.90735564458858 103.32701845345733, 8.962686563318833 103.30510874799613, 8.967481 103.314121)) 2023-04-11
3 ijkl POLYGON((24.967481 103.314121, 24.947563293673372 103.3669285559756, 24.90735564458858 103.32701845345733, 24.962686563318833 103.30510874799613, 24.967481 103.314121)) 2023-04-10

Any idea on how to return all the shapes created on the latest date for each city?

Thank you in advance!

答案1

得分: 1

DENSE_RANK窗口函数在这里更合适。它的近亲ROW_NUMBER只允许唯一的排名值,而不考虑并列。

SELECT city_id, city_name, shape, date
FROM (SELECT city_id, city_name, shape, date, 
             DENSE_RANK() OVER (PARTITION BY city_id ORDER BY date DESC) AS rn
      FROM test) cte
WHERE rn = 1
ORDER BY date DESC

此外,您可以删除:

  • 子查询中的整个GROUP BY子句
  • PARTITION BY子句中的"city_name",假设您的"city_id"值唯一标识城市。
英文:

The DENSE_RANK window function is better used here. It's cousin ROW_NUMBER allows unique ranking values only, regardless of ties.

SELECT city_id, city_name, shape, date
FROM (SELECT city_id, city_name, shape, date, 
             DENSE_RANK() OVER (PARTITION BY city_id ORDER BY date DESC) AS rn
      FROM test) cte
WHERE rn = 1
ORDER BY date DESC

Also you can remove:

  • the whole GROUP BY clause from the subquery

  • "city_name" from the PARTITION BY clause, assuming that your "city_id" values uniquely identify cities.

答案2

得分: 0

如果你想获取每个城市最后一天的记录,请尝试以下代码:

SELECT test.city_id, 
       test.city_name, 
       ANY_VALUE(shape) AS shape, 
       test.date
FROM test INNER JOIN (SELECT city_id, 
                     city_name, 
                     MAX(date) AS MAX_DATE
              FROM test
              GROUP BY city_id, city_name) m ON test.city_id = m.city_id AND test.date = m.MAX_DATE
ORDER BY date DESC
英文:

If you want the records of the last date for each city thin try this.

SELECT test.city_id, 
       test.city_name, 
       ANY_VALUE(shape) AS shape, 
       test.date
FROM test inner join (SELECT city_id, 
             city_name, 
             MAX(date) AS MAX_DATE
      FROM test
	  GROUP BY city_id, city_name) m ON test.city_id = m.city_id AND test.date = m.MAX_DATE
ORDER BY date DESC

huangapple
  • 本文由 发表于 2023年6月2日 05:31:18
  • 转载请务必保留本文链接:https://go.coder-hub.com/76385840.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定