如何在SQL中对相似的行进行分组?

huangapple go评论100阅读模式
英文:

How to Group Similar Rows in Sql?

问题

  1. | 日期 | 地址 |
  2. | ------------ | --------|
  3. | 2010-09-02 - 2010-09-05 | 英国 |
  4. | 2010-09-06 - 2010-09-08 | 德国 |
  5. | 2011-09-03 - 2013-09-03 | 英国 |
英文:
DATE ADRESS
2010-09-02 ENGLAND
2010-09-03 ENGLAND
2010-09-04 ENGLAND
2010-09-05 ENGLAND
2010-09-06 GERMANY
2010-09-07 GERMANY
2010-09-08 GERMANY
2011-09-03 ENGLAND
2012-09-02 ENGLAND
2013-09-03 ENGLAND

I have a table like this and what I want to do, I want the result I want to get, how can I do it?

DATE ADRESS
2010-09-02 - 2010-09-05 ENGLAND
2010-09-06 - 2010-09-08 GERMANY
2011-09-03 - 2013-09-03 ENGLAND

答案1

得分: 3

你这里有一个“间隙和群岛”问题,你可以使用两个row_numbers之间的差异方法来解决它,尝试以下操作:

  1. select concat_ws(' - ', min([date]), max([date])) as [date],
  2. address
  3. from
  4. (
  5. select *,
  6. row_number() over (order by date) -
  7. row_number() over (partition by address order by [date]) grp
  8. from table_name
  9. ) t
  10. group by address, grp
  11. order by min([date])
英文:

You have a gaps and islands problem here, you could use the difference between two row_numbers approach to solving it, try the following:

  1. select concat_ws(' - ', min([date]), max([date])) as [date],
  2. address
  3. from
  4. (
  5. select *,
  6. row_number() over (order by date) -
  7. row_number() over (partition by address order by [date]) grp
  8. from table_name
  9. ) t
  10. group by address, grp
  11. order by min([date])

see demo

答案2

得分: 0

你可以尝试这个

SELECT CONCAT(MIN(DATE), ' - ', MAX(DATE)), ADRESS
FROM tablename GROUP BY ADRESS

英文:

you can try this

  1. SELECT CONCAT(MIN(DATE), ' - ', MAX(DATE)), ADRESS
  2. FROM tablename GROUP BY ADRESS

答案3

得分: 0

这是一个间隙和岛屿问题,以下是用于解决的另一种方法:

  1. with cte as (
  2. select *, lag(ADRESS, 1, ADRESS) over(order by id) as lag_ADRESS
  3. from ( select row_number() over() as id, t.* from mytable t) s
  4. ORDER BY id
  5. ),
  6. cte2 as (
  7. select *, sum(case when ADRESS = lag_ADRESS then 0 else 1 end) over (order by id) as grp
  8. from cte
  9. )
  10. select ADRESS, concat(min(DATE),' - ', max(Date)) as ADRESS
  11. from cte2
  12. group by ADRESS, grp

Lag() 是一个窗口函数,允许你向后查看多个行,并访问当前行的那一行的数据。

然后,我们通过 CASE WHEN 子句检查记录何时被分隔。

示例在此

英文:

As already mentioned before this is a gaps and islands problem.

This is an other way to solve it :

  1. with cte as (
  2. select *, lag(ADRESS, 1, ADRESS) over(order by id) as lag_ADRESS
  3. from ( select row_number() over() as id, t.* from mytable t) s
  4. ORDER BY id
  5. ),
  6. cte2 as (
  7. select *, sum(case when ADRESS = lag_ADRESS then 0 else 1 end) over (order by id) as grp
  8. from cte
  9. )
  10. select ADRESS, concat(min(DATE),' - ', max(Date)) as ADRESS
  11. from cte2
  12. group by ADRESS, grp

Lag() is a window function that allows you to look back a number of rows and access data of that row from the current row.

Then we check by a CASE WHEN clause where the records have been broken.

Demo here

huangapple
  • 本文由 发表于 2023年3月15日 20:25:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/75744666.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定