如何在SQL中对相似的行进行分组?

huangapple go评论71阅读模式
英文:

How to Group Similar Rows in Sql?

问题

| 日期         | 地址     |
| ------------ | --------|
| 2010-09-02 - 2010-09-05 | 英国     |
| 2010-09-06 - 2010-09-08 | 德国     |
| 2011-09-03 - 2013-09-03 | 英国     |
英文:
DATE ADRESS
2010-09-02 ENGLAND
2010-09-03 ENGLAND
2010-09-04 ENGLAND
2010-09-05 ENGLAND
2010-09-06 GERMANY
2010-09-07 GERMANY
2010-09-08 GERMANY
2011-09-03 ENGLAND
2012-09-02 ENGLAND
2013-09-03 ENGLAND

I have a table like this and what I want to do, I want the result I want to get, how can I do it?

DATE ADRESS
2010-09-02 - 2010-09-05 ENGLAND
2010-09-06 - 2010-09-08 GERMANY
2011-09-03 - 2013-09-03 ENGLAND

答案1

得分: 3

你这里有一个“间隙和群岛”问题,你可以使用两个row_numbers之间的差异方法来解决它,尝试以下操作:

select concat_ws(' - ', min([date]), max([date])) as [date],
       address
from
(
  select *,
         row_number() over (order by date) -
         row_number() over (partition by address order by [date]) grp
  from table_name
) t
group by address, grp
order by min([date])
英文:

You have a gaps and islands problem here, you could use the difference between two row_numbers approach to solving it, try the following:

select concat_ws(' - ', min([date]), max([date])) as [date],
       address
from
(
  select *,
   row_number() over (order by date) -
   row_number() over (partition by address order by [date]) grp
  from table_name
) t
group by address, grp
order by min([date])

see demo

答案2

得分: 0

你可以尝试这个

SELECT CONCAT(MIN(DATE), ' - ', MAX(DATE)), ADRESS
FROM tablename GROUP BY ADRESS

英文:

you can try this

SELECT CONCAT(MIN(DATE), ' - ', MAX(DATE)), ADRESS
    FROM tablename GROUP BY ADRESS

答案3

得分: 0

这是一个间隙和岛屿问题,以下是用于解决的另一种方法:

with cte as (
  select *, lag(ADRESS, 1, ADRESS) over(order by id) as lag_ADRESS
  from ( select row_number() over() as id, t.* from mytable t) s
  ORDER BY id
),
cte2 as (
  select *, sum(case when ADRESS = lag_ADRESS then 0 else 1 end) over (order by id) as grp
  from cte
)
select ADRESS, concat(min(DATE),' - ', max(Date)) as ADRESS
from cte2
group by ADRESS, grp

Lag() 是一个窗口函数,允许你向后查看多个行,并访问当前行的那一行的数据。

然后,我们通过 CASE WHEN 子句检查记录何时被分隔。

示例在此

英文:

As already mentioned before this is a gaps and islands problem.

This is an other way to solve it :

with cte as (
  select *, lag(ADRESS, 1, ADRESS) over(order by id) as lag_ADRESS
  from ( select row_number() over() as id, t.* from mytable t) s
  ORDER BY id
),
cte2 as (
  select *, sum(case when ADRESS = lag_ADRESS then 0 else 1 end) over (order by id) as grp
  from cte
)
select ADRESS, concat(min(DATE),' - ', max(Date)) as ADRESS
from cte2
group by ADRESS, grp

Lag() is a window function that allows you to look back a number of rows and access data of that row from the current row.

Then we check by a CASE WHEN clause where the records have been broken.

Demo here

huangapple
  • 本文由 发表于 2023年3月15日 20:25:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/75744666.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定