Retrieve min and max store number that have the same location
Question
Our "store" table looks something like this:
store_id | city_id | store_type |
---|---|---|
1 | 1 | regular |
2 | 1 | regular |
3 | 1 | regular |
50 | 1 | regular |
51 | 1 | express |
55 | 1 | express |
58 | 1 | express |
70 | 1 | express |
71 | 2 | regular |
75 | 2 | regular |
78 | 2 | regular |
80 | 2 | regular |
85 | 2 | regular |
90 | 2 | regular |
91 | 1 | regular |
95 | 1 | regular |
97 | 1 | regular |
100 | 1 | regular |
105 | 1 | regular |
In SQL Server, I want to list each consecutive run of stores that share the same value in two columns (city_id and store_type), so that a select on our store table returns something like:
min_store_id | max_store_id | city_id | store_type |
---|---|---|---|
1 | 50 | 1 | regular |
51 | 70 | 1 | express |
71 | 90 | 2 | regular |
91 | 105 | 1 | regular |
However, the problem is that we have a bad store_id system, so the first and last rows sharing the same city_id and store_type is something that will happen in the table, and sadly we cannot change it.
I tried something like this:
SELECT MIN(store_id) OVER (PARTITION BY city_id, store_type) AS min_store_id,
MAX(store_id) OVER (PARTITION BY city_id, store_type) AS max_store_id,
city_id,
store_type
FROM store;
but it does not work at all.
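To see why the attempt above falls short, here is a minimal Python sketch that mimics what `PARTITION BY city_id, store_type` does. The `stores` list and the variable names are illustrative assumptions; only the sample rows come from the question. Every row with the same pair lands in one partition, whether or not its store_id values are adjacent in the table.

```python
from collections import defaultdict

# Sample rows (store_id, city_id, store_type) copied from the question's table.
stores = (
    [(i, 1, "regular") for i in (1, 2, 3, 50)]
    + [(i, 1, "express") for i in (51, 55, 58, 70)]
    + [(i, 2, "regular") for i in (71, 75, 78, 80, 85, 90)]
    + [(i, 1, "regular") for i in (91, 95, 97, 100, 105)]
)

# Emulate PARTITION BY city_id, store_type: one bucket per distinct pair.
partitions = defaultdict(list)
for store_id, city_id, store_type in stores:
    partitions[(city_id, store_type)].append(store_id)

# MIN / MAX per partition.
ranges = {key: (min(ids), max(ids)) for key, ids in partitions.items()}
for key, (lowest, highest) in ranges.items():
    print(key, lowest, highest)
```

Both runs of city 1 / regular collapse into a single 1 to 105 range, so only three distinct ranges exist instead of the four wanted. In addition, a window function keeps every input row, so the original query returns 19 rows rather than one row per run.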
Answer 1
Score: 2
You have a gaps-and-islands problem here. You could use the difference between two row_number() values to create the groups:
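with cte as (
    select *,
           -- The difference of the two row numbers stays constant within each consecutive run
           row_number() over (order by store_id)
         - row_number() over (partition by city_id, store_type order by store_id) as grp
    from store
)
select min(store_id) as min_store_id,
       max(store_id) as max_store_id,
       city_id,
       store_type
from cte
-- grp alone can coincide for different (city_id, store_type) pairs, so group by all three
group by city_id, store_type, grp
order by min_store_id;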
Result:
min_store_id | max_store_id | city_id | store_type |
---|---|---|---|
1 | 50 | 1 | regular |
51 | 70 | 1 | express |
71 | 90 | 2 | regular |
91 | 105 | 1 | regular |
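The row-number difference can be traced by hand with a short Python sketch. This is an illustration under stated assumptions, not a replacement for the query: the function name `islands` and the `stores` list are made up here, and the sample rows are the ones from the question. The sketch keys each group on (city_id, store_type, grp), because grp on its own can take the same value for two different pairs when short runs alternate.

```python
from collections import defaultdict

# Sample rows (store_id, city_id, store_type) copied from the question's table.
stores = (
    [(i, 1, "regular") for i in (1, 2, 3, 50)]
    + [(i, 1, "express") for i in (51, 55, 58, 70)]
    + [(i, 2, "regular") for i in (71, 75, 78, 80, 85, 90)]
    + [(i, 1, "regular") for i in (91, 95, 97, 100, 105)]
)


def islands(rows):
    """Return (min_id, max_id, city_id, store_type) for each consecutive run."""
    rows = sorted(rows)              # ORDER BY store_id
    seen = defaultdict(int)          # row counter per (city_id, store_type)
    groups = defaultdict(list)
    for overall_rn, (store_id, city_id, store_type) in enumerate(rows, start=1):
        seen[(city_id, store_type)] += 1              # row_number() inside the partition
        grp = overall_rn - seen[(city_id, store_type)]  # constant within one run
        groups[(city_id, store_type, grp)].append(store_id)
    return sorted(
        (min(ids), max(ids), city, kind) for (city, kind, _), ids in groups.items()
    )


result = islands(stores)
for row in result:
    print(row)
```

For the sample data the grp values are 0, 4, 8 and 10, one per run. With three alternating single-row runs such as `[(1, 1, "a"), (2, 1, "b"), (3, 1, "a")]`, the second and third rows both get grp 1, which is why the pair is kept in the grouping key.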
Answer 2
Score: 2
The other answer is much cleaner, but since I put in the time I'm posting anyway.
As already stated, it's a gaps-and-islands problem: you need a way to mark each group you want to analyse.
The explanation is in the SQL comments.
create table Test (store_id int, city_id int, store_type varchar(12));
insert into Test (store_id, city_id, store_type)
values
(1, 1, 'regular'),
(2, 1, 'regular'),
(3, 1, 'regular'),
(50, 1, 'regular'),
(51, 1, 'express'),
(55, 1, 'express'),
(58, 1, 'express'),
(70, 1, 'express'),
(71, 2, 'regular'),
(75, 2, 'regular'),
(78, 2, 'regular'),
(80, 2, 'regular'),
(85, 2, 'regular'),
(90, 2, 'regular'),
(91, 1, 'regular'),
(95, 1, 'regular'),
(97, 1, 'regular'),
(100, 1, 'regular'),
(105, 1, 'regular');
with cte1 as (
select *
-- If the store_type or the city_id changes, it's a new group
, case when store_type <> lag(store_type, 1, store_type) over (order by store_id)
or city_id <> lag(city_id, 1, city_id) over (order by store_id) then 1 else 0 end Transition
from Test
), cte2 as (
select *
-- Sum the transitions to provide a unique group id
, sum(Transition) over (order by store_id) [Grouping]
from cte1
)
select
-- Calculate desired results per group
min(store_id) min_store_id
, max(store_id) max_store_id
, city_id
, store_type
from cte2
group by
city_id
, store_type
, [Grouping]
order by min_store_id;
Returns
min_store_id | max_store_id | city_id | store_type |
---|---|---|---|
1 | 50 | 1 | regular |
51 | 70 | 1 | express |
71 | 90 | 2 | regular |
91 | 105 | 1 | regular |
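The two steps in this query (flag each transition, then take a running sum of the flags) can be sketched in plain Python as follows. The names `stores`, `labelled` and `summary` are illustrative assumptions; the sample rows are those inserted into Test above.

```python
# Sample rows (store_id, city_id, store_type), same data as the Test table.
stores = (
    [(i, 1, "regular") for i in (1, 2, 3, 50)]
    + [(i, 1, "express") for i in (51, 55, 58, 70)]
    + [(i, 2, "regular") for i in (71, 75, 78, 80, 85, 90)]
    + [(i, 1, "regular") for i in (91, 95, 97, 100, 105)]
)

# Step 1 and 2: flag transitions and accumulate them into a group id.
labelled = []
group_id = 0
previous = None
for store_id, city_id, store_type in sorted(stores):   # ORDER BY store_id
    current = (city_id, store_type)
    # lag(col, 1, col) falls back to the current value on the first row,
    # so the first row is never counted as a transition.
    transition = 1 if previous is not None and current != previous else 0
    group_id += transition          # running sum(Transition) over (order by store_id)
    labelled.append((store_id, city_id, store_type, group_id))
    previous = current

# Step 3: min / max per (city_id, store_type, group id).
summary = {}
for store_id, city_id, store_type, gid in labelled:
    key = (city_id, store_type, gid)
    lowest, highest = summary.get(key, (store_id, store_id))
    summary[key] = (min(lowest, store_id), max(highest, store_id))

result = sorted((lo, hi, city, kind) for (city, kind, _), (lo, hi) in summary.items())
for row in result:
    print(row)
```

Because the group id only ever increases along store_id, each run gets its own id (0 to 3 for the sample data), and the second city 1 / regular run stays separate from the first.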
Note: Providing the DDL+DML as shown here makes it much easier for people to answer.