Select the most frequently occurring value for the last 3 months, or the most frequently occurring value over the lifetime

Question

The data in the LifestyleProfile table looks like this:

UserId  Month       dailyloadtype
1       2023-06-01  LATE_EVE
1       2023-05-01  LATE_EVE
1       2023-04-01  LATE_EVE
1       2023-03-01  LATE_EVE
1       2023-02-01  DAY_LOAD
1       2023-01-01  DAY_LOAD
2       2023-06-01  LATE_EVE
2       2023-05-01  DAY_LOAD
2       2023-04-01  DAY_LOAD
2       2023-03-01  LATE_EVE
2       2023-02-01  DAY_LOAD
2       2023-01-01  DAY_LOAD

The expected output is this:

UserId  dailyloadtype
1       LATE_EVE
2       DAY_LOAD

The logic behind this calculation: if the user has the same dailyloadtype in each of the 3 most recent months, use that value; otherwise, use the dailyloadtype that occurs most often for that user across all months.

Answer 1

Score: 2

The Cte below covers the last 3 months of data, and cte2 ranks each user's values by how often they occur overall. If a user has the same dailyloadtype in all of the last 3 months, the Cte value is used; otherwise the cte2 value is used.

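-- First day of the previous month. 1-DAY(GETDATE()) steps back to the 1st of the current month.
declare @lastmonthStart date = DATEADD(M, -1, DATEADD(d, 1 - DAY(GETDATE()), GETDATE()));
with Cte as (
  -- Users with one dailyloadtype in all 3 of the last 3 months.
  select LP.UserId, LP.dailyloadtype
  from LifestyleProfile LP
  where LP.Month >= DATEADD(M, -2, @lastmonthStart) and LP.Month <= @lastmonthStart
  group by LP.UserId, LP.dailyloadtype
  having COUNT(LP.UserId) = 3
), cte2 as (
  -- Rank each user's dailyloadtype values by overall frequency.
  select LP.UserId, LP.dailyloadtype,
         row_number() over (partition by LP.UserId order by COUNT(LP.UserId) desc) RwNum
  from LifestyleProfile LP
  group by LP.UserId, LP.dailyloadtype
)
-- Prefer the 3-month value (Cte); fall back to the most frequent value (cte2).
select cte2.UserId, coalesce(Cte.dailyloadtype, cte2.dailyloadtype) as dailyloadtype
from cte2
left join Cte on Cte.UserId = cte2.UserId
where cte2.RwNum = 1;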
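
For readers who want to try the same idea without a SQL Server instance, here is a minimal sketch that runs a similar query on the question's sample data in SQLite through Python's sqlite3 module. It is not the answerer's exact query. It assumes "the last 3 months" means each user's 3 most recent rows rather than a window relative to today's date, it breaks frequency ties alphabetically, and it needs SQLite 3.25 or later for window functions. The CTE names (ranked, recent, counts, overall) and the function preferred_load_types are hypothetical names chosen for illustration.

```python
import sqlite3

# Sample data from the question.
SAMPLE_ROWS = [
    (1, "2023-06-01", "LATE_EVE"), (1, "2023-05-01", "LATE_EVE"),
    (1, "2023-04-01", "LATE_EVE"), (1, "2023-03-01", "LATE_EVE"),
    (1, "2023-02-01", "DAY_LOAD"), (1, "2023-01-01", "DAY_LOAD"),
    (2, "2023-06-01", "LATE_EVE"), (2, "2023-05-01", "DAY_LOAD"),
    (2, "2023-04-01", "DAY_LOAD"), (2, "2023-03-01", "LATE_EVE"),
    (2, "2023-02-01", "DAY_LOAD"), (2, "2023-01-01", "DAY_LOAD"),
]

QUERY = """
WITH ranked AS (
    -- Number each user's rows from newest to oldest (ISO dates sort as text).
    SELECT UserId, dailyloadtype,
           ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY Month DESC) AS rn
    FROM LifestyleProfile
),
recent AS (
    -- Users whose 3 newest rows all carry the same dailyloadtype.
    SELECT UserId, MIN(dailyloadtype) AS dailyloadtype
    FROM ranked
    WHERE rn <= 3
    GROUP BY UserId
    HAVING COUNT(*) = 3 AND COUNT(DISTINCT dailyloadtype) = 1
),
counts AS (
    SELECT UserId, dailyloadtype, COUNT(*) AS n
    FROM LifestyleProfile
    GROUP BY UserId, dailyloadtype
),
overall AS (
    -- Rank values by lifetime frequency; ties broken alphabetically (assumption).
    SELECT UserId, dailyloadtype,
           ROW_NUMBER() OVER (PARTITION BY UserId
                              ORDER BY n DESC, dailyloadtype) AS rn
    FROM counts
)
SELECT o.UserId, COALESCE(r.dailyloadtype, o.dailyloadtype) AS dailyloadtype
FROM overall AS o
LEFT JOIN recent AS r ON r.UserId = o.UserId
WHERE o.rn = 1
ORDER BY o.UserId
"""


def preferred_load_types(rows):
    """Load rows into an in-memory table and return (UserId, dailyloadtype) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE LifestyleProfile "
            "(UserId INTEGER, Month TEXT, dailyloadtype TEXT)"
        )
        conn.executemany("INSERT INTO LifestyleProfile VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for user_id, load_type in preferred_load_types(SAMPLE_ROWS):
        print(user_id, load_type)
```

On the sample data this prints 1 LATE_EVE and 2 DAY_LOAD. Ranking rows per user instead of filtering on today's date keeps the result stable no matter when the query runs, at the cost of ignoring gaps between months.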

Answer 2

Score: 1

I don't know how your date ranges are supposed to work when the next month rolls over, etc. That's a detail that should be easy for you to work out.

select UserId, mode(DailyLoadType) as DailyLoadType
from LifestyleProfile
-- Keeps roughly the last 3 monthly rows; adjust the cutoff to your rollover rule.
where Month > dateadd(month, -4, current_date)
group by UserId;

If all the values are the same, per user, then mode() will return that same value. If there are only two possible values then there can't be a tie with an odd number of rows. Should the handling of ties become necessary then you can modify the question.

https://docs.snowflake.com/en/sql-reference/functions/mode
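
The remark about ties can be checked outside the database. The sketch below is plain Python, not Snowflake's MODE, and only illustrates when a 3-row window can tie. The helper tied_modes and the third value EARLY_MORN are hypothetical and do not appear in the question's data.

```python
from statistics import multimode


def tied_modes(values):
    """Return every value that shares the highest frequency, sorted."""
    return sorted(multimode(values))


if __name__ == "__main__":
    # Two possible values, three rows: one value always wins.
    print(tied_modes(["LATE_EVE", "DAY_LOAD", "DAY_LOAD"]))
    # Three possible values, three rows: a three-way tie is possible.
    print(tied_modes(["LATE_EVE", "DAY_LOAD", "EARLY_MORN"]))
```

If a third dailyloadtype can occur, a window of 3 rows may contain three different values, and the query would then need an explicit tie-breaking rule.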

huangapple
  • Posted on July 10, 2023 at 16:12:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76651877.html