英文:
Count Number of Recurring Events until False in Impala SQL
问题
我想知道是否有一种方法来计算一行重复一个值的次数,直到它变为false,而不管它以后是否变为true。
以下是示例,显示了购买的颜色以及它们何时被使用。
当购买一种颜色时,我想跟踪它被使用的次数,直到它被更换。
规则是:
一旦购买了一个物品,他们使用新购买的物品多少次,直到他们更换。例如,对于红色,他使用了3次,然后更换,我只想统计这个次数,即使他以后再次使用红色。
绿色一开始没有被使用,然后一旦购买后就没有使用过。
示例数据
订单时间 | 最近购买的颜色 | 使用的颜色 | 期望结果 |
---|---|---|---|
1 | 红色 | 红色 | 1 |
2 | 红色 | 红色 | 2 |
3 | 红色 | 红色 | 3 |
4 | 红色 | 蓝色 | 3 |
5 | 红色 | 红色 | 3 |
1 | 绿色 | 蓝色 | 0 |
2 | 绿色 | 红色 | 0 |
3 | 绿色 | 绿色 | 0 |
提前感谢您的帮助。
我已尝试使用Last_value和lag函数。
我还尝试过将表连接起来,并在第一次列1和列2不相等时使用min()函数。
主要问题是我无法或不知道如何选择同一列的前一个值,类似于Excel中的公式在列C中,我检查列C上一行是否正确或不正确。
英文:
I would like to know if there is a way to count the number of times a row repeats a value back to back until it becomes false, regardless if it becomes true later on.
Example below is that colours bought and when they are used.
When a colour is bought, i would like to track how times it is used until it is swapped off
The rule is:
Once an item has been purchased, how many times do they use the newly purchased item until they swap off.
For Red's example, he uses it 3 times until he changes and i want to tally just that, even if he returns to it
Green was not used immediately and then has zero uses once purchased.
Example Data
Order_time | Colour Bought Most Recently | Colour Used | Desired Result |
---|---|---|---|
1 | Red | Red | 1 |
2 | Red | Red | 2 |
3 | Red | Red | 3 |
4 | Red | Blue | 3 |
5 | Red | Red | 3 |
1 | Green | Blue | 0 |
2 | Green | Red | 0 |
3 | Green | Green | 0 |
Thanks in advance
I have attempted to utilise Last_value and lag.
I have also attempted to utilise a join to the table and the min() the first time they column 1 to column 2 do not equal
the main issue is that i am unable or do not know how to select the previous value of the same column
I am thinking mostly similar to excel where the formula is in Column C and i check if Column C previous Row was correct or not.
答案1
得分: 0
以下是已翻译的内容:
你需要能够通过某种序列或时间数据对行进行排序:
with flagged as (
select *,
case when bought <> lag(bought, 1, '')
over (order by order_time) then 1 else 0 end as flag
from T
), grouped as (
select *,
sum(flag) over (order by order_time) as grp
from flagged
), tallied as (
select *,
min(case when bought <> used then 0 else 1 end)
over (partition by grp order by order_time) as tally
from grouped
)
select *,
sum(tally) over (partition by grp order by order_time)
from tallied;
也许有一种稍微更清晰的方法,但这是一种相当标准的间隙和岛屿技术。
在这里查看它的示例:https://dbfiddle.uk/TFFHRLz3
英文:
You'll need to be able to order the rows via some kind of sequence or time data:
with flagged as (
select *,
case when bought <> lag(bought, 1, '')
over (order by order_time) then 1 else 0 end as flag
from T
), grouped as (
select *,
sum(flag) over (order by order_time) as grp
from flagged
), tallied as (
select *,
min(case when bought <> used then 0 else 1 end)
over (partition by grp order by order_time) as tally
from grouped
)
select *,
sum(tally) over (partition by grp order by order_time)
from tallied;
Maybe there's a slightly cleaner way but this is a fairly standard gaps and islands technique.
See it in action here: https://dbfiddle.uk/TFFHRLz3
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论