在Impala SQL中计算循环事件直到条件为假的次数。

huangapple go评论62阅读模式
英文:

Count Number of Recurring Events until False in Impala SQL

问题

我想知道是否有一种方法来计算一行重复一个值的次数,直到它变为false,而不管它以后是否变为true。

以下是示例,显示了购买的颜色以及它们何时被使用。
当购买一种颜色时,我想跟踪它被使用的次数,直到它被更换。

规则是:
一旦购买了一个物品,他们使用新购买的物品多少次,直到他们更换。例如,对于红色,他使用了3次,然后更换,我只想统计这个次数,即使他以后再次使用红色。
绿色一开始没有被使用,然后一旦购买后就没有使用过。

示例数据

订单时间 最近购买的颜色 使用的颜色 期望结果
1 红色 红色 1
2 红色 红色 2
3 红色 红色 3
4 红色 蓝色 3
5 红色 红色 3
1 绿色 蓝色 0
2 绿色 红色 0
3 绿色 绿色 0

提前感谢您的帮助。

我已尝试使用Last_value和lag函数。
我还尝试过将表连接起来,并在第一次列1和列2不相等时使用min()函数。
主要问题是我无法或不知道如何选择同一列的前一个值,类似于Excel中的公式在列C中,我检查列C上一行是否正确或不正确。

英文:

I would like to know if there is a way to count the number of times a row repeats a value back to back until it becomes false, regardless if it becomes true later on.

Example below is that colours bought and when they are used.
When a colour is bought, i would like to track how times it is used until it is swapped off

The rule is:
Once an item has been purchased, how many times do they use the newly purchased item until they swap off.
For Red's example, he uses it 3 times until he changes and i want to tally just that, even if he returns to it
Green was not used immediately and then has zero uses once purchased.

Example Data

Order_time Colour Bought Most Recently Colour Used Desired Result
1 Red Red 1
2 Red Red 2
3 Red Red 3
4 Red Blue 3
5 Red Red 3
1 Green Blue 0
2 Green Red 0
3 Green Green 0

Thanks in advance

I have attempted to utilise Last_value and lag.
I have also attempted to utilise a join to the table and the min() the first time they column 1 to column 2 do not equal
the main issue is that i am unable or do not know how to select the previous value of the same column

I am thinking mostly similar to excel where the formula is in Column C and i check if Column C previous Row was correct or not.

答案1

得分: 0

以下是已翻译的内容:

你需要能够通过某种序列或时间数据对行进行排序:

with flagged as (
    select *,
        case when bought <> lag(bought, 1, '') 
            over (order by order_time) then 1 else 0 end as flag
    from T
), grouped as (
    select *,
        sum(flag) over (order by order_time) as grp
    from flagged
), tallied as (
    select *,
        min(case when bought <> used then 0 else 1 end)
            over (partition by grp order by order_time) as tally
    from grouped
)
select *,
    sum(tally) over (partition by grp order by order_time)
from tallied;

也许有一种稍微更清晰的方法,但这是一种相当标准的间隙和岛屿技术。

在这里查看它的示例:https://dbfiddle.uk/TFFHRLz3

英文:

You'll need to be able to order the rows via some kind of sequence or time data:

with flagged as (
    select *,
        case when bought &lt;&gt; lag(bought, 1, &#39;&#39;)
            over (order by order_time) then 1 else 0 end as flag
    from T
), grouped as (
    select *,
        sum(flag) over (order by order_time) as grp
    from flagged
), tallied as (
    select *,
        min(case when bought &lt;&gt; used then 0 else 1 end)
            over (partition by grp order by order_time) as tally
    from grouped
)
select *,
    sum(tally) over (partition by grp order by order_time)
from tallied;

Maybe there's a slightly cleaner way but this is a fairly standard gaps and islands technique.

See it in action here: https://dbfiddle.uk/TFFHRLz3

huangapple
  • 本文由 发表于 2023年7月11日 00:39:17
  • 转载请务必保留本文链接:https://go.coder-hub.com/76655726.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定