Calculate difference of rows in Pandas

Question

I have a time-series dataframe in which some rows carry alerts. It looks like this:

machineID  time              vibration  alerts
1          2023-02-15 11:45  220        1
1          2023-02-15 12:00  221        0
1          2023-02-15 12:15  219        0
1          2023-02-15 12:30  220        1
1          2023-02-16 11:45  220        1
1          2023-02-16 12:00  221        1
1          2023-02-16 12:15  219        0
1          2023-02-16 12:30  220        1

I want to calculate the day-over-day difference in the alerts column. Since the time column is recorded at 15-minute intervals, I can't work out how to group by whole day, i.e. sum the alerts for each day and compare that with the sum of all alerts for the previous day.

In short, I need a way to sum all alerts for each day and subtract the previous day's total. The result should be a new dataframe with a date column and an alert-difference column. In this case, the new dataframe would be:

time        diff_alerts
2023-02-16  1

since there is a difference of 1 alert on the next day, i.e. 2023-02-16 (3 alerts, versus 2 on 2023-02-15).
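
For anyone who wants to reproduce this, the sample above can be rebuilt roughly as follows. This is a minimal sketch that assumes the time column is parsed to a datetime dtype; the variable name `daily` is only illustrative and does not come from the original post. It prints the per-day totals behind the expected result.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "machineID": [1] * 8,
    "time": pd.to_datetime([
        "2023-02-15 11:45", "2023-02-15 12:00",
        "2023-02-15 12:15", "2023-02-15 12:30",
        "2023-02-16 11:45", "2023-02-16 12:00",
        "2023-02-16 12:15", "2023-02-16 12:30",
    ]),
    "vibration": [220, 221, 219, 220, 220, 221, 219, 220],
    "alerts": [1, 0, 0, 1, 1, 1, 0, 1],
})

# Total alerts per calendar day; normalize() resets the time-of-day to midnight,
# so all readings from the same date fall into one group.
daily = df.groupby(df["time"].dt.normalize())["alerts"].sum()
print(daily)
```

The two daily totals are 2 and 3, which is where the expected difference of 1 comes from.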

Answer 1

Score: 5

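Group by day with a pd.Grouper, sum the alerts, and finally compute the difference from the previous day with diff(). This assumes the time column already has a datetime dtype; if it holds strings, convert it first with pd.to_datetime, because pd.Grouper with freq only works on datetime-like data.

(df.groupby(pd.Grouper(key='time', freq='D'))['alerts'].sum().diff()
   .dropna().rename('diff_alerts').astype(int).reset_index())

        time  diff_alerts
0 2023-02-16            1

Note: the second line of code (dropna, rename, astype, reset_index) is only there to produce a clean output. It drops the first day, which has no previous day to compare against, and names the result column.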
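
The same steps can be wrapped in a small helper that also handles a time column stored as strings. This is only a sketch under stated assumptions: the function name `daily_alert_diff` and the shortened `sample` frame are hypothetical and not part of the original answer; the pandas calls are the ones used above plus `pd.to_datetime`.

```python
import pandas as pd

def daily_alert_diff(df: pd.DataFrame) -> pd.DataFrame:
    """Day-over-day change in the total number of alerts (hypothetical helper)."""
    out = df.copy()
    # pd.Grouper(freq=...) needs a datetime-like column, so convert defensively.
    out["time"] = pd.to_datetime(out["time"])
    # One bin per calendar day, holding the sum of alerts for that day.
    daily = out.groupby(pd.Grouper(key="time", freq="D"))["alerts"].sum()
    return (
        daily.diff()                 # each day's total minus the previous day's
             .dropna()               # the first day has nothing to compare with
             .rename("diff_alerts")
             .astype(int)
             .reset_index()
    )

# Shortened illustrative data: 2 alerts on the 15th, 3 on the 16th.
sample = pd.DataFrame({
    "time": ["2023-02-15 11:45", "2023-02-15 12:30",
             "2023-02-16 11:45", "2023-02-16 12:00", "2023-02-16 12:30"],
    "alerts": [1, 1, 1, 1, 1],
})
result = daily_alert_diff(sample)
print(result)
```

One thing to keep in mind with freq='D': calendar days that fall inside the data's range but have no rows still get a bin, with a total of 0, so they take part in the diff. If several machines share one dataframe, it may also make sense to group by machineID as well, which neither the original answer nor this sketch does.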

huangapple
  • Posted on 2023-02-16 15:06:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75468853.html