Calculate difference of rows in Pandas

Question

I have a time-series DataFrame in which some rows carry alerts. It looks like this:

  machineID  time              vibration  alerts
  1          2023-02-15 11:45  220        1
  1          2023-02-15 12:00  221        0
  1          2023-02-15 12:15  219        0
  1          2023-02-15 12:30  220        1
  1          2023-02-16 11:45  220        1
  1          2023-02-16 12:00  221        1
  1          2023-02-16 12:15  219        0
  1          2023-02-16 12:30  220        1

I want to compute the day-over-day difference in the alerts column. Because the time column is at 15-minute intervals, I can't work out how to group by whole day, i.e. sum the alerts for each day and compare that with the previous day's total.

In short, I need a way to sum all alerts for each day and subtract the previous day's sum. The result should be a new DataFrame with a date column and an alerts-difference column. In this case, the new DataFrame would be:

  time        diff_alerts
  2023-02-16  1

since there is a difference of 1 alert on the next day, i.e. 2023-02-16 (3 alerts versus 2 on 2023-02-15).
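To make the sample reproducible, the table above can be rebuilt roughly as follows. This is only a sketch: the CSV text is retyped from the table, and parsing `time` with `parse_dates` is an assumption, since the question does not show the column's dtype.

```python
import io

import pandas as pd

# Sample rows retyped from the question (comma-separated for read_csv).
raw = """machineID,time,vibration,alerts
1,2023-02-15 11:45,220,1
1,2023-02-15 12:00,221,0
1,2023-02-15 12:15,219,0
1,2023-02-15 12:30,220,1
1,2023-02-16 11:45,220,1
1,2023-02-16 12:00,221,1
1,2023-02-16 12:15,219,0
1,2023-02-16 12:30,220,1
"""

# Assumption: `time` should be a datetime column, so parse it on load.
df = pd.read_csv(io.StringIO(raw), parse_dates=["time"])
print(df.dtypes)
```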

Answer 1

Score: 5

Group by day with a custom pd.Grouper, then sum the alerts, and finally compute the diff with the previous day (this assumes the time column has a datetime dtype):

  >>> (df.groupby(pd.Grouper(key='time', freq='D'))['alerts'].sum().diff()
  ...    .dropna().rename('diff_alerts').astype(int).reset_index())
          time  diff_alerts
  0 2023-02-16            1

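Note: the second line of code is only there to give a clean output: it drops the NaN produced for the first day, names the column, casts to int, and turns the time index back into a column.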
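The one-liner can also be unpacked into separate steps, which makes the intermediate values easier to inspect. The sketch below rebuilds the question's sample data inline; the variable names `daily`, `diff`, `result` and `daily_alt` are illustrative and not part of the original answer, and the `resample` line is an alternative spelling of the same daily sum rather than something the answer uses.

```python
import pandas as pd

# Sample data from the question; `time` must be datetime64 for pd.Grouper(freq=...).
df = pd.DataFrame({
    "machineID": [1] * 8,
    "time": pd.to_datetime([
        "2023-02-15 11:45", "2023-02-15 12:00",
        "2023-02-15 12:15", "2023-02-15 12:30",
        "2023-02-16 11:45", "2023-02-16 12:00",
        "2023-02-16 12:15", "2023-02-16 12:30",
    ]),
    "vibration": [220, 221, 219, 220, 220, 221, 219, 220],
    "alerts": [1, 0, 0, 1, 1, 1, 0, 1],
})

# Step 1: total alerts per calendar day.
daily = df.groupby(pd.Grouper(key="time", freq="D"))["alerts"].sum()

# Step 2: difference with the previous day; the first day has no predecessor (NaN).
diff = daily.diff()

# Step 3: drop the NaN, name the column, cast to int, restore `time` as a column.
result = diff.dropna().rename("diff_alerts").astype(int).reset_index()
print(result)

# Alternative for step 1: resample on the `time` column gives the same daily totals.
daily_alt = df.resample("D", on="time")["alerts"].sum()
```

One thing to keep in mind: with freq='D', a calendar day that has no rows between the first and last timestamps still gets a bin with a sum of 0, so the difference for the following day is taken against 0.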

huangapple
  • Posted on February 16, 2023, 15:06:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75468853.html