February 16, 2023 15:06:34

Calculate difference of rows in Pandas

Question

I have a time-series dataframe in which some rows carry alerts. The dataframe looks like this:

machineID    time                vibration    alerts
    1        2023-02-15 11:45    220          1
    1        2023-02-15 12:00    221          0
    1        2023-02-15 12:15    219          0
    1        2023-02-15 12:30    220          1
    1        2023-02-16 11:45    220          1
    1        2023-02-16 12:00    221          1
    1        2023-02-16 12:15    219          0
    1        2023-02-16 12:30    220          1

I want to calculate the day-to-day difference in the alerts column. Because the time column is recorded at 15-minute intervals, I can't work out how to group by whole day, i.e. sum the alerts for each day and compare that with the sum of alerts for the previous day.

In short, I need a way to sum all alerts for each day and subtract the previous day's sum. The result should be a new dataframe with a date column and a difference-of-alerts column. In this case, the new dataframe would be:

time     diff_alerts
2023-02-16    1

since the second day, 2023-02-16, has 1 more alert than the previous day.

Answer 1

Score: 5

Group by day with a custom pd.Grouper, sum the alerts, and then compute the difference from the previous day:

>>> (df.groupby(pd.Grouper(key='time', freq='D'))['alerts'].sum().diff()
...    .dropna().rename('diff_alerts').astype(int).reset_index())
        time  diff_alerts
0 2023-02-16            1

Note: the second line of code is only there to produce clean output. pd.Grouper with freq needs the time column to have a datetime dtype, so convert it with pd.to_datetime first if it is stored as strings.
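For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The dataframe is rebuilt by hand from the sample rows in the question, and the intermediate names `daily` and `result` are only illustrative; they are not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question; 'time' is parsed to a datetime dtype,
# which pd.Grouper(freq=...) requires.
df = pd.DataFrame({
    "machineID": [1] * 8,
    "time": pd.to_datetime([
        "2023-02-15 11:45", "2023-02-15 12:00",
        "2023-02-15 12:15", "2023-02-15 12:30",
        "2023-02-16 11:45", "2023-02-16 12:00",
        "2023-02-16 12:15", "2023-02-16 12:30",
    ]),
    "vibration": [220, 221, 219, 220, 220, 221, 219, 220],
    "alerts": [1, 0, 0, 1, 1, 1, 0, 1],
})

# Step 1: total alerts per calendar day.
daily = df.groupby(pd.Grouper(key="time", freq="D"))["alerts"].sum()

# Step 2: difference from the previous day. The first day has no
# predecessor, so its NaN is dropped before casting back to int.
result = (
    daily.diff()
    .dropna()
    .astype(int)
    .rename("diff_alerts")
    .reset_index()
)

print(result)
```

One thing to keep in mind: with freq='D' the grouper produces a bin for every calendar day between the first and last timestamp, so a day with no rows at all is treated as a sum of 0 rather than being skipped.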
