2023-06-09 05:42:27


How can I begin calculating cumsum() on a specific date within a dataframe?

Question

Is there a way to start computing cumsum() on a specific date within a Pandas dataframe?

Given the following dataframe, I am able to calculate the cumsum() for all of the rows.

import pandas as pd
df = pd.DataFrame([
    {'Date': '2022-01-01', 'Confirmed': 7},
    {'Date': '2022-01-02', 'Confirmed': 4},
    {'Date': '2022-01-03', 'Confirmed': 12},
    {'Date': '2022-01-03', 'Confirmed': 2},
    {'Date': '2022-01-04', 'Confirmed': 9},
    {'Date': '2022-01-05', 'Confirmed': 10},
])
df["Total Confirmed"] = df["Confirmed"].cumsum()

However, I would like to calculate the cumsum() starting on a specific date. For example, I would like to begin calculating cumsum() at the first occurrence of 2022-01-03, so that the rows before it do not contribute to the running total.

I noticed the shift() method, but it only shifts by a number of rows, and the cumsum() still starts from the first row.

Answer 1

Score: 1

You can try:

ser = df["Confirmed"].where(df["Date"].eq("2022-01-03").cummax(), 0)
df["Total Confirmed"] = ser.cumsum()
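
To see why the first approach works, it may help to look at the intermediate mask. Below is a minimal sketch on the question's data; the names is_start and started are only illustrative and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-02", "2022-01-03",
             "2022-01-03", "2022-01-04", "2022-01-05"],
    "Confirmed": [7, 4, 12, 2, 9, 10],
})

# True only on the rows whose date equals the start date
is_start = df["Date"].eq("2022-01-03")
# cummax() keeps the mask True from the first match onward
started = is_start.cummax()
# where() keeps values where the mask is True and puts 0 elsewhere
ser = df["Confirmed"].where(started, 0)

print(started.tolist())       # [False, False, True, True, True, True]
print(ser.cumsum().tolist())  # [0, 0, 12, 14, 23, 33]
```

Because the mask is built from row order, everything after the first matching row is included, whatever its date.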

Another variant:

start = df["Date"].eq("2022-01-03").idxmax()  # index label of the first matching row
masked = df["Confirmed"].where(df.index >= start)  # NaN before the start row; assumes a sorted index
df["Total Confirmed"] = masked.cumsum().fillna(0).astype(int)

Output:

print(df)
         Date  Confirmed  Total Confirmed
0  2022-01-01          7                0
1  2022-01-02          4                0
2  2022-01-03         12               12
3  2022-01-03          2               14
4  2022-01-04          9               23
5  2022-01-05         10               33
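
If this is needed in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: cumsum_from and its parameters are hypothetical names, and it assumes the dates can be parsed by pd.to_datetime. Note that when the start date never occurs, the mask stays False and the total is 0 everywhere.

```python
import pandas as pd

def cumsum_from(df, start_date, date_col="Date", value_col="Confirmed"):
    """Running total of value_col starting at the first row matching start_date.

    Hypothetical helper for illustration; not part of pandas.
    """
    dates = pd.to_datetime(df[date_col])
    # True from the first row whose date equals start_date onward
    started = dates.eq(pd.Timestamp(start_date)).cummax()
    return df[value_col].where(started, 0).cumsum()

df = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-02", "2022-01-03",
             "2022-01-03", "2022-01-04", "2022-01-05"],
    "Confirmed": [7, 4, 12, 2, 9, 10],
})

df["Total Confirmed"] = cumsum_from(df, "2022-01-03")
print(df["Total Confirmed"].tolist())          # [0, 0, 12, 14, 23, 33]
print(cumsum_from(df, "2021-12-31").tolist())  # [0, 0, 0, 0, 0, 0]
```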

Answer 2

Score: 1

(df["Confirmed"] * (df["Date"] >= "2022-01-03")).cumsum()
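
Multiplying by the boolean mask zeroes out every row dated before 2022-01-03, and the plain string comparison works because ISO-formatted dates (YYYY-MM-DD) sort the same way as the dates they represent. Here is a small sketch, using an assumed out-of-order frame named unsorted, of how this differs from the cummax() approach when rows are not sorted by date:

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-02", "2022-01-03",
             "2022-01-03", "2022-01-04", "2022-01-05"],
    "Confirmed": [7, 4, 12, 2, 9, 10],
})

# On date-sorted data both approaches give the same totals
total = (df["Confirmed"] * (df["Date"] >= "2022-01-03")).cumsum()
print(total.tolist())  # [0, 0, 12, 14, 23, 33]

# Made-up frame whose second row is dated before the start date
unsorted = pd.DataFrame({
    "Date": ["2022-01-03", "2022-01-01", "2022-01-04"],
    "Confirmed": [5, 1, 2],
})

# Filter by value: the 2022-01-01 row is excluded
by_value = (unsorted["Confirmed"] * (unsorted["Date"] >= "2022-01-03")).cumsum()
# Filter by position: every row after the first match is included
by_position = unsorted["Confirmed"].where(
    unsorted["Date"].eq("2022-01-03").cummax(), 0
).cumsum()

print(by_value.tolist())     # [5, 5, 7]
print(by_position.tolist())  # [5, 6, 8]
```

Which one is right depends on whether "starting on a date" means rows on or after that date, or rows from the first match onward.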
