2023-06-19 21:15:19

How can I backfill a decremental counter up a sparse pandas column?

Question

How do I create "Column B" from "Column A"?

    Column A  Column B
0        1.0       1.0
1        NaN       NaN
2        NaN       NaN
3        NaN       1.0
4        NaN       2.0
5        NaN       3.0
6        NaN       4.0
7        5.0       5.0
8        NaN       NaN
9        NaN       NaN
10       NaN       NaN
11       NaN       1.0
12       NaN       2.0
13       3.0       3.0
14       NaN       NaN
15       NaN       1.0
16       NaN       2.0
17       NaN       3.0
18       4.0       4.0

Answer 1

Score: 2

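Assuming empty cells are NaN, you can use a custom [`groupby`](https://pandas.pydata.org/docs/reference/api/pandas.Series.groupby.html) operation with a descending [`cumcount`](https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.SeriesGroupBy.cumcount.html) and [`transform('last')`](https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.SeriesGroupBy.transform.html):

# group each run of consecutive NaNs with the non-NaN value that ends it
# (the reversed cumsum labels groups from the bottom up; groupby realigns on the index)
g = df.groupby(df.loc[::-1, 'Column A'].notna().cumsum())['Column A']
# compute the descending cumcount and subtract it from the group's last (non-NaN) value
s = g.transform('last').sub(g.cumcount(ascending=False))
# keep only strictly positive values; everything else becomes NaN
df['Column B'] = s.where(s.gt(0))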

Output:

    Column A  Column B
0        1.0       1.0
1        NaN       NaN
2        NaN       NaN
3        NaN       1.0
4        NaN       2.0
5        NaN       3.0
6        NaN       4.0
7        5.0       5.0
8        NaN       NaN
9        NaN       NaN
10       NaN       NaN
11       NaN       1.0
12       NaN       2.0
13       3.0       3.0
14       NaN       NaN
15       NaN       1.0
16       NaN       2.0
17       NaN       3.0
18       4.0       4.0
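
For anyone who wants to reproduce the output above end to end, here is a minimal self-contained sketch. The helper name `backfill_countdown` and the way the sample frame is built are my own assumptions for illustration; the three core lines are the same technique as in the answer, applied to a Series instead of the DataFrame.

```python
import numpy as np
import pandas as pd


def backfill_countdown(col: pd.Series) -> pd.Series:
    """Hypothetical helper: count down from each non-NaN value towards the rows above it."""
    # Reading the column bottom-up, every non-NaN value starts a new group,
    # so each group is a run of NaNs followed by the value that ends it.
    groups = col[::-1].notna().cumsum()
    g = col.groupby(groups)  # groupby aligns the reversed labels on the index
    # Group's last non-NaN value minus the distance (in rows) to the end of the group.
    s = g.transform('last').sub(g.cumcount(ascending=False))
    # Values that would be zero or negative are replaced with NaN.
    return s.where(s.gt(0))


nan = np.nan
# Sample data rebuilt by hand from the table in the question.
values = [1, nan, nan, nan, nan, nan, nan, 5,
          nan, nan, nan, nan, nan, 3,
          nan, nan, nan, nan, 4]
df = pd.DataFrame({'Column A': values})
df['Column B'] = backfill_countdown(df['Column A'])
print(df)
```

Note that rows after the final non-NaN value (none in this sample) would end up in a group with no value to count down from, so they should simply stay NaN.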

Intermediates:

    Column A  Column B  group  cumcount  last    s    s>0
0        1.0       1.0      4         0   1.0  1.0   True
1        NaN       NaN      3         6   5.0 -1.0  False
2        NaN       NaN      3         5   5.0  0.0  False
3        NaN       1.0      3         4   5.0  1.0   True
4        NaN       2.0      3         3   5.0  2.0   True
5        NaN       3.0      3         2   5.0  3.0   True
6        NaN       4.0      3         1   5.0  4.0   True
7        5.0       5.0      3         0   5.0  5.0   True
8        NaN       NaN      2         5   3.0 -2.0  False
9        NaN       NaN      2         4   3.0 -1.0  False
10       NaN       NaN      2         3   3.0  0.0  False
11       NaN       1.0      2         2   3.0  1.0   True
12       NaN       2.0      2         1   3.0  2.0   True
13       3.0       3.0      2         0   3.0  3.0   True
14       NaN       NaN      1         4   4.0  0.0  False
15       NaN       1.0      1         3   4.0  1.0   True
16       NaN       2.0      1         2   4.0  2.0   True
17       NaN       3.0      1         1   4.0  3.0   True
18       4.0       4.0      1         0   4.0  4.0   True
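
If it helps to see where each intermediate column comes from, the table above can be rebuilt with a sketch along these lines. The `debug` frame and its construction are assumptions of mine, not part of the original answer; the `Column B` column is left out because it is just `s` masked by `s>0`.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Same sample column as in the question, rebuilt by hand.
col = pd.Series([1, nan, nan, nan, nan, nan, nan, 5,
                 nan, nan, nan, nan, nan, 3,
                 nan, nan, nan, nan, 4], name='Column A')

# Group labels counted from the bottom; sort_index() restores the 0..18 order.
groups = col[::-1].notna().cumsum().sort_index()
g = col.groupby(groups)

cumcount = g.cumcount(ascending=False)  # rows remaining until the end of the group
last = g.transform('last')              # last non-NaN value of each group
s = last - cumcount

debug = pd.DataFrame({
    'Column A': col,
    'group': groups,
    'cumcount': cumcount,
    'last': last,
    's': s,
    's>0': s.gt(0),
})
print(debug)
```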


