March 3, 2023, 20:33:42

How can I subtract the preceding column from each column in pandas?

Question

How can I subtract the preceding column from each column, for many columns, without hard-coding it? I can do it by hard-coding, as shown below:

import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 3, 5, 6], "c": [6, 7, 8, 9]})

df['a_diff'] = df['a'] - 16
df['b_diff'] = df['b'] - df['a']
df['c_diff'] = df['c'] - df['b']

I know there is a way to do it row-wise using the shift function. Can we also do it column-wise? There are 100 columns I need to apply this technique to, so I would rather do it pythonically than hard-code it. Please note that "a_diff" is intentionally computed by subtracting a constant, since I will have to subtract a constant from that column in my code as well.

Thank you,

Sam

Answer 1

Score: 1

Use diff and combine_first (or fillna, with some limitations), then rename with add_suffix and join to the original DataFrame:

out = df.join(df.diff(axis=1).combine_first(df[['a']].sub(16)).add_suffix('_diff'))

Or, if you are sure that there is no NaN in the columns other than "a":

out = df.join(df.diff(axis=1).fillna(df[['a']].sub(16)).add_suffix('_diff'))

Output:

   a  b  c  a_diff  b_diff  c_diff
0  1  1  6   -15.0       0       5
1  2  3  7   -14.0       1       4
2  3  5  8   -13.0       2       3
3  4  6  9   -12.0       2       3
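As a quick sanity check, the combine_first approach above can be compared against the hard-coded columns from the question. This is only a minimal sketch: the names expected, diffs and matches are illustrative and not part of the original answer, and values are compared as floats because the resulting dtypes may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 3, 5, 6], "c": [6, 7, 8, 9]})

# Hard-coded reference, exactly as written in the question.
expected = pd.DataFrame({
    "a_diff": df["a"] - 16,
    "b_diff": df["b"] - df["a"],
    "c_diff": df["c"] - df["b"],
})

# The column-wise diff leaves the first column as NaN; combine_first
# fills those holes from the one-column frame df[["a"]] - 16.
diffs = df.diff(axis=1).combine_first(df[["a"]].sub(16)).add_suffix("_diff")

# Compare values only (cast to float so int/float dtypes do not matter).
matches = bool(
    (diffs[expected.columns].astype(float) == expected.astype(float)).all().all()
)
print(matches)
```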

Answer 2

Score: 0

Use DataFrame.diff, set the first column with DataFrame.fillna, rename with DataFrame.add_suffix, and finally append to the original with DataFrame.join:

df = df.join(df.diff(axis=1).fillna({'a': df['a'].sub(16)}).add_suffix('_diff'))

Solution without hard-coding the name of the first column:

first = df.columns[0]
df = df.join(df.diff(axis=1).fillna({first: df[first].sub(16)}).add_suffix('_diff'))
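To illustrate why the dict form is a reasonable default, here is a small sketch on made-up data with a missing value in a non-first column. The frame and the names by_dict and by_frame are assumptions for illustration only. With a dict, only the column named by first is filled; when a whole DataFrame is passed as the fill value, every NaN is filled, including ones caused by missing input data.

```python
import numpy as np
import pandas as pd

# Illustrative data: column "b" has a missing value in row 1.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [1.0, np.nan, 5.0],
    "c": [6.0, 7.0, 8.0],
})

first = df.columns[0]
diffs = df.diff(axis=1)

# Dict form: only the first column is filled; other NaNs are left alone.
by_dict = diffs.fillna({first: df[first].sub(16)})

# Frame form: every NaN is replaced by (value - 16) from the same cell,
# so the NaN in "c" (caused by the missing "b") becomes 7 - 16 = -9.
by_frame = diffs.fillna(df.sub(16))

print(by_dict)
print(by_frame)
```

In by_frame, the value in row 1 of column "c" is no longer a column difference, which is why that variant is only safe when the original data has no missing values.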

Or set the first column of the difference directly:

df1 = df.diff(axis=1)
df1.iloc[:, 0] = df.iloc[:, 0].sub(16)
df = df.join(df1.add_suffix('_diff'))

Solution if the original DataFrame has no missing values:

df = df.join(df.diff(axis=1).fillna(df.sub(16)).add_suffix('_diff'))

print(df)
   a  b  c  a_diff  b_diff  c_diff
0  1  1  6     -15       0       5
1  2  3  7     -14       1       4
2  3  5  8     -13       2       3
3  4  6  9     -12       2       3
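For a frame with many columns, the steps above could be wrapped in a small helper. This is a sketch under assumptions, not part of the original answer: the function name add_column_diffs and its parameters first_offset and suffix are hypothetical, and it assumes unique column names and purely numeric columns.

```python
import pandas as pd


def add_column_diffs(df, first_offset, suffix="_diff"):
    """Return df with one extra '<column><suffix>' column per original column.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Each column minus the column to its left; the first column becomes NaN.
    diffs = df.diff(axis=1)
    first = df.columns[0]
    # The first column has no left neighbour, so subtract the constant instead.
    diffs[first] = df[first] - first_offset
    return df.join(diffs.add_suffix(suffix))


df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 3, 5, 6], "c": [6, 7, 8, 9]})
out = add_column_diffs(df, first_offset=16)
print(out)
```

The helper returns a new frame and leaves the input untouched, so it can be applied to a 100-column frame without naming any column explicitly.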
