April 19, 2023, 22:42:09

Pandas not summing values in two numeric columns

Question

I have a dataframe like this:

A	B
2	DIV0
3	DIV0
5	DIV0
DIV0	3

I want to add a 3rd column 'C' which would be the sum of values in A & B:

A	B	C
2	DIV0	2
3	DIV0	3
5	DIV0	5
DIV0	3	3

In my current code, the DIV0 values are removed and A and B are summed by the following lines:

df["A"] = pd.to_numeric(df["A"], errors="coerce")
df["B"] = pd.to_numeric(df["B"], errors="coerce")
df["C"] = df["A"] + df["B"]

However, this gives me an empty C column. I've tried researching numeric columns but can't understand why this is happening. Thanks.

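Answer 1

Score: 3

After coercion, every row has NaN in one of the two columns, and any arithmetic operation (such as +) involving NaN returns NaN, so the whole C column ends up as NaN. Series.add leaves missing values unfilled by default (fill_value=None), so you need to set the fill value to 0.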

s1 = pd.to_numeric(df["A"], errors="coerce")
s2 = pd.to_numeric(df["B"], errors="coerce")
df["C"] = s1.add(s2, fill_value=0)

Another variant, useful if you have a lot of columns, uses sum (which skips NaN by default):

df["C"] = df.apply(pd.to_numeric, errors="coerce").sum(axis=1)

Output:

print(df)
      A     B    C
0     2  DIV0  2.0
1     3  DIV0  3.0
2     5  DIV0  5.0
3  DIV0     3  3.0

Answer 2

Score: 1

You could do something like this:

df['sum'] = (df[['A', 'B']].apply(lambda x: pd.to_numeric(x, errors='coerce')).sum(axis=1, min_count=1))
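The min_count=1 argument only matters for rows where every column is NaN. The sketch below illustrates the difference; it uses the question's column names A and B and a made-up third row where both values are DIV0 (that row is an assumption, it is not in the original data).

```python
import pandas as pd

# Hypothetical data: the last row has DIV0 in both columns.
df = pd.DataFrame({
    "A": ["2", "DIV0", "DIV0"],
    "B": ["DIV0", "3", "DIV0"],
})

numeric = df[["A", "B"]].apply(pd.to_numeric, errors="coerce")

default_sum = numeric.sum(axis=1)               # all-NaN row sums to 0.0
strict_sum = numeric.sum(axis=1, min_count=1)   # all-NaN row stays NaN

print(default_sum.tolist())   # [2.0, 3.0, 0.0]
print(strict_sum.tolist())    # [2.0, 3.0, nan]
```

So min_count=1 is probably the safer choice if a row with no valid numbers should show up as missing rather than as 0.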

