2023-02-26 22:09:09

Replace missing values based on values in another column

Question

I have the following problem:

I need to replace the NaN values in a DataFrame:

df1 = pd.DataFrame([[1001, np.nan], [1001, 'C'], [1004, 'D'], [1005, 'C'],
                    [1005, 'D'], [1010, np.nan], [1010, np.nan], [1010, 'F']],
                   columns=['CustomerNr', 'Costs'])

CustomerNr	Costs
1001	NaN
1004	D
1005	C
1010	NaN
1010	NaN

I've tried:

df2 = pd.DataFrame([[1001, 'X'], [1010, 'Y']], columns=['CustomerNr', 'New Costs'])

Desired output:

CustomerNr	Costs
1001	X
1004	D
1005	C
1010	Y
1010	Y

Answer 1

Score: 1

[Fill][1] `NA/NaN` values based on a series mapping (matched on `'CustomerNr'` values):

    # Assign the result back to the column; calling fillna(inplace=True) on a
    # selected column may not update df1 when pandas Copy-on-Write is active.
    df1['Costs'] = df1['Costs'].fillna(
        df1['CustomerNr'].map(df2.set_index('CustomerNr')['New Costs']))

----------

       CustomerNr Costs
    0        1001     X
    1        1001     C
    2        1004     D
    3        1005     C
    4        1005     D
    5        1010     Y
    6        1010     Y
    7        1010     F

  [1]: https://pandas.pydata.org/docs/reference/api/pandas.Series.fillna.html
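For anyone who wants to run the mapping approach from Answer 1 end to end, here is a minimal sketch. The helper name `fill_costs_from_lookup` is made up for illustration and is not part of the original answer, and the sketch assumes each `CustomerNr` appears only once in the lookup frame.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [[1001, np.nan], [1001, 'C'], [1004, 'D'], [1005, 'C'],
     [1005, 'D'], [1010, np.nan], [1010, np.nan], [1010, 'F']],
    columns=['CustomerNr', 'Costs'])
df2 = pd.DataFrame([[1001, 'X'], [1010, 'Y']],
                   columns=['CustomerNr', 'New Costs'])


def fill_costs_from_lookup(data, lookup):
    """Return a copy of `data` whose missing Costs are filled from `lookup`.

    Hypothetical helper: assumes `lookup` has unique CustomerNr values.
    """
    # Series indexed by CustomerNr, so map() can look each customer up.
    new_costs = lookup.set_index('CustomerNr')['New Costs']
    result = data.copy()
    # map() yields NaN for customers absent from `lookup`,
    # so gaps without a match simply stay NaN.
    result['Costs'] = result['Costs'].fillna(result['CustomerNr'].map(new_costs))
    return result


filled = fill_costs_from_lookup(df1, df2)
print(filled)
```

Working on a copy leaves `df1` untouched, which makes it easy to compare the frame before and after the fill.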
Answer 2

Score: 0

I think you could use something like this:

import pandas as pd
import numpy as np

df1 = pd.DataFrame([[1001, np.nan], [1001, 'C'], [1004, 'D'], [1005, 'C'],
                    [1005, 'D'], [1010, np.nan], [1010, np.nan], [1010, 'F']],
                   columns=['CustomerNr', 'Costs'])
# Map each CustomerNr to the value that should replace a missing Costs entry.
replace_dict = {1001: "X", 1010: "Y"}
# Row by row: look up the replacement when Costs is NaN, otherwise keep Costs.
df1['Costs'] = df1.apply(lambda x: replace_dict.get(x['CustomerNr']) if pd.isna(x['Costs']) else x['Costs'], axis=1)

Explanation: this creates a dictionary (replace_dict) that maps each CustomerNr to the value to assign, then uses apply() to assign that value where Costs is NaN and to keep the original Costs value otherwise. A customer with a missing Costs value and no entry in replace_dict gets None, because dict.get returns None for unknown keys.
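As a possible variation on the dictionary approach, `Series.map` also accepts a dict, so the same fill can be written column-wise instead of calling a lambda for every row. The sketch below is an illustration rather than part of the original answer; it computes both versions on the sample data and compares them.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [[1001, np.nan], [1001, 'C'], [1004, 'D'], [1005, 'C'],
     [1005, 'D'], [1010, np.nan], [1010, np.nan], [1010, 'F']],
    columns=['CustomerNr', 'Costs'])
replace_dict = {1001: 'X', 1010: 'Y'}

# Row-wise version from the answer: one Python-level call per row.
row_wise = df1.apply(
    lambda row: replace_dict.get(row['CustomerNr'])
    if pd.isna(row['Costs']) else row['Costs'],
    axis=1)

# Column-wise variant: map CustomerNr through the dict (unknown keys become
# NaN), then use the mapped values only where Costs is missing.
column_wise = df1['Costs'].fillna(df1['CustomerNr'].map(replace_dict))

print(column_wise.tolist())
```

On this sample both versions produce the same values. They can differ for a customer with a missing Costs value and no dictionary entry: the row-wise version yields None there, while the column-wise one leaves NaN.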
