How do I replace a string value in a specific column using method chaining?

Question

I have a pandas DataFrame in which some string values are "NA". I want to replace these values in one specific column (for example, 'strCol' in the example below) using method chaining.

How do I do this? (I googled quite a bit without success, even though this should be easy.)

Here is a minimal example:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': ['val1', 'val2', 'NA', 'val3']})
df = (
    df
    .rename(columns={'A': 'intCol', 'B': 'strCol'})  # method chain example operation 1
    .astype({'intCol': float})  # method chain example operation 2
    # .where(df['strCol'] == 'NA', pd.NA)  # how to replace the string 'NA' here? this does not work ...
)
df
```

Answer 1

Score: 2

You can try replace instead of where. A nested dict of the form {column: {old: new}} restricts the replacement to the named column:

```python
df.replace({'strCol': {'NA': pd.NA}})
```
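
As a minimal sketch of how this replace call could sit inside the chain from the question, assuming the same example data (the name `result` is introduced here only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': ['val1', 'val2', 'NA', 'val3']})

result = (
    df
    .rename(columns={'A': 'intCol', 'B': 'strCol'})
    .astype({'intCol': float})
    # Nested dict: only look in column 'strCol' for the string 'NA'.
    .replace({'strCol': {'NA': pd.NA}})
)
print(result)
```

Because the column name is part of the argument, no reference to the intermediate DataFrame is needed, which is what makes this form convenient in a chain.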

Answer 2

Score: 0

Use a lambda in the where clause so that the condition is evaluated against the chained DataFrame:

```python
df = (df.rename(columns={'A': 'intCol', 'B': 'strCol'})
        .astype({'intCol': float})
        .where(lambda x: x['strCol'] == 'NA', pd.NA))
```

Output:

```
>>> df
   intCol strCol
0     NaN   <NA>
1     NaN   <NA>
2     3.0     NA
3     NaN   <NA>
```

Many methods, such as where, mask, groupby, and apply, accept a callable, so you can pass a lambda function.
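
Note that in the output above, where keeps the row whose strCol equals 'NA' and blanks every other row in all columns, which is the opposite of what the question asks for. One possible way to keep the lambda idea but limit the change to a single column is to combine assign with Series.mask. This is a sketch under the assumption that only 'strCol' should change; `result` and the lambda parameter `d` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': ['val1', 'val2', 'NA', 'val3']})

result = (
    df
    .rename(columns={'A': 'intCol', 'B': 'strCol'})
    .astype({'intCol': float})
    # assign receives the intermediate DataFrame as `d`;
    # mask replaces values where the condition is True.
    .assign(strCol=lambda d: d['strCol'].mask(d['strCol'] == 'NA', pd.NA))
)
print(result)
```

Here intCol is left untouched because only the strCol Series is passed through mask.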

Answer 3

Score: 0

pandas.DataFrame.where does the following:

> Replace values where the condition is False.

So the condition must not hold where you want to make the replacement. A simple example:

```python
import pandas as pd
df = pd.DataFrame({'x': [1, 2, 3, 4, 5, 6, 7, 8, 9]})
df2 = df.where(df.x % 2 == 0, -1)
print(df2)
```

gives the output

```
   x
0 -1
1  2
2 -1
3  4
4 -1
5  6
6 -1
7  8
8 -1
```

Observe that the odd values were replaced by -1, whereas the even values, for which the condition holds, were kept.
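
To make the direction of the condition concrete, the following sketch compares where with its counterpart mask, which replaces values where the condition is True. It reuses the example data above; `df3` is a name introduced here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4, 5, 6, 7, 8, 9]})

# where: keep values where the condition is True, replace the rest.
df2 = df.where(df.x % 2 == 0, -1)

# mask: replace values where the condition is True, keep the rest.
df3 = df.mask(df.x % 2 != 0, -1)

# Both calls replace the odd values with -1.
print(df2.equals(df3))
```

If the condition reads more naturally as "the values I want to replace", mask is usually the easier of the two to get right.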

huangapple
  • Published on February 7, 2023, 00:39:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75364133.html