How do I replace a string value in a specific column using method chaining?

Question

I have a pandas DataFrame in which some string values are "NA". I want to replace these values in a specific column (`strCol` in the example below) using method chaining.

How do I do this? (I googled quite a bit without success, even though this should be easy.)

Here is a minimal example:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': ['val1', 'val2', 'NA', 'val3']})

df = (
    df
    .rename(columns={'A': 'intCol', 'B': 'strCol'})  # method chain example operation 1
    .astype({'intCol': float})  # method chain example operation 2
    # .where(df['strCol'] == 'NA', pd.NA)  # how to replace the string 'NA' here? this does not work ...
)
df
```

Answer 1

Score: 2

You can try `replace` instead of `where`:

    df.replace({'strCol': {'NA': pd.NA}})

The nested dictionary limits the replacement to the `strCol` column.
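As a minimal sketch, assuming the example frame from the question, the `replace` call can be dropped into the chain as a third step. The name `result` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': ['val1', 'val2', 'NA', 'val3']})

result = (
    df
    .rename(columns={'A': 'intCol', 'B': 'strCol'})
    .astype({'intCol': float})
    # Nested dict: only look in column 'strCol' for the string 'NA'
    .replace({'strCol': {'NA': pd.NA}})
)

# Only the third row of strCol is now missing
print(result['strCol'].isna().tolist())
```

Because the outer key names the column, an "NA" string in any other column would be left untouched.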

Answer 2

Score: 0

Use a lambda in the `where` clause so that the condition is evaluated on the chained DataFrame:

    df = (df.rename(columns={'A': 'intCol', 'B': 'strCol'})
            .astype({'intCol': float})
            .where(lambda x: x['strCol'] == 'NA', pd.NA))

Output:

    >>> df
       intCol strCol
    0     NaN   <NA>
    1     NaN   <NA>
    2     3.0     NA
    3     NaN   <NA>

Note that `where` keeps the values where the condition is True and replaces all others, in every column. The call above therefore keeps only the row in which `strCol` equals 'NA' and sets everything else to missing, as the output shows. To replace just the 'NA' strings, invert the condition (`!= 'NA'`) and apply it to the `strCol` column alone.

Many methods, such as `where`, `mask`, `groupby`, and `apply`, accept a callable, so you can pass a lambda function.
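One way to make the lambda approach target only `strCol` is to combine it with `assign` and `Series.mask`, which replaces values where the condition is True. This is a sketch rather than the answer author's code, and `result` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': ['val1', 'val2', 'NA', 'val3']})

result = (
    df
    .rename(columns={'A': 'intCol', 'B': 'strCol'})
    .astype({'intCol': float})
    # The lambda receives the frame as it exists at this point in the chain,
    # so the renamed column 'strCol' is available.
    .assign(strCol=lambda x: x['strCol'].mask(x['strCol'] == 'NA', pd.NA))
)

print(result['strCol'].isna().tolist())
```

Since `assign` overwrites a single column, `intCol` passes through unchanged.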

Answer 3

Score: 0

According to its documentation, `pandas.DataFrame.where` does the following:

> Replace values where the condition is False.

So the condition must not hold wherever you want a replacement to happen. A simple example:

    import pandas as pd
    df = pd.DataFrame({'x': [1, 2, 3, 4, 5, 6, 7, 8, 9]})
    df2 = df.where(df.x % 2 == 0, -1)
    print(df2)

gives the output

       x
    0 -1
    1  2
    2 -1
    3  4
    4 -1
    5  6
    6 -1
    7  8
    8 -1

Observe that the odd values were replaced by -1, whilst the even values, for which the condition holds, were kept.
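Applied to the question's column, the same rule means the condition passed to `where` has to be True for the values you want to keep. Here is a small sketch on a standalone Series, with `mask` shown as the complementary method; the names `s`, `kept`, and `masked` are illustrative.

```python
import pandas as pd

s = pd.Series(['val1', 'val2', 'NA', 'val3'], name='strCol')

# where: keep values where the condition is True, replace the rest
kept = s.where(s != 'NA', pd.NA)

# mask: replace values where the condition is True, keep the rest
masked = s.mask(s == 'NA', pd.NA)

# Both calls mark the same single entry (position 2) as missing
print(kept.isna().tolist())
print(masked.isna().tolist())
```

Which of the two to use is mostly a matter of which condition reads more naturally.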

huangapple
  • Posted on February 7, 2023, 00:39:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75364133.html