Applying a function based on a condition on a pandas DataFrame Series


Question

I am new to pandas.

My DataFrame:

**df**

    A       B
    first   True
    second  False
    third   False
    fourth  True
    fifth   False


**Desired output**

    A       B      C
    first   True   en
    second  False
    third   False
    fourth  True   en
    fifth   False


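I am trying to fill column `C` by applying a function to column `A`, but only in the rows where column `B` is `True`.

**What I use**

```python
if (df['B'] == True):
    df['C'] = df['A'].apply(
        lambda x: TextBlob(x).detect_language())
```

But I get an error:

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

**What I've tried**

```python
df['B'].bool()
df['B'] is True
df['B'] == 'True'
```

But the error persists, and I am not sure how to write a statement that says "only where column `B` is `True`".

Thank you for your suggestions.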




# Answer 1
**Score**: 3

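If you want missing values in the rows that do not match, filter the rows before calling `apply`, so that only the rows where `B` is `True` are processed:

```python
from textblob import TextBlob  # needed by the lambda below

# Select column A only in rows where B is True, then apply the function.
# The assignment aligns on the index, so the remaining rows become NaN.
df['C'] = df.loc[df['B'], 'A'].apply(lambda x: TextBlob(x).detect_language())
print(df)
#         A      B    C
# 0   first   True   en
# 1  second  False  NaN
# 2   third  False  NaN
# 3  fourth   True   en
# 4   fifth  False  NaN
```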
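
For context on why the original `if` fails: `df['B'] == True` produces a whole boolean Series, while Python's `if` needs a single truth value, and that mismatch is what the `ValueError` reports. Below is a minimal, self-contained sketch of the mask-based approach. `detect_language_stub` is a hypothetical stand-in for `TextBlob(x).detect_language()`, assumed here only so the example runs without TextBlob; it always returns `'en'`.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["first", "second", "third", "fourth", "fifth"],
    "B": [True, False, False, True, False],
})


def detect_language_stub(text):
    # Hypothetical placeholder for TextBlob(text).detect_language().
    return "en"


# A Series has no single truth value, so using one in `if` raises ValueError.
error_message = None
try:
    if df["B"] == True:  # noqa: E712 (mirrors the code from the question)
        pass
except ValueError as exc:
    error_message = str(exc)

# Boolean indexing selects rows instead of testing the column as a whole.
mask = df["B"]
df["C"] = df.loc[mask, "A"].apply(detect_language_stub)
print(error_message)
print(df)
```

The mask does the work the `if` was meant to do: it picks the rows, and index alignment on assignment leaves the unselected rows as `NaN`.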

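Or, if you need empty strings in the non-matching rows, use `np.where`. Note that in this version the function is still applied to every row of `A`; `np.where` only decides which results are kept:

```python
import numpy as np

df['C'] = np.where(df['B'], df['A'].apply(lambda x: TextBlob(x).detect_language()), '')
print(df)
#         A      B   C
# 0   first   True  en
# 1  second  False
# 2   third  False
# 3  fourth   True  en
# 4   fifth  False
```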
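
If calling the function on every row is costly, one possible variation (not part of the original answer) is to apply it to the filtered rows only and then fill the gaps with empty strings. The sketch below assumes a hypothetical `detect_language_stub` that records its calls, so the two approaches can be compared by how often they invoke the function.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": ["first", "second", "third", "fourth", "fifth"],
    "B": [True, False, False, True, False],
})

calls = []


def detect_language_stub(text):
    # Hypothetical placeholder for TextBlob(text).detect_language();
    # it records each input so the number of calls can be counted.
    calls.append(text)
    return "en"


# np.where receives an already computed Series, so apply runs on all rows.
all_rows = np.where(df["B"], df["A"].apply(detect_language_stub), "")
calls_with_where = len(calls)

calls.clear()

# Apply to the filtered rows only, then expand back to the full index,
# filling the rows that were skipped with empty strings.
filtered = df.loc[df["B"], "A"].apply(detect_language_stub)
df["C"] = filtered.reindex(df.index, fill_value="")
calls_with_mask = len(calls)

print(calls_with_where, calls_with_mask)
print(df)
```

Both variants give the same column `C`; the difference is only in how many times the function runs, which matters when each call is slow.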

huangapple
  • Posted on January 6, 2020, 18:33:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59610513.html