Applying function based on condition on pandas dataframe series
Question

I am new to pandas.

My dataframe:

**df**

```
A       B
first   True
second  False
third   False
fourth  True
fifth   False
```

**Desired output**

```
A       B      C
first   True   en
second  False
third   False
fourth  True   en
fifth   False
```

I am trying to apply a function to fill column `C` only in the rows where column `B` is `True`.

**What I use**

```python
if (df['B'] == True):
    df['C'] = df['A'].apply(
        lambda x: TextBlob(x).detect_language())
```

But I get an error:

```
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
```

**What I've tried**

```python
df['B'].bool()
df['B'] is True
df['B'] == 'True'
```

But the error persists, and I am not sure how to form a statement meaning "only where column B is True".

Thank you for your suggestions.
# Answer 1

**Score**: 3
If you want missing values in the rows that do not match, filter the rows before calling `apply`, so that only the rows where `B` is `True` are processed:

```python
df['C'] = df.loc[df['B'], 'A'].apply(lambda x: TextBlob(x).detect_language())
print(df)
        A      B    C
0   first   True   en
1  second  False  NaN
2   third  False  NaN
3  fourth   True   en
4   fifth  False  NaN
```
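To make the mechanics concrete, here is a minimal sketch of why the `if` statement fails and why the masked assignment works. `detect_language` is a hypothetical stand-in for `TextBlob(x).detect_language()` (an assumption, so the example runs without TextBlob or a network connection); the rest uses the dataframe from the question.

```python
import pandas as pd

def detect_language(text):
    # Hypothetical stand-in for TextBlob(text).detect_language();
    # it always answers 'en' so the sketch runs offline.
    return 'en'

df = pd.DataFrame({
    'A': ['first', 'second', 'third', 'fourth', 'fifth'],
    'B': [True, False, False, True, False],
})

# An `if` needs one boolean, but the comparison yields one value per row,
# so pandas refuses to reduce it and raises ValueError.
try:
    if df['B'] == True:
        pass
    raised = False
except ValueError:
    raised = True

# Use B as a boolean mask: keep only the rows where it is True,
# then apply the function to those rows of A.
selected = df.loc[df['B'], 'A'].apply(detect_language)

# Assignment aligns on the index, so the rows that were filtered out
# have no value to receive and end up as NaN.
df['C'] = selected
print(df)
```

The key point is that a boolean Series works as a row selector inside `df.loc[...]`, whereas Python's `if` can only branch once for the whole Series.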
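Alternatively, if you need empty strings for the non-matching rows, use `np.where`. Note that in this version `apply` still processes every row of `A`, not only the matching ones:

```python
df['C'] = np.where(df['B'], df['A'].apply(lambda x: TextBlob(x).detect_language()), '')
print(df)
        A      B   C
0   first   True  en
1  second  False
2   third  False
3  fourth   True  en
4   fifth  False
```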
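The difference in how many rows get processed can matter when the applied function is slow. The sketch below counts the calls for both variants. `detect_language` and the `calls` list are hypothetical helpers introduced only for illustration, and the `reindex(..., fill_value='')` variant is one possible way to combine filtering with empty strings, not something the answer itself proposes.

```python
import numpy as np
import pandas as pd

calls = []

def detect_language(text):
    # Hypothetical stand-in for TextBlob(text).detect_language();
    # it records each input so the number of calls can be inspected.
    calls.append(text)
    return 'en'

df = pd.DataFrame({
    'A': ['first', 'second', 'third', 'fourth', 'fifth'],
    'B': [True, False, False, True, False],
})

# np.where receives fully computed arguments, so apply runs on all 5 rows
# and the results for the False rows are simply discarded.
df['C'] = np.where(df['B'], df['A'].apply(detect_language), '')
calls_with_where = len(calls)

# Filtering first calls the function only for the 2 matching rows;
# reindex then fills the rows that were skipped with ''.
calls.clear()
df['D'] = (
    df.loc[df['B'], 'A']
    .apply(detect_language)
    .reindex(df.index, fill_value='')
)
calls_with_filter = len(calls)

print(calls_with_where, calls_with_filter)
print(df)
```

Both columns end up with the same values; only the amount of work differs.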