Replace value that does not equal specific values
Question
I've looked at similar questions but haven't found an answer that fits my needs.
I have a data frame with a column answer where respondents can answer yes, no, maybe, or give some other response. The other response is free text, and for my purposes I just need to categorize it as other. I can't figure out how to replace values in a pandas DataFrame that do not equal any of several specific values. (I've seen answers for replacing values that do not equal one value, but I don't want to affect the rows with yes, no, and maybe.)
A sample data frame is below; any help is appreciated.
id | answer |
1 | Yes |
2 | Maybe |
3 | No |
4 | idk |
5 | wtf |
6 | wth |
Answer 1
Score: 1
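Use pandas.Series.where with an isin condition. Values that satisfy the condition are kept, and all others are replaced with 'other'. The comparison is case-sensitive.
df['new_answer'] = df['answer'].where(df['answer'].isin(['Yes', 'No', 'Maybe']), 'other')
Output:
   id answer new_answer
0   1    Yes        Yes
1   2  Maybe      Maybe
2   3     No         No
3   4    idk      other
4   5    wtf      other
5   6    wth      other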
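For a self-contained run of the approach above, here is a minimal sketch that rebuilds the sample frame from the question. The variable names allowed and keep are introduced only for illustration. The last step is an assumed variant, not part of the original answer: it overwrites the answer column in place via a boolean mask instead of creating a new column.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "answer": ["Yes", "Maybe", "No", "idk", "wtf", "wth"],
})

allowed = ["Yes", "No", "Maybe"]   # values to keep unchanged (illustrative name)
keep = df["answer"].isin(allowed)  # True where the answer is an allowed value

# where() keeps values where the condition is True and substitutes "other" elsewhere.
df["new_answer"] = df["answer"].where(keep, "other")

# Assumed in-place variant: overwrite only the rows that are NOT allowed.
df.loc[~keep, "answer"] = "other"

print(df)
```

Because isin returns False for missing values, NaN answers would also end up as "other" with either form.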
Answer 2
Score: 1
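Another option is to use categorical data. Values not listed in categories become NaN, which fillna then replaces with "other":
cats = pd.Categorical(df["answer"], categories=["Yes", "No", "Maybe", "other"])
# Reuse df's index so the assignment aligns even when the index is not the default 0..n-1
df["answer"] = pd.Series(cats, index=df.index).fillna("other")
Output:
print(df)
   id answer
0   1    Yes
1   2  Maybe
2   3     No
3   4  other
4   5  other
5   6  other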
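The categorical approach can be sketched end to end as follows. The sample frame comes from the question. The non-default index (10 to 15) is a hypothetical addition to show why index alignment matters: the sketch passes index=df.index when building the Series, so the assignment still lines up row by row.

```python
import pandas as pd

# Sample data from the question, with a hypothetical non-default index.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5, 6],
        "answer": ["Yes", "Maybe", "No", "idk", "wtf", "wth"],
    },
    index=[10, 11, 12, 13, 14, 15],
)

# Values that are not among the listed categories become NaN.
cats = pd.Categorical(df["answer"], categories=["Yes", "No", "Maybe", "other"])

# "other" is itself a category, so fillna can use it as the fill value.
df["answer"] = pd.Series(cats, index=df.index).fillna("other")

print(df)
print(df["answer"].dtype)
```

One design difference from the where/isin approach is that the resulting column has the category dtype rather than object, which may or may not be what you want downstream.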