Replace value that does not equal specific values

Question

I've tried looking at other similar questions but haven't found an adequate answer for my needs.

I have a data frame with a column answer where respondents can answer Yes, No, Maybe, or give some other response. The other response is free text, and for my purposes I just need to categorize it as "other". I can't quite figure out how to replace values in a pandas DataFrame that do not equal any of a few specific values (I've seen answers for replacing values that do not equal one value, but I don't want it to affect the rows with Yes, No, and Maybe).

A sample data frame is below; any help is appreciated.

    id | answer
    1  | Yes
    2  | Maybe
    3  | No
    4  | idk
    5  | wtf
    6  | wth

Answer 1

Score: 1

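Use pandas.Series.where with your condition. where keeps the values for which the condition is True and replaces all others with the second argument:

    # Keep 'Yes', 'No' and 'Maybe'; everything else becomes 'other'
    df['new_answer'] = df['answer'].where(df['answer'].isin(['Yes', 'No', 'Maybe']), 'other')

Output :

       id answer new_answer
    0   1    Yes        Yes
    1   2  Maybe      Maybe
    2   3     No         No
    3   4    idk      other
    4   5    wtf      other
    5   6    wth      other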
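
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same where/isin combination. The name allowed and the in_place copy are illustrative assumptions, and the loc-based variant is an alternative added here for comparison, not something the answer itself proposes.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "answer": ["Yes", "Maybe", "No", "idk", "wtf", "wth"],
})

allowed = ["Yes", "No", "Maybe"]  # hypothetical name for the values to keep

# where() keeps entries where the condition is True and replaces the rest
df["new_answer"] = df["answer"].where(df["answer"].isin(allowed), "other")

# Alternative (an assumption, not from the answer): overwrite the column
# itself by selecting the rows that are NOT in the allowed list
in_place = df.copy()
in_place.loc[~in_place["answer"].isin(allowed), "answer"] = "other"

print(df)
```

Note that isin returns False for missing values unless NaN is in the list, so with this sketch empty responses would also end up as other.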

Answer 2

Score: 1

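Another option is to use categorical data. Values that are not among the listed categories become NaN, which fillna then replaces with "other":

    import pandas as pd

    # Anything outside the listed categories is converted to NaN
    cats = pd.Categorical(df["answer"], categories=["Yes", "No", "Maybe", "other"])
    # Reuse df's index so the assignment aligns even if the index is not 0..n-1
    df["answer"] = pd.Series(cats, index=df.index).fillna("other")

Output :

    print(df)

       id answer
    0   1    Yes
    1   2  Maybe
    2   3     No
    3   4  other
    4   5  other
    5   6  other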
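
As a self-contained sketch of the categorical approach, the snippet below rebuilds the sample frame from the question and shows that the resulting column has the category dtype. The variable name categories is an illustrative assumption, and passing index=df.index is a small precaution added here so the new Series lines up with the frame's rows.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "answer": ["Yes", "Maybe", "No", "idk", "wtf", "wth"],
})

categories = ["Yes", "No", "Maybe", "other"]  # hypothetical name

# Values outside `categories` become NaN in the Categorical
cats = pd.Categorical(df["answer"], categories=categories)

# "other" is a declared category, so fillna is allowed to use it
df["answer"] = pd.Series(cats, index=df.index).fillna("other")

print(df)
print(df["answer"].dtype)
```

One reason to choose this form is that the column keeps a fixed set of categories, which can be convenient for later grouping or counting. If a plain string column is preferred, the where/isin approach avoids the dtype change.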

huangapple
  • Posted on May 18, 2023, 03:46:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76275699.html