Slice a string in np.where

Question

import numpy as np
import pandas as pd

data = [10, 20, 30, 40, 50, 60]
df = pd.DataFrame(data, columns=['Numbers'])
df['add_7'] = df['Numbers'] + 7

# One way to get the desired result: build the per-row first digit with apply, then select with np.where.
df['first_digit'] = np.where(df['add_7'] % 3 == 0, df['Numbers'].apply(lambda x: str(x)[0]), 'not a multiple of three')

This gives the result you expect:

| Numbers | add_7 | first_digit |
| ------- | ----- | ----------- |
| 10      | 17    | not a multiple of three |
| 20      | 27    | 2 |
| 30      | 37    | not a multiple of three |
| 40      | 47    | not a multiple of three |
| 50      | 57    | 5 |
| 60      | 67    | not a multiple of three |

Here, apply is used to take the first digit of each value in the Numbers column.

data = [10, 20, 30, 40, 50, 60]
df = pd.DataFrame(data, columns=['Numbers'])
df['add_7'] = df['Numbers'] + 7

Here, I have a dataframe that looks like this:

| Numbers | add_7 |
| ------- | ----- |
| 10      | 17    |
| 20      | 27    |
| 30      | 37    |
| 40      | 47    |
| 50      | 57    |
| 60      | 67    |

What I want to accomplish is this: if the value in add_7 is a multiple of 3, I want the first digit of Numbers as a string, otherwise "not a multiple of three", in a new column named "first_digit".

| Numbers | add_7 | first_digit |
| ------- | ----- | ----------- |
| 10      | 17    | not a multiple of three |
| 20      | 27    | 2 |
| 30      | 37    | not a multiple of three |
| 40      | 47    | not a multiple of three |
| 50      | 57    | 5 |
| 60      | 67    | not a multiple of three |

I tried the following, but it seems that inside np.where, df['Numbers'] is still a whole Series rather than the value of the current row, so df['Numbers'][0] always returns 10.

df['first_digit'] = np.where(df['add_7'] % 3 == 0, str(df['Numbers'][0]), 'not a multiple of three')

| Numbers | add_7 | first_digit |
| ------- | ----- | ----------- |
| 10      | 17    | not a multiple of three |
| 20      | 27    | 10 |
| 30      | 37    | not a multiple of three |
| 40      | 47    | not a multiple of three |
| 50      | 57    | 10 |
| 60      | 67    | not a multiple of three |

What is the right way to specify that I only want to operate on the field of this row, not the entire column, in np.where?

Answer 1

Score: 1


You're close:
> str(df['Numbers'][0])

This looks at the 0th value of the column and then stringifies that scalar, i.e., you get "10".
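
To see why every matching row ends up with "10", here is a small sketch that rebuilds the dataframe from the question and evaluates the expression on its own. The names `scalar` and `result` are only illustrative and do not appear in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["Numbers"])
df["add_7"] = df["Numbers"] + 7

# Indexing the Series with [0] picks the row labelled 0, which is one value.
scalar = str(df["Numbers"][0])
print(type(scalar), scalar)

# np.where broadcasts that single string to every row where the condition holds.
result = np.where(df["add_7"] % 3 == 0, scalar, "not a multiple of three")
print(result.tolist())
```

The second argument is evaluated once, before np.where ever looks at the condition, so there is no "current row" for it to refer to.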

You need to stringify the column, and then get the 0th element of each value:

df["Numbers"].astype(str).str[0]

Note that we use .str[0] to access the 0th element of each value in the column; a plain [0] would still access a single value, i.e., "10".
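
Putting it together, a minimal sketch of the full assignment might look like the following. It assumes the dataframe from the question; `first_chars` is a helper name introduced here for readability only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["Numbers"])
df["add_7"] = df["Numbers"] + 7

# Stringify the whole column, then take character 0 of each value.
first_chars = df["Numbers"].astype(str).str[0]

# Both branches are now aligned row by row with the condition.
df["first_digit"] = np.where(
    df["add_7"] % 3 == 0,
    first_chars,
    "not a multiple of three",
)
print(df)
```

Because `first_chars` has one entry per row, np.where can pick the matching element for each row instead of broadcasting a single scalar.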

huangapple
  • Posted on March 3, 2023, 23:57:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75629285.html