Slice a string in np.where
Question

    import numpy as np
    import pandas as pd

    data = [10, 20, 30, 40, 50, 60]
    df = pd.DataFrame(data, columns=['Numbers'])
    df['add_7'] = df['Numbers'] + 7

I have a DataFrame that looks like this:

Numbers | add_7 |
---|---|
10 | 17 |
20 | 27 |
30 | 37 |
40 | 47 |
50 | 57 |
60 | 67 |

What I want to accomplish: if the value in the `add_7` column is a multiple of 3, I want the first digit of `Numbers` as a string, otherwise "not a multiple of three". The result should go in a new column named `first_digit`:

Numbers | add_7 | first_digit |
---|---|---|
10 | 17 | not a multiple of three |
20 | 27 | 2 |
30 | 37 | not a multiple of three |
40 | 47 | not a multiple of three |
50 | 57 | 5 |
60 | 67 | not a multiple of three |

I tried the following, but it seems that inside `np.where`, `df['Numbers']` is still a Series instead of a single field, so `df['Numbers'][0]` will always return 10.

    df['first_digit'] = np.where(df['add_7'] % 3 == 0, str(df['Numbers'][0]), 'not a multiple of three')

This gives:

Numbers | add_7 | first_digit |
---|---|---|
10 | 17 | not a multiple of three |
20 | 27 | 10 |
30 | 37 | not a multiple of three |
40 | 47 | not a multiple of three |
50 | 57 | 10 |
60 | 67 | not a multiple of three |
What is the right way to specify that I only want to operate on the field of this row, not the entire column, in np.where?
Answer 1
Score: 1
You're close:

> `str(df['Numbers'][0])`

This looks at the 0th value of the column and then stringifies that scalar, i.e., you get "10".
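
To make that concrete, here is a minimal sketch that rebuilds the frame from the question and shows `np.where` broadcasting the single string to every matching row. The names `scalar` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["Numbers"])
df["add_7"] = df["Numbers"] + 7

# df["Numbers"][0] selects the value at index label 0, so this is one plain string.
scalar = str(df["Numbers"][0])

# np.where broadcasts that one scalar to every row where the condition holds.
result = np.where(df["add_7"] % 3 == 0, scalar, "not a multiple of three")
print(scalar)
print(result.tolist())
```

Every row that passes the condition receives the same value, which matches the output shown in the question.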

You need to stringify the column, and then get the 0th element of each value:

    df["Numbers"].astype(str).str[0]

Note that we use `.str[0]` to access the 0th element of each value in the column; a plain `[0]` would still access a single value, i.e., "10".
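
Putting it together with the `np.where` call from the question, a sketch of the full version might look like this. The intermediate name `first_chars` is just for readability and is not part of the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["Numbers"])
df["add_7"] = df["Numbers"] + 7

# Element-wise first character of each number, as a Series of strings.
first_chars = df["Numbers"].astype(str).str[0]

# np.where takes the value from first_chars where the condition is True,
# and the fallback string everywhere else.
df["first_digit"] = np.where(
    df["add_7"] % 3 == 0, first_chars, "not a multiple of three"
)
print(df)
```

Since `.str[0]` takes the first character, a negative number would yield "-" rather than a digit, so this assumes non-negative values as in the question.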