March 3, 2023 23:57:15

Slice a string in np.where

Question

import numpy as np
import pandas as pd

data = [10, 20, 30, 40, 50, 60]
df = pd.DataFrame(data, columns=['Numbers'])
df['add_7'] = df['Numbers'] + 7
# Use apply with a lambda to build the per-row first digit, then pass that Series to np.where.
df['first_digit'] = np.where(df['add_7'] % 3 == 0, df['Numbers'].apply(lambda x: str(x)[0]), 'not a multiple of three')

This gives the expected result:

| Numbers | add_7 | first_digit |
| ------- | ----- | ----------- |
| 10      | 17    | not a multiple of three |
| 20      | 27    | 2 |
| 30      | 37    | not a multiple of three |
| 40      | 47    | not a multiple of three |
| 50      | 57    | 5 |
| 60      | 67    | not a multiple of three |

Here, apply is used to get the first digit of the Numbers column in each row.

import numpy as np
import pandas as pd

data = [10, 20, 30, 40, 50, 60]
df = pd.DataFrame(data, columns=['Numbers'])
df['add_7'] = df['Numbers'] + 7

Here, I have a dataframe that looks like this:

| Numbers | add_7 |
| ------- | ----- |
| 10      | 17    |
| 20      | 27    |
| 30      | 37    |
| 40      | 47    |
| 50      | 57    |
| 60      | 67    |

What I want to accomplish: if the value in the add_7 column is a multiple of 3, I want the first digit of Numbers as a string; otherwise I want "not a multiple of three". The result should go in a new column named "first_digit".

| Numbers | add_7 | first_digit |
| ------- | ----- | ----------- |
| 10      | 17    | not a multiple of three |
| 20      | 27    | 2 |
| 30      | 37    | not a multiple of three |
| 40      | 47    | not a multiple of three |
| 50      | 57    | 5 |
| 60      | 67    | not a multiple of three |

I tried the following, but it seems that inside np.where, df['Numbers'] is still a Series rather than a single value, so df['Numbers'][0] always returns 10.

df['first_digit'] = np.where(df['add_7'] % 3 == 0, str(df['Numbers'][0]), 'not a multiple of three')

| Numbers | add_7 | first_digit |
| ------- | ----- | ----------- |
| 10      | 17    | not a multiple of three |
| 20      | 27    | 10 |
| 30      | 37    | not a multiple of three |
| 40      | 47    | not a multiple of three |
| 50      | 57    | 10 |
| 60      | 67    | not a multiple of three |

What is the right way to specify that I only want to operate on the field of this row, not the entire column, in np.where?

Answer 1

Score: 1


You're close:
> str(df['Numbers'][0])

This takes the 0th value of the column and then stringifies that scalar, i.e., you get "10".
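To make the behavior concrete, here is a minimal sketch that rebuilds the question's dataframe and reproduces the attempt. The variable names `scalar` and `attempt` are only illustrative and do not appear in the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["Numbers"])
df["add_7"] = df["Numbers"] + 7

# df["Numbers"][0] looks up the row labeled 0 and returns one scalar,
# so str(...) produces a single string rather than one string per row.
scalar = str(df["Numbers"][0])

# np.where broadcasts that single string to every row where the condition is True.
attempt = np.where(df["add_7"] % 3 == 0, scalar, "not a multiple of three")
print(scalar)
print(attempt)
```

Because the second argument is a scalar, every matching row receives the same string, which matches the output shown in the question.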

You need to stringify the column, and then get the 0th element of each value:

df["Numbers"].astype(str).str[0]

Note that we use .str[0] to access 0th element of each value in the column; [0] would still access a single value, i.e., "10".
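Putting this together with the question's np.where call might look like the sketch below. The intermediate name `first_char` is an illustrative assumption; the rest uses the column names from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["Numbers"])
df["add_7"] = df["Numbers"] + 7

# Stringify the whole column, then take the first character of each value.
first_char = df["Numbers"].astype(str).str[0]

# np.where now chooses row by row between the per-row character and the fallback text.
df["first_digit"] = np.where(
    df["add_7"] % 3 == 0, first_char, "not a multiple of three"
)
print(df)
```

Since `first_char` has one entry per row, np.where pairs each condition with the value from the same row instead of broadcasting one scalar.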
