July 6, 2023, 22:05:18

What is the use of .str[0] in a pandas DataFrame?

Question


I have a DataFrame d with a SALES_AREA column, and I need to remove the MV- prefix from the beginning of each value in that column.

             SALES_AREA
          MV-Schmidt Carsten
          MV-Schmidt Carsten
          MV-Schmidt Carsten
          MV-Schmidt Carsten
          MV-Schmidt Carsten

So I used the split function:

d['SALES_AREA'].str.split('-')[1]

and got this result:

['MV', 'Schmidt Carsten']

When I tried this instead:

d['SALES_AREA'].str.split('-').str[1]

I got the correct result:

         Schmidt Carsten
         Schmidt Carsten
         Schmidt Carsten
         Schmidt Carsten

So it works, but I don't understand why I get the right result when I use .str[1] instead of [1].

Answer 1

Score: 1


The str attribute produces a proxy object of type StringMethods that wraps d['SALES_AREA']. The split method is then invoked not on the column itself, but on each string value in that column. The results are returned in a new Series with the same shape, where each element is a list.

d[...].str[0] works because ...[0] invokes the __getitem__ method of that proxy object, so d[...].str[0] is the same as d[...].str.__getitem__(0). The proxy applies the index to each element of the column, whether that element is a string or a list.
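
To make the accessor mechanics concrete, here is a minimal sketch. The sample names are hypothetical and only modelled on the question's column; the point is that the three spellings below should give the same element-wise result.

```python
import pandas as pd

# Hypothetical sample data, modelled on the question's column.
s = pd.Series(["MV-Schmidt Carsten", "MV-Meyer Anna"], name="SALES_AREA")

# .str returns an accessor object, not a Series.
accessor = s.str
print(type(accessor).__name__)  # StringMethods

# split() runs on every element and yields a Series of lists.
parts = s.str.split("-")

# .str[1] goes through the accessor's __getitem__, which indexes each element.
via_brackets = parts.str[1]
via_dunder = parts.str.__getitem__(1)
via_get = parts.str.get(1)

print(via_brackets.tolist())  # ['Schmidt Carsten', 'Meyer Anna']
```

In everyday code the bracket form is the usual choice; .str.get(1) is the named equivalent.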

Answer 2

Score: 0


When you use d['SALES_AREA'].str.split('-')[1], you are effectively doing this:

import pandas as pd

d = pd.DataFrame(['MV-Schmidt Carsten', 'MV-Schmidt Carsten', 'MV-Schmidt Carsten'], columns=['SALES_AREA'])
# Each value of the new column is the list produced by split('-').
d['Split'] = d['SALES_AREA'].str.split('-')
print(d)
           SALES_AREA                  Split
0  MV-Schmidt Carsten  [MV, Schmidt Carsten]
1  MV-Schmidt Carsten  [MV, Schmidt Carsten]
2  MV-Schmidt Carsten  [MV, Schmidt Carsten]

Your first line of code is the equivalent of d['Split'][1]. It returns the element of that Series with index label 1 (the second row here) after the split has been applied, i.e. the list of the two split values.

['MV', 'Schmidt Carsten']

By doing d['SALES_AREA'].str.split('-').str[1], you instead take the second element of each list created by split(), one per row. You can also write d['SALES_AREA'].str.split('-')[1][1] to return the second value of one specific row.
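
Because every row in the example above is identical, the two kinds of indexing are hard to tell apart. The sketch below uses hypothetical data with distinct names and string row labels (both are assumptions, not part of the question) so that the row lookup and the element-wise lookup visibly differ.

```python
import pandas as pd

# Hypothetical data with distinct rows and string labels, so that
# label lookup and element-wise indexing are easy to tell apart.
d = pd.DataFrame(
    {"SALES_AREA": ["MV-Schmidt Carsten", "MV-Meyer Anna", "MV-Braun Lena"]},
    index=["a", "b", "c"],
)
split = d["SALES_AREA"].str.split("-")

# Plain indexing on the Series looks up a row by label: one list comes back.
row_b = split["b"]
print(row_b)  # ['MV', 'Meyer Anna']

# .str[1] indexes inside every list: a whole Series comes back.
names = split.str[1]
print(names.tolist())  # ['Schmidt Carsten', 'Meyer Anna', 'Braun Lena']

# Chaining two plain indexers picks one row, then one item of its list.
print(split["b"][1])  # Meyer Anna
```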

Answer 3

Score: 0


In pandas, when you call str.split() on a column, it splits each value into a list of substrings at the specified delimiter and returns a new Series in which each value is a list.

To remove 'MV-' from the values in the SALES_AREA column:

d['SALES_AREA'] = d['SALES_AREA'].str.replace('MV-', '')

This replaces every occurrence of 'MV-' with an empty string, not only the one at the beginning.

Or you can use

d['SALES_AREA'].str.strip('MV-')

Note that strip treats its argument as a set of characters: it removes any leading and trailing M, V, and - characters rather than the exact prefix MV-, so it can also eat the first letters of a name that starts with M or V.
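
The differences between these approaches only show up on awkward values, so here is a small comparison. The sample values are hypothetical and chosen to hit the edge cases, and str.removeprefix is an alternative not mentioned above that assumes pandas 1.4 or newer.

```python
import pandas as pd

# Hypothetical values chosen to expose the edge cases.
s = pd.Series(["MV-Schmidt Carsten", "MV-Vogel MV-Nord", "MV-Maier Tom"])

# replace() removes every occurrence, including the one in the middle.
replaced = s.str.replace("MV-", "", regex=False)
print(replaced.tolist())  # ['Schmidt Carsten', 'Vogel Nord', 'Maier Tom']

# strip() treats "MV-" as the character set {M, V, -}.
stripped = s.str.strip("MV-")
print(stripped.tolist())  # ['Schmidt Carsten', 'ogel MV-Nord', 'aier Tom']

# removeprefix() drops the exact prefix once (assumes pandas >= 1.4).
prefixed = s.str.removeprefix("MV-")
print(prefixed.tolist())  # ['Schmidt Carsten', 'Vogel MV-Nord', 'Maier Tom']

# n=1 stops after the first "-"; .str[1] keeps everything after it.
split_once = s.str.split("-", n=1).str[1]
print(split_once.tolist())  # ['Schmidt Carsten', 'Vogel MV-Nord', 'Maier Tom']
```

If the prefix is always present exactly once at the start, all four agree on ordinary names; the last two are the ones that stay correct on the odd cases here.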
