What is the use of .str[0] in a pandas DataFrame?
Question
I have a dataframe d with a SALES_AREA column, and I need to remove the MV- prefix from the beginning of each value in that column.
SALES_AREA
MV-Schmidt Carsten
MV-Schmidt Carsten
MV-Schmidt Carsten
MV-Schmidt Carsten
MV-Schmidt Carsten
So I used the split function:
d['SALES_AREA'].str.split('-')[1]
which returns
['MV', 'Schmidt Carsten']
When I try this instead:
d['SALES_AREA'].str.split('-').str[1]
I get the correct result:
Schmidt Carsten
Schmidt Carsten
Schmidt Carsten
Schmidt Carsten
That works, but I don't understand why I get the right result when I use .str[1] instead of [1].
Answer 1
Score: 1
The str attribute produces a proxy of type StringMethods that wraps d['SALES_AREA']. The split method is then invoked not on the column itself, but on each str value in that column. The results are returned in a new Series object with the same shape.
df[...].str[0] works because ...[0] invokes the __getitem__ method of the object, here the StringMethods proxy, which indexes into each element of the column. So df[...].str[0] is the same as df[...].str.__getitem__(0).
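To make the proxy behaviour concrete, here is a minimal sketch. The second row ("MV-Meyer Anna") is made-up illustration data that is not in the question, and the variable names are only for this example.

```python
import pandas as pd

# First value is from the question; the second row is hypothetical sample data.
d = pd.DataFrame({"SALES_AREA": ["MV-Schmidt Carsten", "MV-Meyer Anna"]})

accessor = d["SALES_AREA"].str      # proxy object wrapping the column
parts = accessor.split("-")         # new Series holding one list per row

# Indexing the .str proxy goes through its __getitem__,
# which picks item 1 out of every row's list.
via_brackets = parts.str[1]
via_dunder = parts.str.__getitem__(1)
via_get = parts.str.get(1)          # explicit method form of the same lookup

print(type(accessor).__name__)
print(via_brackets.tolist())
```

All three spellings should give the same Series, so which one to use is mostly a matter of readability.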
Answer 2
Score: 0
When you use d['SALES_AREA'].str.split('-')[1] you are effectively doing this:
import pandas as pd
d = pd.DataFrame(['MV-Schmidt Carsten', 'MV-Schmidt Carsten', 'MV-Schmidt Carsten'], columns=['Sales_area'])
d['Split'] = d['Sales_area'].str.split('-')
print(d)
           Sales_area                  Split
0  MV-Schmidt Carsten  [MV, Schmidt Carsten]
1  MV-Schmidt Carsten  [MV, Schmidt Carsten]
2  MV-Schmidt Carsten  [MV, Schmidt Carsten]
Your first line of code is the equivalent of d['Split'][1]. It returns the element of the Series at index label 1 (the second row) after the split has been applied, i.e. the list of the two split values:
['MV', 'Schmidt Carsten']
By doing d['SALES_AREA'].str.split('-').str[1] you are instead returning the second element of each list created by split(). You can also index it like d['Sales_area'].str.split('-')[1][1] to return the second value for one specific row.
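The three kinds of indexing described above can be compared side by side. This is a small sketch using the question's sample value; the variable names are only illustrative.

```python
import pandas as pd

d = pd.DataFrame({"SALES_AREA": ["MV-Schmidt Carsten"] * 3})
split = d["SALES_AREA"].str.split("-")   # Series of lists

row_1 = split[1]               # lookup by index label 1: the whole list for that row
second_parts = split.str[1]    # element-wise: item 1 of every row's list
one_value = split[1][1]        # item 1 of the list stored in row 1

print(row_1)
print(second_parts.tolist())
print(one_value)
```

Only the .str[1] form returns a Series aligned with the original column, which is what you need when assigning the result back to the dataframe.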
Answer 3
Score: 0
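In pandas, when you use str.split() on a column, it splits each value into a list of substrings based on the specified delimiter and returns a new Series in which each value is a list.
To remove the 'MV-' from the beginning of the values in the SALES_AREA column:
d['SALES_AREA'] = d['SALES_AREA'].str.replace('MV-', '')
This replaces 'MV-' with an empty string. Note that it replaces every occurrence in each value, not only the one at the start.
Or you can use
d['SALES_AREA'].str.strip('MV-')
Be aware that strip() treats its argument as a set of characters: it removes any leading and trailing M, V and - characters, not the literal substring MV-, so it can strip more than intended.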
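The difference between these options only shows up on values that contain M, V or - somewhere other than the prefix. The sketch below uses a made-up second value ("MV-Vogel-MV-Max") to expose it, and adds an anchored regular expression as one possible way to remove only the leading prefix. That alternative is my suggestion, not something from the answer above.

```python
import pandas as pd

# First value is from the question; the second is a hypothetical edge case.
s = pd.Series(["MV-Schmidt Carsten", "MV-Vogel-MV-Max"])

# Literal replacement: removes every "MV-" in each value.
replaced = s.str.replace("MV-", "", regex=False)

# strip() takes a set of characters, so it eats any leading/trailing M, V or -.
stripped = s.str.strip("MV-")

# Anchored regex: removes "MV-" only when it is at the start of the value.
anchored = s.str.replace(r"^MV-", "", regex=True)

print(replaced.tolist())   # ['Schmidt Carsten', 'Vogel-Max']
print(stripped.tolist())   # ['Schmidt Carsten', 'ogel-MV-Max']
print(anchored.tolist())   # ['Schmidt Carsten', 'Vogel-MV-Max']
```

On recent pandas versions (1.4 and later, as far as I know) s.str.removeprefix('MV-') should give the same result as the anchored regex and reads more clearly.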