Pandas DF: Create new column by removing the last word from an existing column

Question

This should be easy, but I'm stumped.
I have a df that includes a column of PLACENAMES. Some of these have multiple word names:

  1. Able County
  2. Baker County
  3. Charlie County
  4. St. Louis County

All I want to do is to create a new column in my df that has just the name, without the "county" word:

  1. Able
  2. Baker
  3. Charlie
  4. St. Louis

I've tried a variety of things:

    1. places['name_split'] = places['PLACENAME'].str.split()
    2. places['name_split'] = places['PLACENAME'].str.split()[:-1]
    3. places['name_split'] = places['PLACENAME'].str.rsplit(' ',1)[0]
    4. places = places.assign(name_split = lambda x: ' '.join(x['PLACENAME'].str.split()[:-1]))

Results of each attempt:

  1. Works - splits the names into a list ['St.','Louis','County']
  2. The list slice is ignored, resulting in the same list ['St.','Louis','County'] rather than ['St.','Louis']
  3. Raises a ValueError: Length of values (2) does not match length of index (41414)
  4. Raises a TypeError: sequence item 0: expected str instance, list found
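A likely reason attempts 2 to 4 misbehave is that `[:-1]` and `[0]` index the Series as a whole rather than the list inside each row. The sketch below shows one way to slice per row with the `.str` accessor; the four-row `places` frame is a stand-in built from the sample names above, not the real 41414-row data.

```python
import pandas as pd

# Hypothetical stand-in for the real `places` DataFrame.
places = pd.DataFrame({
    "PLACENAME": ["Able County", "Baker County", "Charlie County", "St. Louis County"]
})

# .str.split() yields one list per row; .str[:-1] then slices each row's list.
# A plain [:-1] would slice the Series itself (dropping the last row) instead.
words = places["PLACENAME"].str.split().str[:-1]

# .str.join(" ") glues each row's remaining words back into a single string.
places["name_split"] = words.str.join(" ")

print(places)
```

Every step stays vectorized, so no lambda or helper function is needed for this variant.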

I've also defined a function and called it with .assign():

    def processField(namelist):
        words = namelist[:-1]
        name = ' '.join(words)
        return name

    places = places.assign(name_split = lambda x: processField(x['PLACENAME']))

This also raises a TypeError: sequence item 0: expected str instance, list found
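The lambda passed to `.assign()` receives the whole DataFrame, so the helper is handed an entire column rather than one name at a time. A minimal sketch of a per-row variant follows, assuming every value is a string; `drop_last_word` is a hypothetical name and the small `places` frame is a stand-in for the real data.

```python
import pandas as pd

# Hypothetical stand-in for the real `places` DataFrame.
places = pd.DataFrame({
    "PLACENAME": ["Able County", "Baker County", "Charlie County", "St. Louis County"]
})


def drop_last_word(name: str) -> str:
    """Return `name` without its final whitespace-separated word."""
    words = name.split()[:-1]  # here [:-1] slices one row's list of words
    return " ".join(words)


# Series.map calls the helper once per value, so it always sees a single string.
places = places.assign(name_split=lambda df: df["PLACENAME"].map(drop_last_word))

print(places)
```

Missing values (NaN) would make `name.split()` fail, so they would need to be filled or skipped first.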

This seems to be a very simple goal and I've probably overthought it, but I'm just stumped. Suggestions about what I should be doing would be deeply appreciated.

Answer 1

Score: 1

Use the Series.str.rpartition function:

    places['name_split'] = places['PLACENAME'].str.rpartition()[0]
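To make the `[0]` in that one-liner less opaque, here is a small sketch of what `str.rpartition()` returns with its default single-space separator. The `places` frame is again a stand-in built from the question's sample names, and `parts` and `last_words` are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for the real `places` DataFrame.
places = pd.DataFrame({
    "PLACENAME": ["Able County", "Baker County", "Charlie County", "St. Louis County"]
})

# rpartition splits at the LAST space and returns a DataFrame with three
# columns: 0 = text before the separator, 1 = the separator, 2 = text after it.
parts = places["PLACENAME"].str.rpartition()

places["name_split"] = parts[0]  # everything before the last space
last_words = parts[2]            # the word that was removed

print(parts)
print(places)
```

A value containing no space at all ends up entirely in column 2, leaving column 0 empty, so single-word place names would need separate handling.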

Answer 2

Score: 1

Use str.replace to remove the last word and the preceding spaces:

    places['new'] = places['PLACENAME'].str.replace(r'\s*\w+$', '', regex=True)
    # or
    places['new'] = places['PLACENAME'].str.replace(r'\s*\S+$', '', regex=True)
    # or, only match 'County'
    places['new'] = places['PLACENAME'].str.replace(r'\s*County$', '', regex=True)

Output:

              PLACENAME        new
    0       Able County       Able
    1      Baker County      Baker
    2    Charlie County    Charlie
    3  St. Louis County  St. Louis
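The three patterns give the same result on the sample data but diverge on other inputs. The sketch below compares them on a few extra names; "St. Louis Co." and the bare "Able" are made-up test values, not taken from the question.

```python
import pandas as pd

# Sample names plus two hypothetical edge cases (abbreviation, single word).
names = pd.Series(["Able County", "St. Louis County", "St. Louis Co.", "Able"])

# \w+ only matches letters, digits and underscores, so a trailing "." blocks it.
any_word = names.str.replace(r"\s*\w+$", "", regex=True)

# \S+ matches any run of non-whitespace, so "Co." is removed as well.
any_token = names.str.replace(r"\s*\S+$", "", regex=True)

# Matching the literal word leaves every other ending untouched.
county_only = names.str.replace(r"\s*County$", "", regex=True)

print(pd.DataFrame({
    "name": names,
    "any_word": any_word,
    "any_token": any_token,
    "county_only": county_only,
}))
```

If the column can contain single-word names or suffixes other than "County", the literal pattern is the most conservative choice, since the two generic patterns would blank out a one-word name.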

huangapple
  • Posted on 2023-02-14 02:06:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75439661.html