What am I doing wrong in this pandas DataFrame drop operation to get a KeyError?

Question

When running this line of code:

df = df.drop(df[(len(df.Sentences) > 40)].index)

It returns:

KeyError: True

I expected it to remove all rows whose Sentences value is longer than 40 characters.

Hopefully there's a better way to do this.

Answer 1

Score: 0

Let's break down your statement:

  1. len(df.Sentences) returns the number of elements (rows) in the Sentences column, not the length of each string. You can verify this by evaluating that expression on its own.

  2. Most likely you have more than 40 rows, so len(df.Sentences) > 40 evaluates to the single boolean True. You then use that value as the key in df[...], so pandas looks for a column labelled True, finds none, and raises KeyError: True.
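The two points above can be reproduced in isolation. This is a minimal sketch, assuming a made-up DataFrame of 41 identical short sentences; only the column name Sentences comes from the question, and the variable names are purely illustrative.

```python
import pandas as pd

# Hypothetical data: 41 short sentences, so the column has more than 40 rows.
df = pd.DataFrame({"Sentences": ["short sentence"] * 41})

row_count = len(df.Sentences)   # number of rows, not a string length
condition = row_count > 40      # a single Python bool, not a per-row mask

try:
    df[condition]               # same as df[True]: treated as a column lookup
    error_name = None
except KeyError as exc:
    error_name = type(exc).__name__
    print(f"{error_name}: {exc}")
```

Note that every sentence here is far shorter than 40 characters, yet the lookup still fails, which shows the error depends on the row count rather than on the string lengths.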

So the problem is getting the lengths of the individual elements. df.Sentences.str.len() returns the length of each string in the column, which should do the trick.

Here is a complete sample:

>>> import pandas as pd
>>> df = pd.DataFrame([['aaa', 1, 2, 3], ['bbbbb', 4, 5, 6], ['ccc', 7, 8, 9]], columns=['A', 'B', 'C', 'D'])
>>> df
       A  B  C  D
0    aaa  1  2  3
1  bbbbb  4  5  6
2    ccc  7  8  9
>>> df.drop(df[df.A.str.len() > 4].index)
     A  B  C  D
0  aaa  1  2  3
2  ccc  7  8  9
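Since the question asks for a better way, one common alternative is to keep the rows you want with a boolean mask instead of dropping the ones you don't. The following is only a sketch under assumptions: the sample data is invented, and the names lengths and filtered are illustrative, while the Sentences column and the 40-character limit come from the question.

```python
import pandas as pd

# Hypothetical data; only the column name Sentences comes from the question.
df = pd.DataFrame({
    "Sentences": ["A short one.", "x" * 41, "y" * 40],
})

# Per-row character counts, as a Series aligned with df's index
lengths = df["Sentences"].str.len()

# Keep only rows whose sentence is 40 characters or fewer
filtered = df[lengths <= 40]
print(filtered.index.tolist())
```

The two approaches can differ in edge cases: with missing values, str.len() yields NaN, so the mask above would discard those rows while the drop version would keep them; and drop removes every row sharing a dropped label if the index contains duplicates.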

huangapple
  • Posted on May 24, 2023 at 21:47:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76324238.html