What am I doing wrong in this pandas dataframe drop operation to result in a KeyError?
Question
When running this line of code:
df = df.drop(df[(len(df.Sentences) > 40)].index)
It returns:
KeyError: True
I was expecting it to remove all rows where the Sentences value is longer than 40 characters.
Hopefully there's a better way to do this.
Answer 1
Score: 0
Let's break down your statement:
- len(df.Sentences) returns the number of elements in the Sentences column, not the length of each string. You can verify that by just looking at the result of this statement.
- Most likely you have more than 40 elements, so len(df.Sentences) > 40 returns a single True. You then use that as the key in df[...], so pandas looks for a column labelled True and raises KeyError: True.
So the problem is getting the lengths of the individual elements. Let's try df.Sentences.str.len(), which should do the trick.
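To check the diagnosis above, here is a minimal sketch. The three-row frame and the threshold of 2 are made up for illustration; only the column name Sentences comes from the question.

```python
import pandas as pd

# Hypothetical data; the threshold is lowered to 2 so three rows are enough.
df = pd.DataFrame({"Sentences": ["short", "a much longer sentence", "mid length"]})

# len() on a Series counts rows, so the comparison yields one plain bool.
condition = len(df.Sentences) > 2
print(type(condition), condition)

# Indexing with a scalar bool makes pandas look for a column labelled True.
try:
    df[condition]
except KeyError as err:
    print("KeyError:", err)

# .str.len() works element-wise and yields one length per row.
lengths = df.Sentences.str.len()
print(lengths.tolist())
```

The comparison has to produce one boolean per row before it can be used as a mask, and the element-wise length gives you that.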
Here is a complete sample:
>>> import pandas as pd
>>> df = pd.DataFrame([['aaa', 1, 2, 3], ['bbbbb', 4, 5, 6], ['ccc', 7, 8, 9]], columns=['A', 'B', 'C', 'D'])
>>> df
       A  B  C  D
0    aaa  1  2  3
1  bbbbb  4  5  6
2    ccc  7  8  9
>>> df.drop(df[df.A.str.len() > 4].index)
     A  B  C  D
0  aaa  1  2  3
2  ccc  7  8  9
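Since the question also asks for a better way, one possible alternative is to skip drop entirely and keep the rows you want with a boolean mask. This is a sketch on made-up data, not the only option; the names keep and filtered are illustrative.

```python
import pandas as pd

# Hypothetical sample: 9, 41 and 40 characters long.
df = pd.DataFrame({"Sentences": ["short one", "x" * 41, "y" * 40]})

# One boolean per row: True where the text has at most 40 characters.
keep = df.Sentences.str.len() <= 40

# Selecting with the mask keeps only the True rows; the original index labels are preserved.
filtered = df[keep]
print(filtered.index.tolist())
```

If the column can contain missing values, .str.len() returns a missing value for those rows, so decide up front whether they should be kept or dropped.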