May 24, 2023 21:47:10

What am I doing wrong in this pandas dataframe drop operation to result in a KeyError?

Question

When running this line of code:

df = df.drop(df[(len(df.Sentences) > 40)].index)

It returns:

KeyError: True

I was expecting it to remove all rows where the sentence is longer than 40 characters.

Hopefully there's a better way to do this.

Answer 1

Score: 0

Let's break down your statement:

len(df.Sentences) returns the number of elements in the Sentences column, not the length of each sentence. You can verify that by looking at the result of this expression on its own.
Most likely you have more than 40 rows, so len(df.Sentences) > 40 evaluates to True. You then use that single True as the key in df[...], so pandas looks for a column labelled True, finds none, and raises KeyError: True.
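To see the failure in isolation, here is a minimal sketch. The 50-row DataFrame is made-up data standing in for the asker's; only the column name Sentences comes from the question.

```python
import pandas as pd

# Hypothetical data: 50 short sentences stand in for the asker's column.
df = pd.DataFrame({"Sentences": ["short sentence"] * 50})

# len() on a Series counts its rows, so the comparison yields a single bool.
n_rows = len(df.Sentences)
condition = n_rows > 40
print(n_rows, condition)

# Indexing with that single bool makes pandas look up a column labelled True.
try:
    df[condition]
    error = None
except KeyError as exc:
    error = exc
print(repr(error))
```

Note that none of the sentences here is longer than 40 characters. The error depends only on the row count, which is why it has nothing to do with the content of the column.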

So what you actually need is the length of each individual element. Let's try df.Sentences.str.len(), which returns one length per row and should do the trick.

Here is a complete sample:

>>> import pandas as pd
>>> df = pd.DataFrame([['aaa', 1, 2, 3], ['bbbbb', 4, 5, 6], ['ccc', 7, 8, 9]], columns=['A', 'B', 'C', 'D'])
>>> df
       A  B  C  D
0    aaa  1  2  3
1  bbbbb  4  5  6
2    ccc  7  8  9
>>> df.drop(df[df.A.str.len() > 4].index)
     A  B  C  D
0  aaa  1  2  3
2  ccc  7  8  9
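
As for the "better way" asked about in the question, one common alternative is to skip drop entirely and keep the rows that pass a boolean mask. The following is only a sketch applied to a Sentences column with the 40-character limit from the question; the three sentences are made-up sample data.

```python
import pandas as pd

# Hypothetical data; the column name Sentences comes from the question.
df = pd.DataFrame({
    "Sentences": [
        "A short sentence.",
        "This sentence is deliberately written to run well past forty characters.",
        "Another short one.",
    ]
})

# One length per row, unlike len(df.Sentences), which is a single number.
lengths = df["Sentences"].str.len()

# Keep rows of 40 characters or fewer instead of dropping the longer ones.
filtered = df[lengths <= 40]
print(filtered)
```

On this sample the result matches df.drop(df[lengths > 40].index). If the column may contain missing values, it is worth checking how those rows are treated, since their length is not a number and the two forms can differ for them.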
