August 10, 2023, 17:29:39


Why is neq-comparison inconsistent in pandas?

Question

Comparing two pandas Series with != does not give the same result as comparing the corresponding elements one at a time. Here is an example:

import numpy as np
import pandas as pd
floats_with_None = np.array([None, 2.3, 1.4])
s_None = pd.Series(floats_with_None)
(s_None != s_None)[0]  # gives True
s_None[0] != s_None[0]  # gives False

Can someone explain the cause of this?

Answer 1

Score: 3

Pandas treats None as a "missing data" marker and special-cases it in comparisons. As with floating-point NaN or pandas.NaT, pandas treats None as unequal to everything, including itself.
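To make the Series-level behaviour concrete, here is a minimal sketch reusing the data from the question (the variable names `s`, `ne`, `eq` and `missing` are only illustrative). It assumes an object-dtype Series, which is what the question's array produces:

```python
import numpy as np
import pandas as pd

# Same data as in the question; dtype=object is spelled out for clarity.
s = pd.Series(np.array([None, 2.3, 1.4], dtype=object))

ne = (s != s).tolist()        # pandas' element-wise "not equal"
eq = (s == s).tolist()        # pandas' element-wise "equal"
missing = s.isna().tolist()   # which entries pandas considers missing

print(ne)       # [True, False, False]
print(eq)       # [False, True, True]
print(missing)  # [True, False, False]
```

The position where `isna()` reports a missing value is exactly the position where `!=` is True and `==` is False.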

None is still just an ordinary Python object, though, outside of pandas's control. Pandas can only special-case None in comparisons that pandas itself implements. When you do

s_None[0] != s_None[0]

you get the plain None object back from the Series, so you're using None's implementation of !=, which is just the default implementation inherited from object. That default considers every object equal to itself.
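The scalar side can be checked in isolation. The sketch below uses `.iloc[0]` rather than `[0]` only to make the positional access explicit, and the variable names are illustrative. It shows that the element pulled out of the Series is the ordinary None singleton and that its comparison never goes through pandas:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([None, 2.3, 1.4], dtype=object))

first = s.iloc[0]              # a plain Python object, no longer a pandas container
is_plain_none = first is None  # the very same None singleton
scalar_ne = first != first     # Python's default comparison, not pandas'
nan_ne = np.nan != np.nan      # NaN, by contrast, is unequal to itself even as a scalar

print(is_plain_none)  # True
print(scalar_ne)      # False
print(nan_ne)         # True
```

If the goal is to test a single element for missingness, `pd.isna(first)` returns True for None as well as for NaN, so it avoids relying on `!=` at all.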

This discrepancy is mentioned in the docs:

> One has to be mindful that in Python (and NumPy), the nan's don’t compare equal, but None's do. Note that pandas/NumPy uses the fact that np.nan != np.nan, and treats None like np.nan.

although the "Note that pandas/NumPy..." part should probably just say "Note that pandas...", since NumPy's own comparisons do not special-case None.
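A quick way to see why the quoted sentence applies to pandas rather than NumPy is to run the same comparison on the raw object array and on a Series built from it. This is only a sketch with illustrative variable names:

```python
import numpy as np
import pandas as pd

arr = np.array([None, 2.3, 1.4], dtype=object)

# NumPy compares object arrays element by element with Python semantics,
# so None != None is False here.
numpy_ne = (arr != arr).tolist()

# pandas applies its missing-data rule on top of the same values.
pandas_ne = (pd.Series(arr) != pd.Series(arr)).tolist()

# A real float NaN is unequal to itself in NumPy as well.
float_nan_ne = (np.array([np.nan]) != np.array([np.nan])).tolist()

print(numpy_ne)      # [False, False, False]
print(pandas_ne)     # [True, False, False]
print(float_nan_ne)  # [True]
```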
