Why is neq-comparison inconsistent in pandas?
Question
A comparison of two pandas arrays does not give the same result as performing the comparison element-wise. Here is an example:
    import numpy as np
    import pandas as pd
    floats_with_None = np.array([None, 2.3, 1.4])
    s_None = pd.Series(floats_with_None)
    (s_None != s_None)[0]  # gives True
    s_None[0] != s_None[0]  # gives False
Can someone explain the cause of this?
Answer 1
Score: 3
Pandas treats `None` as a "missing data" marker and special-cases it in comparisons. As with floating-point `NaN` or `pandas.NaT`, pandas treats `None` as unequal to everything, including itself.
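
A minimal sketch of that behavior, assuming a reasonably recent pandas and NumPy. The variable names are only illustrative, and the `None` values are kept in an object-dtype Series as in the question:

```python
import numpy as np
import pandas as pd

# Object-dtype Series holding None, as in the question.
s_none = pd.Series(np.array([None, 2.3, 1.4], dtype=object))
# Float Series holding NaN, and datetime Series holding NaT.
s_nan = pd.Series([np.nan, 2.3, 1.4])
s_nat = pd.Series([pd.NaT, pd.Timestamp("2020-01-01")])

# Element-wise != implemented by pandas: missing values never compare equal.
none_ne = (s_none != s_none).tolist()
nan_ne = (s_nan != s_nan).tolist()
nat_ne = (s_nat != s_nat).tolist()

print(none_ne)  # [True, False, False]
print(nan_ne)   # [True, False, False]
print(nat_ne)   # [True, False]

# isna() flags the same positions that compare unequal to themselves.
print(s_none.isna().tolist())  # [True, False, False]
```

If the goal is to find missing entries, `isna()` states that intent more directly than relying on `s != s`.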
`None` is still just an ordinary Python object, though, outside of pandas's control. Pandas can only special-case `None` in comparisons that pandas itself implements. When you do
    s_None[0] != s_None[0]
each `s_None[0]` first returns the plain `None` object, so you're using `None`'s implementation of `!=`, which is just the default implementation inherited from `object`. This default implementation of `!=` considers all objects equal to themselves.
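
The scalar case can be reproduced without pandas at all. The following is a small sketch in plain Python (the names `none_value`, `sentinel` and `nan` are made up for illustration) contrasting the default comparison with the one `float` defines for NaN:

```python
# Plain Python objects, no pandas involved.
none_value = None
sentinel = object()   # arbitrary object using the default comparison
nan = float("nan")    # float defines its own comparison, where NaN != NaN

none_ne = none_value != none_value
sentinel_ne = sentinel != sentinel
nan_ne = nan != nan

print(none_ne)      # False
print(sentinel_ne)  # False
print(nan_ne)       # True
```

This is why indexing out the element first and then comparing gives `False`: by that point the comparison is between two references to the same `None` object.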
This discrepancy is mentioned in the docs:
> One has to be mindful that in Python (and NumPy), the nan's don’t compare equal, but None's do. Note that pandas/NumPy uses the fact that np.nan != np.nan, and treats None like np.nan.
although the "Note that pandas/NumPy..." part should probably just say "Note that pandas...", since NumPy by itself does not treat `None` as missing.
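
To illustrate that last remark, here is a hedged sketch of the same comparison done with NumPy alone. It assumes an object-dtype array for the `None` case and a float array for the NaN case; the array names are illustrative:

```python
import numpy as np

# Object array: NumPy compares the elements with Python's own !=.
obj_arr = np.array([None, 2.3, 1.4], dtype=object)
# Float array: NaN follows IEEE 754 and is unequal to itself.
float_arr = np.array([np.nan, 2.3, 1.4])

obj_ne = (obj_arr != obj_arr).tolist()
float_ne = (float_arr != float_arr).tolist()

print(obj_ne)    # [False, False, False]
print(float_ne)  # [True, False, False]
```

So with NumPy on its own, `None` compares equal to itself, and it is the pandas layer that adds the missing-value treatment.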