Why does numpy.vectorize give a warning about an invalid value when using uncertainties?
Question
With Python 3.10, numpy 1.23.5, and uncertainties 3.1.7 (on Linux; specifically using packages from conda-forge on Fedora 37), the following code:

```python
import numpy as np
from uncertainties.core import Variable

x = np.array([1.0, 2.0])
y = np.array([np.nan, np.nan], dtype=float)
func = np.vectorize(lambda x, y: Variable(x, y), otypes=[object])
func(x, y)
```

produces:

```
/lib/python3.10/site-packages/numpy/lib/function_base.py:2411: RuntimeWarning: invalid value encountered in <lambda> (vectorized)
  outputs = ufunc(*inputs)
```

Using `warnings.simplefilter("error")`, I get the following traceback:

```
Traceback (most recent call last):
  File "/test.py", line 12, in <module>
    z = func(x, y)
  File "/lib/python3.10/site-packages/numpy/lib/function_base.py", line 2328, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/lib/python3.10/site-packages/numpy/lib/function_base.py", line 2411, in _vectorize_call
    outputs = ufunc(*inputs)
RuntimeWarning: invalid value encountered in <lambda> (vectorized)
```

If I change the input to use all finite floats like this:

```python
import numpy as np
from uncertainties.core import Variable

x = np.array([1.0, 2.0])
y = np.array([1.0, 1.0], dtype=float)
func = np.vectorize(lambda x, y: Variable(x, y), otypes=[object])
func(x, y)
```

or change the code to use a custom class instead of `Variable` like this:

```python
import numpy as np


class Variable:
    def __init__(self, x, y):
        self.x = x
        self.y = y


x = np.array([1.0, 2.0])
y = np.array([np.nan, np.nan], dtype=float)
func = np.vectorize(lambda x, y: Variable(x, y), otypes=[object])
func(x, y)
```

then no warning is issued.

What is causing this warning from numpy? It comes from compiled code that pdb cannot step into. I don't see anything in the uncertainties code for `Variable` that should raise an error or otherwise behave differently from a standard Python class like the one in my third example.

I note that this code:

```python
import numpy as np
from uncertainties.core import Variable

x = np.array([1.0, 2.0])
y = np.array([np.nan, np.nan], dtype=float)
[Variable(ix, iy) for ix, iy in zip(x, y)]
```

produces no error, so there is not actually a problem with passing these arguments to `Variable`. It seems that numpy is examining something about the types or dimensions of the arguments to the vectorized function and detecting something that does not match what it expects.

Here I tried to provide a simple invocation of `numpy.vectorize`. The actual case where I encountered this was with `uncertainties.unumpy.uarray`, which uses `numpy.vectorize` similarly to my example.
Answer 1
Score: 1
Look into the source of `uncertainties`. You will find that it checks whether the supplied uncertainty is negative with the statement `std_dev < 0`.

Your supplied uncertainty (`std_dev` in the code) is NaN. This is normally fine: `np.nan < 0` should always be `False` according to IEEE 754, and is not supposed to be an error.
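As a quick sanity check of that claim, here is a minimal sketch in plain Python, with no NumPy involved. It assumes only that a Python `float` NaN compares the same way `np.nan` does, which holds because `np.nan` is itself a Python float.

```python
# A plain Python NaN; np.nan is the same kind of object.
nan = float("nan")

# Ordered and equality comparisons involving NaN, evaluated outside NumPy.
comparisons = [
    nan < 0,     # ordered comparison, as in the std_dev check
    nan >= 0,    # the opposite ordered comparison
    nan == nan,  # NaN is not equal to itself
    nan != nan,  # the only comparison here that is true
]
print(comparisons)
```

Every comparison except `!=` comes out false, and none of them raises, so the negativity check by itself does not reject a NaN uncertainty.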
But doing that comparison inside a function wrapped by `np.frompyfunc` raises a warning, and `np.vectorize` calls `frompyfunc` under the hood. Why does it fail? I have no idea. The simplest reproduction I could make:

```python
import numpy as np

np.seterr(all='raise')
func = np.frompyfunc(lambda x: x < 0, nin=1, nout=1)
func(1)       # is ok
func(np.nan)  # is not ok
```
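Since `np.seterr` is what turns the warning into an exception in the reproduction above, the warning appears to be governed by NumPy's floating-point error settings. Under that assumption, a scoped `np.errstate` should silence it without changing the global configuration. This is only a sketch: the name `is_negative` is made up for illustration, and whether the warning fires at all may depend on the NumPy version and platform.

```python
import numpy as np

# Same comparison as in the reproduction, wrapped as an object-dtype ufunc.
is_negative = np.frompyfunc(lambda v: v < 0, 1, 1)

# errstate limits the "ignore invalid" policy to this block only;
# the global settings from np.seterr are restored on exit.
with np.errstate(invalid="ignore"):
    result = is_negative(np.array([1.0, -1.0, np.nan]))

print(result)
```

The comparison results themselves are unaffected; only the reporting of the floating-point "invalid" flag changes inside the `with` block.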
I suggest you skip vectorization and use a list comprehension instead. It should give reasonable speed, I suppose, and it works around this bug.
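A rough sketch of that workaround follows. `Measurement` and `build_object_array` are hypothetical names: `Measurement` is a stand-in that only mimics the `std_dev < 0` check and is not the real `Variable` class, so swap in `Variable` in actual code. The sketch assumes one-dimensional inputs of equal length.

```python
import math

import numpy as np


class Measurement:
    """Hypothetical stand-in for uncertainties' Variable (not the real class)."""

    def __init__(self, value, std_dev):
        # Mimics the negativity check described above; NaN < 0 is False.
        if std_dev < 0:
            raise ValueError("std_dev must not be negative")
        self.value = value
        self.std_dev = std_dev


def build_object_array(values, std_devs):
    """Pair up two 1-D float arrays element by element without np.vectorize."""
    values = np.asarray(values, dtype=float)
    std_devs = np.asarray(std_devs, dtype=float)
    out = np.empty(values.shape, dtype=object)
    # .tolist() yields plain Python floats, so each comparison runs in
    # ordinary Python code rather than inside a ufunc loop.
    out[:] = [Measurement(v, s) for v, s in zip(values.tolist(), std_devs.tolist())]
    return out


x = np.array([1.0, 2.0])
y = np.array([np.nan, np.nan])
result = build_object_array(x, y)
```

The result is an object array of the same shape as the inputs, which is what the `np.vectorize(..., otypes=[object])` call in the question was producing.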