August 4, 2023 23:01:09

Why do plus and minus have different promotion rules although the results are the same?

Question

I wonder why a - b and a + (-b) give the same result but with different types in NumPy:

import numpy as np
minuend = np.array(1, dtype=np.int64)
subtrahend = 1 << 63
result_minus = minuend - subtrahend
result_plus = minuend + (-subtrahend)
print(result_minus == result_plus)  # True
print(type(result_minus))  # <class 'numpy.float64'>
print(type(result_plus))  # <class 'numpy.int64'>

Why is that, and where can I read about it?

Answer 1

Score: 12

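The point is that 1 << 63 cannot be represented as an int64, but -(1 << 63) can. This is a pathological case that comes from how signed integers are represented in binary (two's complement): the int64 range runs from -(1 << 63) to (1 << 63) - 1, so it holds one more negative value than positive ones.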
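The asymmetry of the int64 range can be checked with plain Python integers, without NumPy. The following is only a minimal sketch: the helper name fits_in_int64 is hypothetical, and it relies on int.to_bytes raising OverflowError when a value does not fit in 8 signed bytes.

```python
INT64_MIN = -(1 << 63)      # -9223372036854775808
INT64_MAX = (1 << 63) - 1   #  9223372036854775807


def fits_in_int64(value: int) -> bool:
    """Return True if value can be encoded as a signed 64-bit integer.

    Hypothetical helper for illustration; it is not part of NumPy.
    """
    try:
        # 8 bytes, two's complement; raises OverflowError if out of range
        value.to_bytes(8, byteorder="little", signed=True)
    except OverflowError:
        return False
    return True


subtrahend = 1 << 63
print(fits_in_int64(subtrahend))   # False: one past INT64_MAX
print(fits_in_int64(-subtrahend))  # True: exactly INT64_MIN
```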

On the one hand, NumPy converts subtrahend (1 << 63) to a uint64 value because int64 is too small to hold it.

On the other hand, -subtrahend is computed by CPython, so it yields a pure-Python integer containing -(1 << 63). NumPy then converts that value to an int64 (because this type is large enough to hold it).

Internally, NumPy only operates on arrays of the same type. A binary operation on operands of different types triggers type promotion (a concept inherited mainly from C, the language NumPy is written in): NumPy converts the inputs to a common type so that the operation makes sense and is safe (no overflow, at the expense of a possible loss of precision in pathological cases).
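The precision loss mentioned above can be illustrated with plain Python, whose float is an IEEE 754 double, the same format as float64. This is a sketch rather than a description of NumPy's internals: it shows that the exact difference 1 - (1 << 63) has no float64 representation and rounds to -(1 << 63), which is probably why the comparison in the question prints True even though the exact integer results differ by one.

```python
exact = 1 - (1 << 63)    # exact arbitrary-precision Python int
as_float = float(exact)  # nearest IEEE 754 double (same format as float64)

print(exact)                   # -9223372036854775807
print(int(as_float))           # -9223372036854775808
print(int(as_float) == exact)  # False: the conversion lost the last unit

# float64 has a 53-bit significand, so integers above 2**53 are not all exact
print(float(2**53) == float(2**53 + 1))  # True
```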

In this case, when the subtraction is performed, NumPy stores the result as float64 because a binary operation on a uint64 operand and an int64 operand is likely to overflow in any 64-bit integer type (large unsigned values do not fit in a signed type, and negative signed values cannot be represented in an unsigned one). When the addition is performed, both operands are of the same type (int64), so no promotion is needed and the result is int64.

Here is a way to see that:

>>> np.int64(1 << 63)
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
Cell In [7], line 1
----> 1 np.int64(1 << 63)
OverflowError: Python int too large to convert to C long
>>> np.int64(-(1 << 63))
-9223372036854775808
>>> np.array(subtrahend).dtype
dtype('uint64')
# NumPy first converts both operands to float64 to avoid overflows caused by mixing uint64 and int64 in this specific case
>>> (np.array(subtrahend) + minuend).dtype
dtype('float64')
>>> np.array(-subtrahend).dtype
dtype('int64')
