Why do plus and minus have different promotion rules although the results are the same?

Question

I wonder why a - b and a + (-b) give the same result but in different types in numpy:

  import numpy as np
  minuend = np.array(1, dtype=np.int64)
  subtrahend = 1 << 63
  result_minus = minuend - subtrahend
  result_plus = minuend + (-subtrahend)
  print(result_minus == result_plus) # True
  print(type(result_minus)) # float64
  print(type(result_plus)) # int64

Why is that, and where can I read about it?

Answer 1

Score: 12

The point is that 1 << 63 cannot be represented as an int64, but -(1 << 63) can. This is a corner case that follows from how signed integers are represented in binary (two's complement).
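The asymmetry of the int64 range can be checked directly. This is a minimal sketch using np.iinfo; the variable names are only illustrative.

```python
import numpy as np

info = np.iinfo(np.int64)

# Two's complement gives one more negative value than positive values.
print(info.min == -(1 << 63))      # True
print(info.max == (1 << 63) - 1)   # True

# 1 << 63 is one past the maximum, while -(1 << 63) is exactly the minimum.
positive_fits = info.min <= (1 << 63) <= info.max
negative_fits = info.min <= -(1 << 63) <= info.max
print(positive_fits, negative_fits)  # False True
```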

On the one hand, NumPy converts subtrahend (1 << 63) to a uint64 value because int64 is too small to hold it.

On the other hand, -subtrahend is computed by CPython, so the result is a pure-Python integer holding -(1 << 63). NumPy then converts this value to an int64 (because that type is large enough to hold it).
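The two conversions described above can be mimicked with a small range check. This is only a sketch: pick_integer_dtype is a hypothetical helper written for illustration, not NumPy's actual conversion code, and it assumes only the two 64-bit integer types are candidates.

```python
import numpy as np

def pick_integer_dtype(value):
    """Hypothetical helper: return the first 64-bit integer dtype that can hold value."""
    for candidate in (np.int64, np.uint64):
        info = np.iinfo(candidate)
        if info.min <= value <= info.max:
            return np.dtype(candidate)
    raise OverflowError(f"{value} does not fit in any 64-bit integer dtype")

subtrahend = 1 << 63
negated = -subtrahend  # computed by Python itself, so still a plain int

print(type(negated))                   # <class 'int'>
print(pick_integer_dtype(subtrahend))  # uint64
print(pick_integer_dtype(negated))     # int64
```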

NumPy only operates on arrays of the same type internally. Binary operations involving different types trigger type promotion (inherited mainly from the C language, since NumPy is written in C): NumPy converts the input arrays to a common type so that the operation makes sense and is safe (no overflow, at the expense of a possible loss of precision in pathological cases).
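The loss of precision is easy to see with the numbers from the question. The sketch below computes the exact difference as a Python integer first, then stores it in each type. float64 has a 53-bit significand, so it cannot hold every 64-bit integer exactly, which suggests that the two results in the question compare equal only after rounding.

```python
import numpy as np

exact = 1 - (1 << 63)           # exact Python int: -9223372036854775807
as_int64 = np.int64(exact)      # fits in int64, stored exactly
as_float64 = np.float64(exact)  # rounded to the nearest representable float64

print(int(as_int64) == exact)    # True
print(int(as_float64) == exact)  # False
print(int(as_float64))           # -9223372036854775808
```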

In this case, when the subtraction is performed, NumPy stores the result in a float64 array because a binary operation mixing uint64 and int64 operands is likely to overflow (large unsigned integers do not fit in a signed type, and negative signed integers cannot be represented in an unsigned one). When the addition is performed, both operands have the same type (int64), so no promotion is needed and the result is of type int64.

Here is a way to see that:

  # minuend and subtrahend are defined as in the question
  >>> np.int64(1 << 63)
  ---------------------------------------------------------------------------
  OverflowError Traceback (most recent call last)
  Cell In [7], line 1
  ----> 1 np.int64(1 << 63)
  OverflowError: Python int too large to convert to C long
  >>> np.int64(-(1 << 63))
  -9223372036854775808
  >>> np.array(subtrahend).dtype
  dtype('uint64')
  # NumPy first converts both to float64 arrays to avoid overflows in this specific case due to the mixed uint64+int64 integer types
  >>> (np.array(subtrahend) + minuend).dtype
  dtype('float64')
  >>> np.array(-subtrahend).dtype
  dtype('int64')
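The promotion rule can also be queried on the dtypes alone, without involving any values. This is a short sketch using np.promote_types and np.can_cast, with illustrative variable names.

```python
import numpy as np

# Neither 64-bit integer type can safely hold every value of the other...
print(np.can_cast(np.uint64, np.int64))  # False
print(np.can_cast(np.int64, np.uint64))  # False

# ...so the common type NumPy picks for the mixed pair is float64.
mixed = np.promote_types(np.uint64, np.int64)
same = np.promote_types(np.int64, np.int64)
print(mixed)  # float64
print(same)   # int64
```

Because this looks only at dtypes, it shows the uint64/int64 rule on its own, separately from how a Python integer is first converted to a NumPy type.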

huangapple
  • Posted on August 4, 2023, 23:01:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76837108.html