For a floating-point number x, without underflow or overflow, are the results of x+x and x*2 identical?

Question

For example, I originally had code like this:

function f(x, y) {
  let z = x + y;
  // ... rest of the function ...
}

Later I found that y must always be the same as x, and I want to refactor x + y to x * 2, but I need to ensure that the behaviour of the whole program is the same before and after the refactoring. Is x + x identical to x * 2? I don't know whether + and * use different calculation mechanisms and hence round to different results.

I tested:

for (let i = 0.01; i < 100; i++) {
  if (i + i !== i * 2) {
    console.log(i);
    break;
  }
}

This seems to hold for some ranges of i, but I don't know whether it is true for all floating-point numbers.

Answer 1

Score: 4

JavaScript is an implementation of ECMAScript, and the ECMAScript specification says IEEE 754 arithmetic is used, with the IEEE-754 “double precision” (binary64) format. Clause 5.2.5 says “… When applied to Numbers, the operators refer to the relevant operations within IEEE 754-2019…”

In IEEE 754, and in any reasonable floating-point system, the result of an operation is the exact mathematical result rounded according to whichever rounding rule is selected (such as round-to-nearest-ties-to-even, round toward zero, round upward, or round downward). IEEE 754-2019 4.3 says:

> … Except where stated otherwise, every operation shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then rounded that result according to one of the attributes in this clause…

Since x+x and 2•x have the same exact mathematical value, and the same rounding rule is applied to that value in both cases, the floating-point operations x+x and 2*x must produce the same computed result.
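As a rough illustration of the argument above, here is a minimal sketch that compares the raw 64-bit patterns of x + x and 2 * x for a handful of values. The helper names bitsOf and sameBits and the choice of sample values are assumptions made for this example and are not from the answer. NaN is left out here because the bit pattern an engine stores for NaN can vary.

```javascript
// Return the raw 64-bit pattern of a double as a BigInt.
function bitsOf(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);
  return view.getBigUint64(0);
}

// Hypothetical helper: true when x + x and 2 * x have identical bit patterns.
function sameBits(x) {
  return bitsOf(x + x) === bitsOf(2 * x);
}

// Illustrative samples: ordinary fractions, the smallest subnormal,
// the largest finite double (doubling it overflows), negative zero, infinity.
const samples = [0.1, -0.1, 1 / 3, Number.MIN_VALUE, Number.MAX_VALUE, -0, Infinity];
const allSame = samples.every(sameBits);
console.log(allSame);
```

A check like this only spot-checks a few values; the general claim rests on the rounding argument, not on the samples.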

The above covers cases where x is a number, including +∞ and −∞. If x is a NaN, x+x and 2*x also produce a NaN, so the result is again the same. (Note that, in this case, x+x == 2*x would evaluate as false because a NaN is not equal to anything, not even itself. Nonetheless, the two operations produce the same result; the program behavior would be the same if 2*x were used in place of x+x or vice-versa.)
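The NaN remark can be made concrete with a short sketch. It assumes that Object.is is an acceptable notion of "same result" for the refactoring, since it treats NaN as equal to NaN while == does not. The variable names are chosen for illustration only.

```javascript
const x = NaN;
const sum = x + x;
const product = 2 * x;

// == reports false because NaN never compares equal to anything.
const looselyEqual = sum == product;

// Object.is treats NaN as the same value as NaN.
const sameValue = Object.is(sum, product);

// Infinities behave like ordinary numbers here.
const infinitiesMatch =
  Object.is(Infinity + Infinity, 2 * Infinity) &&
  Object.is(-Infinity + -Infinity, 2 * -Infinity);

console.log(looselyEqual, sameValue, infinitiesMatch);
```

This is also why a test that uses != or !== reports a difference for NaN even though both expressions produce NaN.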

Answer 2

Score: 2

AFAICT, those expressions will always evaluate to the same values. This follows from the IEEE 754 standard, which specifies that the result of a calculation should be the same as if the operation were carried out with infinite precision and then rounded to the nearest representable number.

As an empirical sanity check I ran the following Python code:

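import numpy as np

# The low 24 bits of every 32-bit pattern; the high 8 bits are added in the loop.
a = np.arange(2**24, dtype='uint32')
for i in range(256):
    # Reinterpret each block of 2**24 bit patterns as float32 values.
    b = np.frombuffer(a + (i << 24), 'float32')
    # assert_equal treats NaNs in matching positions as equal.
    np.testing.assert_equal(b+b, b*2)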

and it only produced a few warnings when it worked with NaNs and infinities. This exhaustively tests all 32-bit binary floats in about 20 seconds. JavaScript uses 64-bit floats, but since they are governed by the same rules, the result should carry over.
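An exhaustive test is not practical for 64-bit floats, but a similar check can be sketched in JavaScript by sampling random 64-bit patterns. This is only a sampling test under stated assumptions, not a proof: the xorshift32 generator, the seed, the sample count, and the names makeXorshift32 and countMismatches are all choices made for this example.

```javascript
// Small deterministic 32-bit PRNG (xorshift32), used only so runs are repeatable.
function makeXorshift32(seed) {
  let state = (seed >>> 0) || 1; // the state must never be zero
  return function next() {
    state ^= state << 13;
    state >>>= 0;
    state ^= state >>> 17;
    state ^= state << 5;
    state >>>= 0;
    return state;
  };
}

// Count sampled doubles for which x + x and x * 2 are not the same value.
function countMismatches(sampleCount, seed) {
  const words = new Uint32Array(2);
  const doubles = new Float64Array(words.buffer); // same 8 bytes viewed as a double
  const next = makeXorshift32(seed);
  let mismatches = 0;
  for (let i = 0; i < sampleCount; i++) {
    words[0] = next();
    words[1] = next();
    const x = doubles[0];
    // Object.is treats NaN as equal to NaN, so NaN inputs do not count as mismatches.
    if (!Object.is(x + x, x * 2)) mismatches++;
  }
  return mismatches;
}

const mismatches = countMismatches(1000000, 12345);
console.log(mismatches);
```

Random bit patterns spread the samples across all exponents, so the run also touches subnormals, infinities, and NaNs, though far less densely than the exhaustive 32-bit test.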

huangapple
  • Published on June 19, 2023 at 17:47:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76505447.html
