英文:
What is the largest integer range from 0 to N-1 that can be represented by an IEEE 754 double-precision floating point value in the range [0, 1)?
问题
如果我们将双精度浮点范围 [0, 1)
映射到整数范围 [0-N)
,通过将浮点范围乘以整数值 N
,那么最大的数 N
是多少,以便在将 float_value * N
的结果截断/向下取整时,能表示所有的数 [0-N)
?不使用扩展精度乘法。N 被舍入为可表示的值。
英文:
If we map the double-precision floating-point range [0, 1)
to the integer range [0-N)
by multiplying the floating-point range by the integer value N
then what is the largest number N
such that all numbers [0-N)
are represented when truncating/flooring the result of float_value * N
to an integer? No extended precision multiplication. N is rounded to a representable value.
答案1
得分: 1
First, consider N = 253. For every integer k ∊ [0, 253), k/253 is representable in IEEE-754 double precision (binary64) as +k•2-53, since each number representable in binary64 may be described as a sign and a 53-bit integer multiplied by a power of 2, 2e, for an integer power e ∊ [−1074, +972). Each such k/253 is in [0, 1), and the floating-point multiplication of each such k/253 by N (253) incurs no rounding error, producing exactly k. Therefore, each integer in [0, N) is mapped to by at least one binary64 number in [0, 1).
N cannot be 253+1 since it is not representable (it requires 54 bits to represent, and the binary64 significand has only 53 bits). For larger N, the integer 253+1 is in [0, N) but cannot be the result of any binary64 multiplication because it is not representable.
Therefore 253 is the largest value for N that satisfies the constraints of the question.
英文:
First, consider N = 2<sup>53</sup>. For every integer k ∊ [0, 2<sup>53</sup>), k/2<sup>53</sup> is representable in IEEE-754 double precision (binary64) as +k•2<sup>−53</sup>, since each number representable in binary64 may be described as a sign and a 53-bit integer multiplied by a power of 2, 2<sup>e</sup>, for an integer power e ∊ [−1074, +972). Each such k/2<sup>53</sup> is in [0, 1), and the floating-point multiplication of each such k/2<sup>53</sup> by N (2<sup>53</sup>) incurs no rounding error, producing exactly k. Therefore, each integer in [0, N) is mapped to by at least one binary64 number in [0, 1).
N cannot be 2<sup>53</sup>+1 since it is not representable (it requires 54 bits to represent, and the binary64 significand has only 53 bits). For larger N, the integer 2<sup>53</sup>+1 is in [0, N) but cannot be the result of any binary64 multiplication because it is not representable.
Therefore 2<sup>53</sup> is the largest value for N that satisfies the constraints of the question.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论