What explains the range of a double in Java?
Question
I'm taking an introductory course in programming (in Java) and I'm interested in understanding why the ranges of the primitive data types in Java are as they are. For integer-like data types such as byte, it seems easy to understand why it only accepts values from -128 to 127: a byte has a size of 8 bits, which can take 256 distinct values, so we can assign an integer to each of these 256 values, which I believe is what is happening behind the curtain. It also explains why 128 is excluded: we have 128 negative numbers, 127 positive numbers, and zero. These add up to 256 already. There is no bijection between [1,256] and [-128,128].
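The counting argument above can be checked directly. This is only a small sketch; the class name `ByteRange` and its helper methods are made up for illustration, while `Byte.MIN_VALUE`, `Byte.MAX_VALUE`, and `Byte.SIZE` are standard constants.

```java
final class ByteRange {
    // Number of distinct bit patterns in an 8-bit byte: 2^8.
    static int patternCount() {
        return 1 << Byte.SIZE;
    }

    // Casting an int to byte keeps only the low 8 bits (two's complement),
    // so values outside [-128, 127] wrap around.
    static byte wrap(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE);  // -128
        System.out.println(Byte.MAX_VALUE);  // 127
        System.out.println(patternCount());  // 256
        System.out.println(wrap(128));       // -128
    }
}
```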
This idea makes sense for the ranges of all the integer-like data types, but when it comes to floating-point types, one sees some strange ranges. For example, the positive values of a double range over [2^{-1074},(2-2^{-52})2^{1023}]. Why is this the case?
I apologize for the cumbersome notation, apparently I can't use latex here.
Answer 1
Score: 1
Java uses the IEEE-754 binary64 format. In this format, one bit represents a sign (+ or − according to whether the bit is 0 or 1), eleven bits are used for an exponent, and 52 bits are used for the primary encoding of the significand. One bit of the 53-bit significand is encoded via the exponent.
The values of the exponent field range from 0 to 2047. 2047 is reserved for use with infinities and NaNs. 0 is used for zeros and subnormal numbers. The values 1 to 2046 are used for normal numbers. In this range, an exponent field value of E represents an exponent e = E−1023. So the lowest value of e is 1−1023 = −1022, and the highest value is 2046−1023 = 1023. An exponent field value of 0 also represents the lowest value of e, −1022.
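As a rough illustration of the three cases for the exponent field, the following sketch pulls E out of the bit pattern and classifies it. The class and method names are hypothetical; `Double.doubleToRawLongBits` is the standard library call.

```java
final class ExponentField {
    // Extracts the 11-bit exponent field E (bits 62..52) of a double.
    static int exponentField(double x) {
        long bits = Double.doubleToRawLongBits(x);
        return (int) ((bits >>> 52) & 0x7FFL);
    }

    // Describes which of the three cases the exponent field falls into.
    static String classify(double x) {
        int e = exponentField(x);
        if (e == 0) return "zero or subnormal";
        if (e == 2047) return "infinity or NaN";
        return "normal, e = " + (e - 1023);
    }

    public static void main(String[] args) {
        System.out.println(classify(1.0));                      // normal, e = 0
        System.out.println(classify(Double.MIN_NORMAL));        // normal, e = -1022
        System.out.println(classify(Double.MAX_VALUE));         // normal, e = 1023
        System.out.println(classify(Double.MIN_VALUE));         // zero or subnormal
        System.out.println(classify(Double.POSITIVE_INFINITY)); // infinity or NaN
    }
}
```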
The 53-bit significand represents a binary numeral d.ddd…ddd<sub>2</sub>, where the first d is 0 if the exponent field is 0 and 1 if the exponent field is 1 to 2046. There are 52 bits after the “.”, and they are given by the primary significand field.
Let S be the sign bit field, E be the value of the exponent field, and F be the value of the primary significand field as an integer. Then the value represented is:
- if E is 1 to 2046, (−1)<sup>S</sup> • 2<sup>E−1023</sup> • (1 + F•2<sup>−52</sup>),
- if E is 0, (−1)<sup>S</sup> • 2<sup>−1022</sup> • (0 + F•2<sup>−52</sup>).
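The two formulas can be turned into a small decoder. This is a sketch rather than anything the JDK does internally; `DecodeDouble` and `decode` are illustrative names, and the code assumes the input is not an infinity or NaN (E = 2047), which the formulas do not cover.

```java
final class DecodeDouble {
    // Rebuilds the value of a finite double from its S, E, and F fields.
    static double decode(long bits) {
        int s = (int) (bits >>> 63);              // sign bit S
        int e = (int) ((bits >>> 52) & 0x7FFL);   // exponent field E
        long f = bits & ((1L << 52) - 1);         // primary significand field F
        if (e == 2047) {
            throw new IllegalArgumentException("infinity or NaN");
        }
        double sign = (s == 0) ? 1.0 : -1.0;
        // F * 2^-52 is exact because F < 2^52.
        double fraction = Math.scalb((double) f, -52);
        // Leading bit is 0 for E = 0 (subnormal) and 1 otherwise.
        double significand = (e == 0 ? 0.0 : 1.0) + fraction;
        int exponent = (e == 0) ? -1022 : e - 1023;
        return sign * Math.scalb(significand, exponent);
    }

    public static void main(String[] args) {
        double x = 0.1;
        double y = decode(Double.doubleToLongBits(x));
        System.out.println(x == y);  // true
    }
}
```

Every intermediate step here is exact in double arithmetic, so the decoded value should match the original bit for bit.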
Now we can see the smallest positive number is represented when S is 0, E is 0, and F is 1 (0000000000000000000000000000000000000000000000000001<sub>2</sub>). Then the value represented is (−1)<sup>0</sup> • 2<sup>−1022</sup> • (0 + 1•2<sup>−52</sup>) = +1 • 2<sup>−1022</sup> • 2<sup>−52</sup> = 2<sup>−1074</sup>.
The greatest finite number is represented when S is 0, E is 2046, and F is 2<sup>52</sup>−1 (1111111111111111111111111111111111111111111111111111<sub>2</sub>). Then the value represented is (−1)<sup>0</sup> • 2<sup>2046−1023</sup> • (1 + (2<sup>52</sup>−1)•2<sup>−52</sup>) = +1 • 2<sup>1023</sup> • (1 + 1 − 2<sup>−52</sup>) = 2<sup>1023</sup> • (2<sup>1</sup> − 2<sup>−52</sup>) = 2<sup>1024</sup> − 2<sup>971</sup>.
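These two extremes can be compared against `Double.MIN_VALUE` and `Double.MAX_VALUE`. The sketch below builds them from the bit patterns described above and also checks the closed forms; the class and method names are invented for the example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class DoubleExtremes {
    // S = 0, E = 0, F = 1.
    static double smallestPositive() {
        return Double.longBitsToDouble(0x0000_0000_0000_0001L);
    }

    // S = 0, E = 2046 (0x7FE), F = all 52 bits set.
    static double largestFinite() {
        return Double.longBitsToDouble(0x7FEF_FFFF_FFFF_FFFFL);
    }

    // 2^1024 - 2^971 as an exact integer.
    static BigInteger closedFormMax() {
        return BigInteger.ONE.shiftLeft(1024).subtract(BigInteger.ONE.shiftLeft(971));
    }

    public static void main(String[] args) {
        System.out.println(smallestPositive() == Double.MIN_VALUE);        // true
        System.out.println(largestFinite() == Double.MAX_VALUE);           // true
        // Math.scalb(x, n) computes x * 2^n; Math.ulp(1.0) is 2^-52.
        System.out.println(Math.scalb(1.0, -1074) == Double.MIN_VALUE);    // true
        System.out.println(
            Math.scalb(2.0 - Math.ulp(1.0), 1023) == Double.MAX_VALUE);    // true
        // new BigDecimal(double) converts the double's exact binary value.
        BigInteger exactMax = new BigDecimal(Double.MAX_VALUE).toBigIntegerExact();
        System.out.println(exactMax.equals(closedFormMax()));              // true
    }
}
```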
Answer 2
Score: 0
It's the same thing for floating-point values as for integer values - the number of bits available.
An IEEE double-precision floating-point value has 64 bits, used as follows:
Sign bit: 1 bit
Exponent: 11 bits
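Significand: 52 bits stored (53 bits of precision, counting the implicit leading bit)
The stored significand bits form a binary fraction, so their maximum value (all bits set) is somewhat less than one. For normal numbers an implicit leading 1 is added, so the full significand is somewhat less than two.
The exponent field has values 0 to 2047. Subtracting the bias of 1023 gives −1023 to +1024, but the two extreme field values are reserved for specific uses (0 for zeros and subnormals, 2047 for infinities and NaNs), so normal numbers have exponents −1022 to +1023. That gives an approximate range of 2 to the −1022 up to just under 2 to the +1024 for normal numbers, with subnormals extending down to 2 to the −1074.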
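One way to see the significand and exponent of a particular value is `Double.toHexString`, which prints the significand in hexadecimal followed by the power of two. A minimal sketch (the class name `HexView` and its `show` helper are just for illustration):

```java
final class HexView {
    // Thin wrapper so the example has something to call.
    static String show(double x) {
        return Double.toHexString(x);
    }

    public static void main(String[] args) {
        System.out.println(show(1.0));               // 0x1.0p0
        System.out.println(show(Double.MAX_VALUE));  // 0x1.fffffffffffffp1023
        System.out.println(show(Double.MIN_NORMAL)); // 0x1.0p-1022
        System.out.println(show(Double.MIN_VALUE));  // 0x0.0000000000001p-1022
    }
}
```

The 13 hexadecimal digits after the point correspond to the 52 stored significand bits, and the leading digit is the implicit bit: 1 for normal numbers and 0 for subnormals.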