How do I read this floating point representation?

Question

I am using the struct module to obtain the binary representation of a floating-point number. However, I am having trouble making sense of the output:

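import struct

def fl_bin(fl):
    # Pack as a big-endian IEEE 754 single-precision float (4 bytes)
    s = struct.pack('!f', fl)
    # Render each byte as 8 bits and join them into one 32-character string
    b = ''.join(format(c, '08b') for c in s)
    print(b)  # whole representation; b[0] is the sign bit
    print("Exponent: {}".format(b[1:9]))  # exponent field only (8 bits)
    return b[9:]  # significand field (the 23 stored bits)

a = fl_bin(2.1)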

The call gives an exponent of 10000000 and a significand of 00001100110011001100110.
My reasoning was as follows: the representation is 32 bits long, so it should be the single-precision format. The first bit should indicate the sign, the next 8 bits the exponent, and the remaining bits the significand.
However, if this is true, I don't see how to recover the original number from it. For instance, for the exponent I read: 1 -> positive, 000000 -> 1 as exponent. But then I am stuck making sense of the significand. Maybe someone could elaborate on this!

Answer 1

Score: 1

Floating-point representations follow the IEEE 754 standard; in this case it is the binary32 format.
According to the specification, the binary representation consists of:

  • Sign bit: 1 bit
  • Exponent width: 8 bits
  • Significand precision: 24 bits (23 explicitly stored)

The exponent is biased, meaning you have to subtract the bias of 127 from the stored value to get the actual exponent. Here the field 10000000 is 128, and 128 - 127 gives an exponent of 1 in your example.
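As a rough sketch of that step, the three fields can also be pulled out with integer shifts and masks instead of string slicing. The helper name `split_fields` is made up for this illustration; only `struct.pack` and `struct.unpack` come from the standard library.

```python
import struct

def split_fields(x):
    """Return (sign, biased_exponent, fraction) of x encoded as binary32.

    Hypothetical helper for illustration; not part of the struct module.
    """
    # Pack as big-endian float32, then reinterpret the same 4 bytes
    # as one unsigned 32-bit integer.
    (bits,) = struct.unpack('!I', struct.pack('!f', x))
    sign = bits >> 31                      # top bit
    biased_exponent = (bits >> 23) & 0xFF  # next 8 bits
    fraction = bits & 0x7FFFFF             # low 23 bits
    return sign, biased_exponent, fraction

sign, biased, fraction = split_fields(2.1)
print(sign, format(biased, '08b'), format(fraction, '023b'))
print("actual exponent:", biased - 127)   # subtract the bias
```

The integer form avoids building a bit string, but the slicing in the question gives the same fields.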

The significand is stored with only its fractional part; an implicit leading 1 precedes it. Thus we get the binary number 1.00001100110011001100110. Plugging that into a binary-to-decimal converter gives 1.0499999523162841796875.
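If you would rather not rely on an online converter, the same conversion can be sketched with exact rational arithmetic. The variable names here are only illustrative, and the bit string is the one printed in the question.

```python
from decimal import Decimal
from fractions import Fraction

fraction_bits = '00001100110011001100110'  # the 23 stored bits from the question

# Implicit leading 1, plus the stored bits read as a binary fraction
# (the integer value of the field divided by 2**23).
significand = 1 + Fraction(int(fraction_bits, 2), 2**23)

# The denominator is a power of two, so the decimal expansion terminates
# and fits within Decimal's default precision of 28 digits.
exact = Decimal(significand.numerator) / Decimal(significand.denominator)
print(exact)
```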

Putting it all together, 1.0499999523162841796875 * 2 ^ 1 = 2.099999904632568359375. This is the binary32 value nearest to the original 2.1, which has no exact binary representation.
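One way to check this result, sketched under the assumption that a round trip through `struct` is an acceptable reference: pack 2.1 as binary32, unpack it again, and compare with the value rebuilt by hand from the fields.

```python
import struct
from decimal import Decimal

# Round-trip 2.1 through binary32: pack to 4 bytes, then unpack again.
(stored,) = struct.unpack('!f', struct.pack('!f', 2.1))

# Rebuild the value by hand: (1 + fraction / 2**23) * 2**exponent.
reconstructed = (1 + 0b00001100110011001100110 / 2**23) * 2**1

# Decimal(float) shows the exact value held by a float, digit for digit.
print(Decimal(stored))
print(reconstructed == stored)  # the hand decoding matches what struct stores
print(stored == 2.1)            # the stored value is not exactly 2.1
```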

Answer 2

Score: 1

To interpret the exponent field, take its value as an eight-bit unsigned binary numeral and subtract 127.

As a binary numeral, “10000000” represents 128. Subtracting 127 yields 1. So “10000000” represents an exponent of 1.

Full interpretation of IEEE-754 binary32 encodings:

  • Let S be the bit in the sign field. Let E be the eight bits in the exponent field, interpreted as an unsigned binary numeral. Let F be the 23 bits in the primary significand field, interpreted as an unsigned binary numeral.
  • NaN: If E is 255 (all one bits) and F is not 0, the floating-point datum is a NaN. Preferably, a quiet NaN (QNaN) has the leading bit of the significand field set and a signaling NaN has it clear, but this is not required by IEEE 754.
  • Infinite: If E is 255 and F is 0, the datum is (−1)^S • ∞, that is, +∞ if S is 0 and −∞ if S is 1.
  • Normal: If E is neither 255 nor 0, the datum is (−1)^S • (1 + F•2^(−23)) • 2^(E−127). In other words, move F to the right of the radix point (a generalized decimal point), put 1 on the left, scale by the exponent (adjusted by the bias), and apply the sign.
  • Subnormal: If E is 0 and F is not 0, the datum is (−1)^S • (0 + F•2^(−23)) • 2^(1−127). In other words, move F to the right of the radix point, put 0 on the left, scale as if the exponent field were 1 (not 0), and apply the sign. Since the exponent field is taken as 1, subnormals have the same scaling as the lowest normals: they are decreased by changing the leading digit from 1 to 0 rather than by decreasing the exponent.
  • Zero: If E is 0 and F is 0, the datum is (−1)^S • 0. Note that IEEE 754 distinguishes +0 and −0. (Zero can also be interpreted using the formula for subnormals. It is merely distinguished as a special value due to its mathematical significance.)
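The cases above can be sketched as a small decoder. This is an illustrative reading of the rules, not a reference implementation. The function name `decode_binary32` is made up, it takes the 32 bits as an unsigned integer, and it returns an ordinary Python float (a binary64 value, which can hold every binary32 value exactly). NaN payloads and the sign of a NaN are not preserved.

```python
import math
import struct

def decode_binary32(bits):
    """Decode a 32-bit unsigned integer as an IEEE 754 binary32 value.

    Hypothetical helper for illustration only.
    """
    s = bits >> 31             # S: sign bit
    e = (bits >> 23) & 0xFF    # E: exponent field as an unsigned integer
    f = bits & 0x7FFFFF        # F: 23-bit significand field
    sign = -1.0 if s else 1.0
    if e == 255:
        # All-ones exponent: NaN if F is nonzero, otherwise signed infinity.
        return math.nan if f else sign * math.inf
    if e == 0:
        # Subnormal or zero: leading digit 0, scaled as if E were 1.
        return sign * (0 + f * 2.0**-23) * 2.0**(1 - 127)
    # Normal: implicit leading 1, exponent adjusted by the bias.
    return sign * (1 + f * 2.0**-23) * 2.0**(e - 127)

# Compare against struct for the value from the question.
(bits,) = struct.unpack('!I', struct.pack('!f', 2.1))
(expected,) = struct.unpack('!f', struct.pack('!f', 2.1))
print(decode_binary32(bits) == expected)
```

Because zero falls out of the subnormal formula, the sketch handles both in one branch; multiplying by a negative sign is what keeps −0 distinct from +0.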

huangapple
  • Posted on August 10, 2023 at 19:14:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76875202.html