What does "%b" do in fmt.Printf for float64 and what is Min subnormal positive double in float64 in binary format?
Question
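`4503599627370496p-52` is a floating-point number written in decimalless scientific notation with a power-of-two exponent. The number before `p` is the significand, written as a decimal integer (not hexadecimal), and the number after `p` is the exponent of two that scales it. Here the significand is 4503599627370496, which equals 2**52, and the exponent is -52, so the value is 2**52 × 2**-52 = 1. This is the format produced by `%b` and by strconv.FormatFloat with the 'b' format.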
The Go documentation for package fmt says, under "Floating-point and complex constituents":
> Floating-point and complex constituents:
> %b decimalless scientific notation with exponent a power of two,
> in the manner of strconv.FormatFloat with the 'b' format,
> e.g. -123456p-78
Code:
fmt.Printf("0b%b\n", 255) // 0b11111111
fmt.Printf("%b\n", 1.0) // 4503599627370496p-52
What is 4503599627370496p-52?
Answer 1
Score: 4
I did some research, and after many hours with the IEEE 754 binary representation:
A good place to start is:
https://en.wikipedia.org/wiki/Double-precision_floating-point_format
https://en.wikipedia.org/wiki/IEEE_floating_point
Results:
package main
import (
"fmt"
"math"
"strconv"
)
func main() {
fmt.Printf("0b%b\n", 255) //0b11111111
fmt.Printf("%b\n", 1.0) //4503599627370496p-52
fmt.Printf("%#X\n", 4503599627370496) //0X10000000000000
// float64 1.0 has the bit pattern 0X3FF0000000000000.
// So 4503599627370496 * 2**-52 = ("1" + fraction bits) * 2**-52
// = 0X10000000000000 * 2**-52 = 2**52 * 2**-52 = 2**0 = 1.
// Layout: 1 sign bit + 11 biased exponent bits + 52 fraction bits = 64 bits.
fmt.Printf("%#X\n", math.Float64bits(1.0)) // 0X3FF0000000000000
// Exponent field: 0x3FF = 1023 and the bias is 1023, so the scale is 2**(1023-1023) = 2**0 = 1.
// For normal numbers the biased exponent ranges from 1 (Emin) to 2046 (Emax).
// Significand: 1.fraction, 53 bits including the implicit leading 1.
// 1.0000000000000002 is the smallest float64 greater than 1.
fmt.Printf("%#X\n", math.Float64bits(1.0000000000000002)) // 0X3FF0000000000001
// 1.0000000000000004 is the next float64 after 1.0000000000000002.
fmt.Printf("%#X\n", math.Float64bits(1.0000000000000004)) //0X3FF0000000000002
fmt.Printf("%#X\n", math.Float64bits(2.0)) //0X4000000000000000
fmt.Printf("%#X\n", math.Float64bits(-2.0)) //0XC000000000000000
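// Minimum positive subnormal double: only the lowest fraction bit is set.
fmt.Printf("%v\n", math.Float64frombits(1)) // 5e-324
fmt.Printf("%#X\n", math.Float64bits(5e-324)) // 0X1, i.e. 0X0000000000000001 without leading zeros
// 2**(-1022-52) = 2**-1074, about 4.94e-324, which prints as 5e-324.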
//Max subnormal double
fmt.Printf("%v\n", math.Float64frombits(0x000fffffffffffff)) //2.225073858507201e-308
fmt.Printf("%v\n", math.Float64frombits(0X0000000000000000)) //0
fmt.Printf("%v\n", math.Float64frombits(0X8000000000000000)) //-0
fmt.Printf("%v\n", math.Float64frombits(0X7FF0000000000000)) //+Inf
fmt.Printf("%v\n", math.Float64frombits(0XFFF0000000000000)) //-Inf
fmt.Printf("%v\n", math.Float64frombits(0x7fffffffffffffff)) //NaN
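fmt.Printf("%#X\n%[1]b\n", math.Float64bits(0.1)) // 0X3FB999999999999A
// 11111110111001100110011001100110011001100110011001100110011010
// %b drops the two leading zero bits; split into sign, exponent and fraction, the 64 bits are:
// 0 01111111011 1001100110011001100110011001100110011001100110011010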
fmt.Printf("%#X\n%[1]b\n", math.Float64bits(0.2)) //0X3FC999999999999A
//11111111001001100110011001100110011001100110011001100110011010
fmt.Printf("%#X\n%[1]b\n", math.Float64bits(0.3)) //0X3FD3333333333333
//11111111010011001100110011001100110011001100110011001100110011
fmt.Println(1.0 / 3.0) //0.3333333333333333
//By default, 1/3 rounds down, instead of up like single precision,
//because of the odd number of bits in the significand
fmt.Printf("%#X\n%[1]b\n", math.Float64bits(1.0/3.0)) //0X3FD5555555555555
//11111111010101010101010101010101010101010101010101010101010101
/*
Given the hexadecimal representation 0x3FD5 5555 5555 5555:
Sign = 0
Exponent = 0x3FD = 1021
Exponent bias = 1023 (constant value)
Fraction = 0x5 5555 5555 5555
Value = 2**(Exponent − Exponent bias) × 1.Fraction // Note that Fraction must not be converted to decimal here
= 2**−2 × (0x15 5555 5555 5555 × 2**−52)
= 2**−54 × 0x15 5555 5555 5555
= 0.333333333333333314829616256247390992939472198486328125
≈ 1/3
*/
var f float64 = 0.1
var bits uint64 = math.Float64bits(f) //IEEE 754 binary representation of f
fmt.Printf("%#X %[1]b\n", bits)
//0X3FB999999999999A 11111110111001100110011001100110011001100110011001100110011010
fmt.Printf("%b\n", f) // 7205759403792794p-56
fmt.Printf("%b %b\n", 7205759403792794, -56)
// 11001100110011001100110011001100110011001100110011010 -111000
fmt.Println(len("11001100110011001100110011001100110011001100110011010")) // 53
// These 53 bits are the implicit leading 1 followed by the 52 fraction bits,
// i.e. the rightmost 52 bits of the pattern printed above.
fmt.Printf("-: %b\n", math.Float64bits(-0.1e+308)) // the leftmost bit is the sign bit
// 1111111110101100011110110001111100111100101011000111010000110011
// split into sign, exponent and fraction:
// 1 11111111010 1100011110110001111100111100101011000111010000110011
//   11 exponent bits
//   12345678901
fmt.Printf("exp: %b\n", math.Float64bits(+0.1e-308))
// 101110000001010101110010011010001111110110101111
// 1e-309 is subnormal: the sign bit and all 11 exponent bits are zero and are
// dropped by %b as leading zeros, so only these 48 fraction bits are printed.
i, err := strconv.ParseInt("11111111010", 2, 64) // 2042, the exponent field of -0.1e+308
fmt.Println("E", i-1023) // E 1019
fmt.Printf("%b\n", 0.2) //7205759403792794p-55
fmt.Printf("%b\n", 0.3) //5404319552844595p-54
n, err := fmt.Printf("%b %b\n", 1.0, math.Float64bits(1.0))
//4503599627370496p-52 11111111110000000000000000000000000000000000000000000000000000
fmt.Println(n, err) // 84 <nil>: 84 bytes written, no error
fmt.Printf("'%[1]*.[2]*[3]f'\n", 12, 4, 1234.1234) // '   1234.1234' (width 12, precision 4)
}
Conclusion:
`%b` for `float64` prints the 53-bit **significand** as a decimal integer, followed by `p` and the power-of-two exponent that scales it.