How to convert a float64 number to uint64 the right way?


Question

package main

func main() {
	var n float64 = 6161047830682206209
	println(uint64(n))
}

The output will be:

6161047830682206208

It looks like the low-order digits are discarded when a float64 is converted to uint64.

Answer 1

Score: 18


The problem here is the representation of constants and floating point numbers.

Constants are represented in arbitrary precision. Floating point numbers are represented using the IEEE 754 standard.

Spec: Constants:

> Numeric constants represent values of arbitrary precision and do not overflow.

Spec: Numeric types:

> float64 the set of all IEEE-754 64-bit floating-point numbers

In IEEE 754 double precision (float64 in Go), the 64 bits are split into 1 sign bit, 11 exponent bits and 52 stored fraction bits; together with the implicit leading bit this gives 53 bits of precision. This means every integer up to 2<<52 (that is, 2^53) can be represented exactly, but not every integer beyond it:

2<<52        : 9007199254740992
Your constant: 6161047830682206209

That is 15.95 decimal digits to be precise (16 digits, but not every value you can write with 16 digits, only those up to 9007199254740992).

The integer constant you try to put into a variable of type float64 simply does not fit into 53 bits, so it has to be rounded and the lowest digits (or bits) are lost.
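The 53-bit limit can be checked with a round trip through float64. This is only a minimal sketch: `fitsExactly` is a hypothetical helper name, and it assumes its argument is at most 1<<63 so that converting back to uint64 stays in range.

```go
package main

import "fmt"

// fitsExactly reports whether u survives a round trip through float64.
// Assumption: u <= 1<<63, so the conversion back to uint64 is in range.
func fitsExactly(u uint64) bool {
	return uint64(float64(u)) == u
}

func main() {
	const limit = 1 << 53 // same value as 2<<52: 9007199254740992

	fmt.Println(fitsExactly(limit))               // true: 2^53 is representable
	fmt.Println(fitsExactly(limit + 1))           // false: 2^53+1 needs 54 bits
	fmt.Println(fitsExactly(6161047830682206209)) // false: the constant from the question
}
```

Any integer that needs more than 53 significant bits comes back as a neighbouring value instead of itself.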

You can verify this by printing the original n float64 number:

var n float64 = 6161047830682206209
fmt.Printf("%f\n", n)
fmt.Printf("%d\n", uint64(n))

Output:

6161047830682206208.000000
6161047830682206208

The problem is not the conversion. The float64 value you are converting is already not equal to the constant you tried to assign to it.
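If the value starts out as an integer (a constant or decimal text), one way to sidestep the rounding is to never store it in a float64 at all. The sketch below assumes that is possible in your program; `parseExact` is a hypothetical helper name wrapping the standard `strconv.ParseUint`.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseExact parses decimal text straight into a uint64, with no float64 step.
func parseExact(s string) (uint64, error) {
	return strconv.ParseUint(s, 10, 64)
}

func main() {
	// The constant fits into uint64 exactly, so nothing is rounded.
	var n uint64 = 6161047830682206209
	fmt.Println(n)

	v, err := parseExact("6161047830682206209")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(v)
}
```

Both lines print the full value ending in ...209, because no floating-point type is involved at any point.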

Just out of curiosity:

Try the same with a bigger number, +500 compared to the first constant:

n = 6161047830682206709 // +500 compared to the first constant
fmt.Printf("%f\n", n)
fmt.Printf("%d\n", uint64(n))

The output is still the same (the last digits / bits are cut off, +500 included!):

6161047830682206208.000000
6161047830682206208
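Why +500 makes no difference can be seen from the spacing between neighbouring float64 values at this magnitude. The following sketch uses `math.Nextafter` from the standard library; `spacing` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"math"
)

// spacing returns the distance from f to the next larger float64.
func spacing(f float64) float64 {
	return math.Nextafter(f, math.Inf(1)) - f
}

func main() {
	var n float64 = 6161047830682206209
	var m float64 = 6161047830682206709 // +500

	fmt.Println(spacing(n)) // 1024: neighbouring float64 values are 1024 apart here
	fmt.Println(n == m)     // true: both constants round to the same float64
}
```

With a gap of 1024 between representable values, anything less than 512 away from 6161047830682206208 rounds to that same number.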

Try a smaller number that can be represented precisely using 53 bits (fewer than ~16 digits):

n = 7830682206209
fmt.Printf("%f\n", n)
fmt.Printf("%d\n", uint64(n))

Output:

7830682206209.000000
7830682206209

Try it on the Go Playground.

Answer 2

Score: 7


The problem is not with the conversion but that the number is too large an integer to be stored exactly as a float64. The assignment to n results in some loss of significance.

If you think about it, your number (about 6e18) is fairly close to the largest uint64 (2^64-1, or about 18e18). A uint64 uses all of its 64 bits to store an integer, but a float64 must use some of its 64 bits to store a sign and an exponent, so it has fewer bits left for the mantissa.
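The split of the 64 bits can be inspected with `math.Float64bits`. This is an illustrative sketch only; `fields` is a hypothetical helper name, and the masks follow the IEEE 754 double-precision layout (1 sign bit, 11 exponent bits, 52 stored fraction bits).

```go
package main

import (
	"fmt"
	"math"
)

// fields splits a float64 into its IEEE 754 sign, biased exponent and
// stored fraction bits.
func fields(f float64) (sign, exponent, fraction uint64) {
	bits := math.Float64bits(f)
	sign = bits >> 63               // 1 bit
	exponent = (bits >> 52) & 0x7FF // 11 bits, biased by 1023
	fraction = bits & (1<<52 - 1)   // 52 stored bits
	return
}

func main() {
	var n float64 = 6161047830682206209
	s, e, f := fields(n)
	fmt.Printf("sign=%d exponent=%d (unbiased %d) fraction=%#x\n", s, e, int(e)-1023, f)
}
```

For the number in the question the unbiased exponent is 62, which leaves only the 52 fraction bits (plus the implicit leading 1) to describe a value that needs 63 bits as an integer.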

In summary, if you assign a uint64 (or integer constant) larger than 2^53 (about 9×10^15) to a float64 and convert it back to a uint64, you will get a close, but probably not exactly the same, value.

[BTW icza's answer is good and correct. I hope this answer is a simple summary.]

huangapple
  • Posted on June 18, 2015 at 00:25:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/30897208.html