How to keep precision for big numbers in golang when converting from float to big.Int


Question

I have an input that could be a very big or a very small float and need to convert it to big.Int, but for some reason, there is some precision loss.
I understand that this should happen for very small numbers, but why does it happen for a big number, and how to avoid it?

https://go.dev/play/p/AySnKAikSRx

Answer 1

Score: 2

All positive integers up to 9007199254740992 (2^53) can be represented in a float64 without any loss of precision. Above that, only some integers are representable, so you run the risk of precision loss, which is what is happening in your case.


To give a basic idea of why..

Say we're inventing an extremely compact scheme for representing floating point numbers using the following formula:

m.mm * 10^+-e

.. where:

  • e = exponent, [0-9]
  • m.mm = mantissa [0.01-9.99]

With this, we can figure out what range of values can be represented:

  • lowest = 0.01 * 10^-9 = 0.00000000001
  • highest = 9.99 * 10^9 = 9990000000

So that's a pretty decent range of numbers.

We can represent a fair few positive integers without any difficulty, e.g.

1   = 1.00 * 10^0
2   = 2.00 * 10^0
3   = 3.00 * 10^0
⋮
10  = 1.00 * 10^1
11  = 1.10 * 10^1
12  = 1.20 * 10^1
⋮
100 = 1.00 * 10^2
101 = 1.01 * 10^2
102 = 1.02 * 10^2
⋮
999 = 9.99 * 10^2

The problem starts when we exceed 9.99 * 10^2. It's not an issue to represent 1000:

1000 = 1.00 * 10^3

But how do we represent 1001? The next possible value is

1.01 * 10^3 = 1010

That is off by +9, so we have to settle on 1.00 * 10^3, which is off by -1.
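
The toy scheme above can be sketched in Go. This is only an illustration of the rounding idea, not of how float64 is implemented; roundTo3Sig is a hypothetical helper that keeps three significant decimal digits and rounds ties upward.

```go
package main

import "fmt"

// roundTo3Sig rounds a non-negative integer to three significant decimal
// digits, mimicking the toy m.mm * 10^e format (hypothetical helper).
func roundTo3Sig(n uint64) uint64 {
	scale := uint64(1)
	// Find the power of ten that leaves at most three digits in n/scale.
	for n/scale >= 1000 {
		scale *= 10
	}
	// Round to the nearest multiple of scale (ties round up).
	return (n + scale/2) / scale * scale
}

func main() {
	for _, n := range []uint64{999, 1000, 1001, 1004, 1005, 1010} {
		fmt.Printf("%d -> %d\n", n, roundTo3Sig(n))
	}
}
```

Everything up to 999 comes back unchanged, while 1001 collapses to 1000 and 1005 moves up to 1010.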

The above is in essence how this plays out with float64, except in base 2 and with a 52-bit mantissa (plus an implicit leading 1 bit, giving 53 bits of precision). With all 52 mantissa bits set at exponent 52 the value is 2^53 - 1, and adding one gives:

1.0 * 2^53 = 9007199254740992

So all positive integers up to this value can be represented without precision loss. Integers higher than this may incur precision loss - it very much depends on the value.
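
A quick way to see this boundary is to round-trip integers through float64. The sketch below assumes nothing beyond the standard library; survivesFloat64 is a hypothetical helper name.

```go
package main

import "fmt"

// survivesFloat64 reports whether n is unchanged by a round trip through
// float64 (hypothetical helper).
func survivesFloat64(n uint64) bool {
	return uint64(float64(n)) == n
}

func main() {
	const limit = 1 << 53 // 9007199254740992
	for _, n := range []uint64{limit - 1, limit, limit + 1, limit + 2} {
		fmt.Println(n, survivesFloat64(n))
	}
}
```

Here 2^53 + 1 does not survive the round trip, while 2^53 + 2 does, which is the "depends on the value" part.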


Now, the value referenced in your Go code:

var x float64 = 827273999999999954

There is no way to represent this exact value as a float64.

package main

import (
	"fmt"
)

func main() {
	var x float64 = 827273999999999954

	fmt.Printf("%f\n", x)
}

yields..

827274000000000000.000000
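
The rounding follows from how far apart neighbouring float64 values are at this magnitude. The value lies between 2^59 and 2^60, where adjacent float64 values are 2^7 = 128 apart. A small sketch using math.Nextafter shows the neighbours (neighbors is a hypothetical helper name):

```go
package main

import (
	"fmt"
	"math"
)

// neighbors returns the closest float64 values below and above x
// (hypothetical helper).
func neighbors(x float64) (below, above float64) {
	return math.Nextafter(x, math.Inf(-1)), math.Nextafter(x, math.Inf(1))
}

func main() {
	var x float64 = 827273999999999954 // stored as 827274000000000000
	below, above := neighbors(x)
	fmt.Printf("below: %.0f\n", below)
	fmt.Printf("x:     %.0f\n", x)
	fmt.Printf("above: %.0f\n", above)
	fmt.Printf("gap:   %.0f\n", above-x)
}
```

Since 827273999999999954 is only 46 away from 827274000000000000 and 82 away from the next value down, it is rounded up.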

So essentially precision is lost by the time x is initialized. But when does that occur? If we run..

$ go build -o tmp
$ go tool objdump tmp

And search for TEXT main.main(SB), we can find the instruction:

main.go:10            0x108b654               48b840d5cba322f6a643    MOVQ $0x43a6f622a3cbd540, AX

So 0x43a6f622a3cbd540 is being set into AX - this is our float64 value.

package main

import (
	"fmt"
	"math"
)

func main() {
	fmt.Printf("float: %f\n", math.Float64frombits(0x43a6f622a3cbd540))
}

prints

float: 827274000000000000.000000

So the precision has already been lost at compile time (which makes sense): by the time big.NewFloat(x).Int(nil) runs, the value passed as x is 827274000000000000.000000.


> how to avoid it?

With the code you've provided, there is no way: the literal has already been rounded by the time it is stored in a float64.

If you're able to represent the value as an integer..

package main

import (
	"fmt"
	"math/big"
)

func main() {
	var x uint64 = 827273999999999954

	bf := (&big.Float{}).SetUint64(x)
	
	fmt.Println(bf)
}

yields

8.27273999999999954e+17

which is the value you're expecting. Or alternatively via a string:

package main

import (
	"fmt"
	"math/big"
)

func main() {
	var x string = "827273999999999954"

	bf, ok := (&big.Float{}).SetString(x)
	if !ok {
		panic("failed to set string")
	}

	fmt.Println(bf)
}

which also yields

8.27273999999999954e+17

huangapple
  • Posted on October 22, 2022 at 21:54:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/74164159.html