

Go: How to check precision loss when converting float64 to float32







I have a scenario where I receive a float64 value, but must send it down the wire to another service as a float32 value. We know the received value should always fit into a float32. However, to be safe I want to log the case where we are losing data by converting to float32.

The following code block does not compile, because Go does not allow comparing a float32 with a float64 directly:

    func convert(input float64) (output float32, err error) {
        const tolerance = 0.001
        output = float32(input)
        // Compile error: output is a float32, input+tolerance is a float64.
        if output > input+tolerance || output < input-tolerance {
            return 0, errors.New("lost too much precision")
        }
        return output, nil
    }

Is there an easy way to check that I am hitting this condition? This check will happen at high frequency, so I want to avoid doing string conversions.


Score: 4



You can convert the float32 value back to float64, just for the validation.

To check whether the converted value represents the same value, simply compare it to the original value (the input). It's also sufficient, and idiomatic, to return just an ok bool (instead of an error):

    func convert(input float64) (output float32, ok bool) {
        output = float32(input)
        // Convert back to float64 so both operands have the same type.
        ok = float64(output) == input
        return
    }

(Note: edge cases like NaN are not checked.)

Testing it:

    fmt.Println(convert(1))
    fmt.Println(convert(1.5))
    fmt.Println(convert(0.123456789))
    fmt.Println(convert(math.MaxFloat32))

Output:

    1 true
    1.5 true
    0.12345679 false
    3.4028235e+38 true
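As for the edge cases mentioned in the note above: NaN never compares equal to itself, so the plain round-trip comparison reports false for a NaN input even though float32 can hold NaN. A minimal sketch of one way to handle that, assuming you want a NaN that stays NaN to count as a successful conversion. The name `convertChecked` is made up for this example:

```go
package main

import (
	"fmt"
	"math"
)

// convertChecked is a hypothetical variant of convert that treats NaN
// explicitly. NaN != NaN, so the plain round-trip comparison would report
// ok = false for a NaN input.
func convertChecked(input float64) (output float32, ok bool) {
	output = float32(input)
	if math.IsNaN(input) {
		// Assumption: a NaN that stays NaN counts as a lossless conversion.
		return output, math.IsNaN(float64(output))
	}
	// Infinities compare equal to themselves, so +Inf and -Inf pass here.
	// A finite input that is too large for float32 fails the comparison.
	ok = float64(output) == input
	return output, ok
}

func main() {
	fmt.Println(convertChecked(math.NaN()))
	fmt.Println(convertChecked(math.Inf(1)))
	fmt.Println(convertChecked(1e300))
}
```

Whether NaN and the infinities should pass or be logged depends on what the receiving service expects, so treat the choices above as placeholders.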

Note that the exact comparison will often report ok = false, because float32 has less precision than float64, even though the converted value may be very close to the input.

So in practice it is more useful to check how far the converted value is from the input. Your proposed solution checks the absolute difference, which is not very useful: for example, 1000000.1 and 1000000 are very close numbers even though their difference is 0.1. 0.0001 and 0.00011 differ by much less (0.00001), yet relative to the numbers themselves that difference is much bigger.
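To put numbers on that, here is a small sketch that prints the absolute and the relative difference for the two pairs above. `relDiff` is a helper invented for this illustration and assumes its first argument is non-zero:

```go
package main

import (
	"fmt"
	"math"
)

// relDiff returns |a-b| relative to the magnitude of a.
// Hypothetical helper for illustration; a must be non-zero.
func relDiff(a, b float64) float64 {
	return math.Abs(a-b) / math.Abs(a)
}

func main() {
	// Large absolute difference, tiny relative difference.
	fmt.Printf("abs=%.3g rel=%.3g\n", math.Abs(1000000.1-1000000), relDiff(1000000.1, 1000000))
	// Tiny absolute difference, large relative difference.
	fmt.Printf("abs=%.3g rel=%.3g\n", math.Abs(0.0001-0.00011), relDiff(0.0001, 0.00011))
}
```

The first pair differs by roughly one part in ten million, the second by roughly one part in ten, which is the opposite of what the absolute differences suggest.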

So you should check the relative difference, for example:

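    func convert(input float64) (output float32, ok bool) {
        const maxRelDiff = 1e-8
        output = float32(input)
        // Absolute difference between the round-tripped value and the input.
        diff := math.Abs(float64(output) - input)
        // Accept it only if it is small relative to the input's magnitude.
        ok = diff <= math.Abs(input)*maxRelDiff
        return
    }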

Testing it:

    fmt.Println(convert(1))
    fmt.Println(convert(1.5))
    fmt.Println(convert(1e20))
    fmt.Println(convert(math.Pi))
    fmt.Println(convert(0.123456789))
    fmt.Println(convert(math.MaxFloat32))

Output:

    1 true
    1.5 true
    1e+20 false
    3.1415927 false
    0.12345679 false
    3.4028235e+38 true
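The false results for 1e20, math.Pi and 0.123456789 come from the chosen threshold. float32 has a 24-bit significand, so rounding to the nearest float32 can introduce a relative error of up to 2^-24 (about 6e-8) for values in float32's normal range, and that is larger than 1e-8. A rough sketch with the threshold passed as a parameter so the two settings can be compared. The names `convertWithTolerance` and `float32RelErr` are made up for this example:

```go
package main

import (
	"fmt"
	"math"
)

// float32RelErr is the largest relative error that rounding to the nearest
// float32 can introduce for values in float32's normal range: 2^-24.
const float32RelErr = 1.0 / (1 << 24)

// convertWithTolerance is a hypothetical variant of convert that takes the
// maximum relative difference as a parameter.
func convertWithTolerance(input, maxRelDiff float64) (output float32, ok bool) {
	output = float32(input)
	diff := math.Abs(float64(output) - input)
	ok = diff <= math.Abs(input)*maxRelDiff
	return output, ok
}

func main() {
	for _, v := range []float64{math.Pi, 1e20, 0.123456789, 1e300} {
		_, strict := convertWithTolerance(v, 1e-8)
		_, loose := convertWithTolerance(v, float32RelErr)
		fmt.Println(v, strict, loose)
	}
}
```

With a threshold of 2^-24 the check essentially only flags values outside float32's range (such as 1e300) and very small values that land in float32's subnormal range, where the relative error can be larger. Which threshold is right depends on how much precision the receiving service actually needs.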


Score: 0



Yes. Check that the value does not exceed the upper or lower limit of float32. Then ensure that the 52 - 23 = 29 least significant bits of the float64 significand are 0. (In a nutshell.)
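A sketch of what that could look like, using math.Float64bits from the standard library. The name `fitsFloat32` is hypothetical, and two choices here are assumptions of mine rather than part of the answer: NaN and infinities are rejected, and values below float32's normal range (under 2^-126) fall back to a round-trip comparison, because float32 has fewer significand bits available there and the fixed 29-bit mask is not enough.

```go
package main

import (
	"fmt"
	"math"
)

// minNormalFloat32 is the smallest positive normal float32 value, 2^-126.
const minNormalFloat32 = 0x1p-126

// fitsFloat32 is a hypothetical helper that reports whether a finite float64
// converts to float32 without any loss, using the range-and-bits idea.
func fitsFloat32(x float64) bool {
	abs := math.Abs(x)
	if math.IsNaN(x) || abs > math.MaxFloat32 {
		return false // assumption: NaN, infinities and out-of-range values are rejected
	}
	if abs != 0 && abs < minNormalFloat32 {
		// Below float32's normal range fewer significand bits are available,
		// so fall back to a round-trip comparison.
		return float64(float32(x)) == x
	}
	// float64 stores 52 fraction bits, float32 only 23: the value survives
	// only if the 52-23 = 29 lowest fraction bits are all zero.
	const lowBits = 1<<29 - 1
	return math.Float64bits(x)&lowBits == 0
}

func main() {
	for _, v := range []float64{1.5, 0.123456789, math.MaxFloat32, 1e300} {
		fmt.Println(v, fitsFloat32(v))
	}
}
```

Like the exact comparison, this only tells you whether the conversion is lossless. It says nothing about how large the loss is.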

  • Posted on February 3, 2023, 06:30:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75329580.html



