Float Arithmetic inconsistent between golang programs

Question

When decoding audio files with pion/opus, I occasionally get incorrect values.

I have narrowed it down to the following code. When this routine runs inside the Opus decoder, I get a different value than when I run it outside the decoder. When the two floats are added together, the rightmost bit of the result differs. The difference eventually becomes a problem as the program runs longer.

Is this a bug or expected behavior? I don't know how to debug this further, or how to dump the state of my program to understand more.

Outside the decoder:

package main

import (
    "fmt"
    "math"
)

func main() {
    a := math.Float32frombits(uint32(955684399))
    b := math.Float32frombits(uint32(927295728))

    fmt.Printf("%b\n", math.Float32bits(a))
    fmt.Printf("%b\n", math.Float32bits(b))
    fmt.Printf("%b\n", math.Float32bits(a+b))
}

Returns:

111000111101101001011000101111
110111010001010110100011110000
111001000001111010000110100110

Then, inside the decoder:

    fmt.Printf("%b\n", math.Float32bits(lpcVal))
    fmt.Printf("%b\n", math.Float32bits(val))
    fmt.Printf("%b\n", math.Float32bits(lpcVal+val))

Returns:

111000111101101001011000101111
110111010001010110100011110000
111001000001111010000110100111

Answer 1

Score: 1

I guess that lpcVal and val are not float32 but rather float64.

If that is the case, then you are comparing two different operations:

  • in the former case, you do Float32bits(lpcVal) + Float32bits(val), that is, each operand is rounded to float32 before the addition
  • in the latter case, you do Float32bits(lpcVal + val), that is, the sum is computed first and only then rounded to float32

In binary, the two 32-bit floats are:

1.11101101001011000101111 * 2^-14
1.10001010110100011110000 * 2^-17
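The binary forms above can be reproduced from the raw bit patterns in the question. The following is a minimal sketch: formatFloat32Bits is a hypothetical helper (not part of pion/opus or the standard library), and it only handles positive, normal float32 values.

```go
package main

import "fmt"

// formatFloat32Bits renders the bit pattern of a positive, normal float32
// as "1.<23 fraction bits> * 2^<unbiased exponent>".
// Hypothetical helper for illustration; zero, subnormals, infinities,
// NaNs and the sign bit are not handled.
func formatFloat32Bits(bits uint32) string {
	exponent := int((bits>>23)&0xFF) - 127 // remove the IEEE 754 exponent bias
	fraction := bits & 0x7FFFFF            // low 23 bits of the significand
	return fmt.Sprintf("1.%023b * 2^%d", fraction, exponent)
}

func main() {
	// Bit patterns taken from the question.
	fmt.Println(formatFloat32Bits(955684399)) // 1.11101101001011000101111 * 2^-14
	fmt.Println(formatFloat32Bits(927295728)) // 1.10001010110100011110000 * 2^-17
}
```

Printing values this way makes it easier to see which significand bit changes than comparing the raw %b output.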

The exact sum is

1.000011110100001101001101 * 2^-13

which is an exact tie between two representable float32 values. The result is rounded to the float32 with the even significand:

1.00001111010000110100110 * 2^-13

But if lpcVal and val are float64, they have 52 bits after the binary point instead of 23 (29 more).

If a single one of those 29 extra bits is nonzero, the result might not be an exact tie, but slightly larger than the tie. Once converted to the nearest float32, that gives

1.00001111010000110100111 * 2^-13

Since we have no idea what lpcVal and val contain in those low-order bits, anything can happen, even without the use of FMA operations.
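The tie argument can be checked with a short program. This is a sketch under assumptions: the function names are made up for illustration, and the residue of 2^-60 is an arbitrary stand-in for whatever low-order bits a wider intermediate value might carry. It is not taken from the decoder.

```go
package main

import (
	"fmt"
	"math"
)

// addAsFloat32 adds the two values in float32 arithmetic and returns the
// bit pattern of the rounded sum.
func addAsFloat32(aBits, bBits uint32) uint32 {
	a := math.Float32frombits(aBits)
	b := math.Float32frombits(bBits)
	return math.Float32bits(a + b)
}

// addWithExtraBits adds the same values in float64, adds a tiny residue
// (hypothetical extra low-order bits), and only then rounds to float32.
func addWithExtraBits(aBits, bBits uint32, residue float64) uint32 {
	a := float64(math.Float32frombits(aBits))
	b := float64(math.Float32frombits(bBits))
	return math.Float32bits(float32(a + b + residue))
}

func main() {
	const aBits, bBits = 955684399, 927295728 // values from the question
	residue := math.Ldexp(1, -60)             // 2^-60, an assumed residue

	fmt.Printf("%b\n", addAsFloat32(aBits, bBits))              // 111001000001111010000110100110
	fmt.Printf("%b\n", addWithExtraBits(aBits, bBits, residue)) // 111001000001111010000110100111
}
```

With a residue of zero, the float64 path also lands on the tie and rounds to the even significand, so both functions agree. Any positive residue, however small, pushes the sum past the tie and flips the last bit, which matches the two outputs in the question.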

Answer 2

Score: 0

This was happening because of fused multiply-add (FMA): multiple floating-point operations were being combined into a single operation, which skips the rounding of the intermediate result.

You can read more about it in the Go language specification, under "Floating-point operators".

The change I made to my code was:

 - lpcVal += currentLPCVal * (aQ12 / 4096.0)
 + lpcVal = float32(lpcVal) + float32(currentLPCVal)*float32(aQ12)/float32(4096.0)

Thank you to Bryan C. Mills for answering this on the #performance channel of the Gophers Slack.
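To illustrate the effect, here is a minimal sketch with made-up inputs (not values from the decoder). The Go spec allows an implementation to fuse x*y + z, and states that an explicit floating-point conversion rounds to the target type and so prevents fusion that would discard that rounding. The function names are hypothetical, and mulAddFused only emulates fusion by going through math.FMA on float64.

```go
package main

import (
	"fmt"
	"math"
)

// mulAddPlain leaves the compiler free to fuse the multiply and the add,
// so its result may differ between architectures and build settings.
func mulAddPlain(x, y, z float32) float32 {
	return x*y + z
}

// mulAddRounded converts the product explicitly, which rounds it to
// float32 before the addition and rules out a fused operation.
func mulAddRounded(x, y, z float32) float32 {
	return float32(x*y) + z
}

// mulAddFused emulates a fused multiply-add: the product is not rounded
// before z is added. This uses math.FMA on float64 as an approximation
// of a float32 hardware FMA.
func mulAddFused(x, y, z float32) float32 {
	return float32(math.FMA(float64(x), float64(y), float64(z)))
}

func main() {
	x := float32(1 + math.Ldexp(1, -12))  // 1 + 2^-12
	z := -float32(1 + math.Ldexp(1, -11)) // -(1 + 2^-11)

	// The exact product x*x is 1 + 2^-11 + 2^-24, a tie in float32.
	fmt.Println(mulAddPlain(x, x, z))   // platform dependent
	fmt.Println(mulAddRounded(x, x, z)) // product rounds to 1 + 2^-11, result 0
	fmt.Println(mulAddFused(x, x, z))   // keeps the 2^-24 term
}
```

On a target where the compiler emits FMA instructions, mulAddPlain is expected to match mulAddFused; elsewhere it should match mulAddRounded. That kind of difference is one plausible way for the same expression to produce different low-order bits in different programs.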

huangapple
  • Posted on October 12, 2022, 12:22:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/74036473.html