Variable length two's complement to int64
Question
I'm trying to write a Go program that parses the ASN.1 BER two's complement integer encoding. However, the integer can be encoded in 1, 2, 3 or 4 bytes (depending on its size).
According to the specification (http://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf), the leftmost bit of the first byte is always the sign bit.
What's a clean way to do this?
func ParseInt(b []byte) (int64, error) {
    switch len(b) {
    case 1:
        // this works
        return int64(b[0]&0x7f) - int64(b[0]&0x80), nil
    case 2:
        // leftmost bit of b[0] is worth -32768
    case 3:
        // leftmost bit of b[0] is worth -8388608
    case 4:
        // leftmost bit of b[0] is worth -2147483648 (and so on for 5, 6, 7, 8)
    case 5:
    case 6:
    case 7:
    case 8:
    default:
        return 0, errors.New("value does not fit in an int64")
    }
}
ParseInt([]byte{0xfe})       // should return (-2, nil)
ParseInt([]byte{0xfe, 0xff}) // should return (-257, nil)
ParseInt([]byte{0x01, 0x00}) // should return (256, nil)
Answer 1
Score: 2
It's easier to understand if you read the bytes from the end:
- You don't have to shift the last byte
- Left-shift the 2nd last byte by 8 (8 bits in a byte)
- Left-shift the 3rd last byte by 16
- ...
- And from the first byte only use 7 bits, the leftmost bit is special.
 
The leftmost bit of the first byte, b[0]&0x80, tells you whether an offset has to be added to the result. That offset is -1 multiplied by the number your input would represent if only this one bit were set and all others were 0, that is -1 * (1 << (len(b)*8 - 1)) = -(0x80 << (len(b)*8 - 8)).
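To make the offset concrete, here is a minimal sketch that computes the weight of the sign bit for an n-byte encoding. The helper name signBitWeight is made up for illustration and is not part of the implementation further down; it assumes 1 <= n <= 8.

```go
package main

import "fmt"

// signBitWeight (hypothetical helper) returns the negative value contributed
// by the leftmost bit of an n-byte two's complement integer, for 1 <= n <= 8.
func signBitWeight(n int) int64 {
	// 0x80 is the leftmost bit of a single byte; shift it up to the first
	// byte's position, then negate. For n == 8 the shift wraps around to the
	// minimum int64, which is the intended value.
	return -(int64(0x80) << uint(8*n-8))
}

func main() {
	for n := 1; n <= 4; n++ {
		fmt.Println(n, signBitWeight(n))
	}
	// Prints:
	// 1 -128
	// 2 -32768
	// 3 -8388608
	// 4 -2147483648
}
```

These are the same magnitudes that appear in the comments of the question's switch statement.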
Examples. If the input is...
- 1 byte:
  int64(b[0]&0x7f) - int64(b[0]&0x80)
- 2 bytes:
  int64(b[0]&0x7f)<<8 + int64(b[1]) - int64(b[0]&0x80)<<8
- 3 bytes:
  int64(b[0]&0x7f)<<16 + int64(b[1])<<8 + int64(b[2]) - int64(b[0]&0x80)<<16
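As a quick sanity check, the 2-byte and 3-byte expressions can be run against the inputs from the question. This is only a sketch: the function names twoBytes and threeBytes are invented here, and the 3-byte input is my own example rather than one from the question.

```go
package main

import "fmt"

// twoBytes (hypothetical name) spells out the 2-byte expression.
// In Go, << binds tighter than + and -, so no extra parentheses are needed.
func twoBytes(b []byte) int64 {
	return int64(b[0]&0x7f)<<8 + int64(b[1]) - int64(b[0]&0x80)<<8
}

// threeBytes (hypothetical name) spells out the 3-byte expression.
func threeBytes(b []byte) int64 {
	return int64(b[0]&0x7f)<<16 + int64(b[1])<<8 + int64(b[2]) - int64(b[0]&0x80)<<16
}

func main() {
	fmt.Println(twoBytes([]byte{0xfe, 0xff}))         // -257
	fmt.Println(twoBytes([]byte{0x01, 0x00}))         // 256
	fmt.Println(threeBytes([]byte{0xff, 0xfe, 0xff})) // -257
}
```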
All these cases can be covered with a nice loop.
Here's a compact implementation (try it on the Go Playground):
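import "errors"

// ParseInt decodes a big-endian two's complement integer of up to 8 bytes.
func ParseInt(b []byte) (int64, error) {
    if len(b) > 8 {
        return 0, errors.New("value does not fit in an int64")
    }
    var n int64
    for i, v := range b {
        // The last byte is not shifted, the one before it by 8 bits, and so on.
        shift := uint((len(b) - i - 1) * 8)
        if i == 0 && v&0x80 != 0 {
            // Sign bit set: subtract its weight and keep only the low 7 bits.
            n -= 0x80 << shift
            v &= 0x7f
        }
        n += int64(v) << shift
    }
    return n, nil
}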
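A possible variation, not the loop above but a different technique: let Go do the sign extension by converting the first byte through int8, then shift the remaining bytes in from the right. The name parseIntSignExtend is made up for this sketch, and rejecting empty input is my own addition, on the assumption that a BER INTEGER always has at least one content byte.

```go
package main

import (
	"errors"
	"fmt"
)

// parseIntSignExtend (hypothetical name) decodes a big-endian two's
// complement integer of 1 to 8 bytes using sign extension of the first byte.
func parseIntSignExtend(b []byte) (int64, error) {
	if len(b) == 0 {
		// Assumption: an empty encoding is treated as an error here.
		return 0, errors.New("empty integer encoding")
	}
	if len(b) > 8 {
		return 0, errors.New("value does not fit in an int64")
	}
	// byte -> int8 reinterprets the bits as signed; int8 -> int64 then
	// copies the sign bit into all the higher bits.
	n := int64(int8(b[0]))
	for _, v := range b[1:] {
		// Make room for the next byte and append it.
		n = n<<8 | int64(v)
	}
	return n, nil
}

func main() {
	for _, in := range [][]byte{{0xfe}, {0xfe, 0xff}, {0x01, 0x00}} {
		n, err := parseIntSignExtend(in)
		fmt.Println(n, err)
	}
	// Prints:
	// -2 <nil>
	// -257 <nil>
	// 256 <nil>
}
```

This avoids special-casing the first byte inside the loop; whether it reads more clearly than the offset-based version is a matter of taste.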