按字节比较变长编码的int64数值。

huangapple go评论88阅读模式
英文:

bytewise compare varint encoded int64's

问题

我正在使用levigo,这是Go语言的leveldb绑定。我的键是int64类型的,需要保持排序。默认情况下,leveldb使用字节比较器,所以我尝试使用varint编码。

func i2b(x int64) []byte {
    b := make([]byte, binary.MaxVarintLen64)
    n := binary.PutVarint(b, x)
    return key[:n]
}

我的键没有被正确排序。我写了以下代码进行测试。

var prev int64 = 0
for i := int64(1); i < 1e5; i++ {
    if bytes.Compare(i2b(i), i2b(prev)) <= 0 {
        log.Fatalf("bytewise: %d > %d", b2i(prev), i)
    }
    prev = i
}

输出: bytewise: 127 > 128

playground

我不确定问题出在哪里。我是否对编码有误?varint是否不是正确的编码方式?

编辑:

使用BigEndian固定宽度编码可以进行字节比较。

func i2b(x int64) []byte {
    b := make([]byte, 8)
    binary.BigEndian.PutUint64(b, uint64(x))
    return b
}
英文:

I'm using levigo, the leveldb bindings for Go. My keys are int64's and need to be kept sorted. By default, leveldb uses a bytewise comparator so I'm trying to use varint encoding.

func i2b(x int64) []byte {
	b := make([]byte, binary.MaxVarintLen64)
	n := binary.PutVarint(b, x)
	return key[:n]
}

My keys are not being sorted correctly. I wrote the following as a test.

var prev int64 = 0
for i := int64(1); i &lt; 1e5; i++ {
	if bytes.Compare(i2b(i), i2b(prev)) &lt;= 0 {
		log.Fatalf(&quot;bytewise: %d &gt; %d&quot;, b2i(prev), i)
	}
	prev = i
}

output: bytewise: 127 &gt; 128

playground

I'm not sure where the problem is. Am I doing the encoding wrong? Is varint not the right encoding to use?

EDIT:

BigEndian fixed width encoding is bytewise comparable

func i2b(x int64) []byte {
  b := make([]byte, 8)
  binary.BigEndian.PutUint64(b, uint64(x))
  return b
}

答案1

得分: 1

varint编码在值的顺序方面与字节顺序不可比较。一个编写排序/比较函数(下面的cmp)的选项是:

package main

import (
	"encoding/binary"
	"log"
)

func i2b(x int64) []byte {
	var b [binary.MaxVarintLen64]byte
	return b[:binary.PutVarint(b[:], x)]
}

func cmp(a, b []byte) int64 {
	x, n := binary.Varint(a)
	if n < 0 {
		log.Fatal(n)
	}

	y, n := binary.Varint(b)
	if n < 0 {
		log.Fatal(n)
	}

	return x - y
}

func main() {
	var prev int64 = 0
	for i := int64(1); i < 1e5; i++ {
		if cmp(i2b(i), i2b(prev)) <= 0 {
			log.Fatal("fail")
		}
		prev = i
	}
}

Playground

(*)原因是(也是)执行的位操作。

英文:

The varint encoding is not bytewise comparable* wrt to the order of the values it caries. One option how to write the ordering/collating function (cmp bellow) is for example:

package main

import (
        &quot;encoding/binary&quot;
        &quot;log&quot;
)

func i2b(x int64) []byte {
        var b [binary.MaxVarintLen64]byte
        return b[:binary.PutVarint(b[:], x)]
}

func cmp(a, b []byte) int64 {
        x, n := binary.Varint(a)
        if n &lt; 0 {
                log.Fatal(n)
        }

        y, n := binary.Varint(b)
        if n &lt; 0 {
                log.Fatal(n)
        }

        return x - y
}

func main() {
        var prev int64 = 0
        for i := int64(1); i &lt; 1e5; i++ {
                if cmp(i2b(i), i2b(prev)) &lt;= 0 {
                        log.Fatal(&quot;fail&quot;)
                }
                prev = i
        }
}

Playground

(*) The reason is (also) the bit fiddling performed.

huangapple
  • 本文由 发表于 2013年5月8日 22:01:50
  • 转载请务必保留本文链接:https://go.coder-hub.com/16442730.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定