在int64中使用位掩码多个值。

huangapple go评论71阅读模式
英文:

Bitmask multiple values in int64

问题

我正在使用https://github.com/coocood/freecache来缓存数据库结果,但目前我需要在每次删除时转储更大的块,这比目标删除多花费了多个微秒。对于类似#SUBJECT_#ID1_#ID2的模式,fmt.Sprintf("%d_%d_%d")也会花费多个微秒。尽管听起来不多,但在当前缓存响应时间的比率下,它比当前速度慢得多。

我在考虑使用该库的SetInt/GetInt,它使用int64键而不是字符串。

假设我正在以#SUBJECT_#ID1_#ID2的模式存储。Subject是我的代码中的一个表或查询段范围(例如,与ACL或产品过滤相关的所有内容)。

让我们以Userright.id#ID1User.id#ID2SubjectACL为例。我会构建如下:

// const CACHE_SUBJECT_ACL = 0x1
// var userrightID int64 = 0x1
// var userID int64 = 0x1
var storeKey int64 = 0x1000000101

fmt.Println("Range: ", storeKey&0xff)
fmt.Println("ID1  : ", storeKey&0xfffffff00-0xff)
fmt.Println("ID2  : ", storeKey&0x1fffffff00000000-0xfffffffff)

如何将CACHE_SUBJECT_ACL/userrightID/userID编译到storeKey中?

我知道我可以将userrightID称为0x100000001,但它是一个动态值,所以我不确定在不引起比格式化字符串更多开销的情况下编译它的最佳方法是什么。

这个想法是,在以后的阶段,当我需要刷新缓存时,我可以调用一小段int64调用,而不仅仅是转储整个分区(可能有数千个条目)。

我在考虑将它们相加并进行位移,例如userID<<8,但我不确定这是否是安全的方法。

如果我没有提供足够的信息,请提问。

英文:

I'm using https://github.com/coocood/freecache to cache database results, but currently I need to dump bigger chunks on every delete, which costs multiple microseconds extra compared to targeted deletion. fmt.Sprintf(&quot;%d_%d_%d&quot;) for a pattern like #SUBJECT_#ID1_#ID2 also costs multiple microseconds. Even tho that doesn't sound like much, in the current ratio of the cache's response time that a multitude slower than it currently is.

I was thinking of using the library's SetInt/GetInt which works with int64 keys instead of strings.

So let's say I'm storing in a #SUBJECT_#ID1_#ID2 pattern. The Subject is a table or query-segment-range in my code (e.a. everything concern ACL or Productfiltering).

Let's take an example of Userright.id is #ID1 and User.id is #ID2 and Subject ACL. I would build it as something like this:

// const CACHE_SUBJECT_ACL = 0x1
// var userrightID int64 = 0x1
// var userID int64 = 0x1
var storeKey int64 = 0x1000000101

fmt.Println(&quot;Range: &quot;, storeKey&amp;0xff)
fmt.Println(&quot;ID1  : &quot;, storeKey&amp;0xfffffff00-0xff)
fmt.Println(&quot;ID2  : &quot;, storeKey&amp;0x1fffffff00000000-0xfffffffff)

How can I compile the CACHE_SUBJECT_ACL/userrightID/userID into the storeKey?

I know I can call userrightID 0x100000001, but it's a dynamic value so I'm not sure what's the best way to compile this without causing more overhead than formatting the string as a key.

The idea is that in a later state when I need to flush the cache I can call a small range of int64 calls instead of just dumping a whole partition (of maybe thousands of entries).

I was thinking of adding them to each other with bit shifting, like userID&lt;&lt;8, but I'm not sure if that's the safe route.

If I failed to supply enough information, please ask.

答案1

得分: 1

将数字打包到int64中

如果我们可以确保要打包的数字不是负数,并且它们适合我们为其保留的位范围内,那么是的,这是一种安全有效的打包方式。

一个int64有64位,这就是我们可以分配给要打包的部分的位数。通常不使用符号位以避免混淆,或者使用无符号版本uint64。

例如,如果我们为subject保留8位,那么剩下的位数就是64-8=56位,每个ID有28位。

                   | ID2         | ID1         |SUB|
编码后的键位:       |f f f f f f f|f f f f f f f|f f|

请注意,在编码时,建议还使用位掩码和按位与操作,以确保我们打包的数字不重叠(有争议,因为如果组件更大,我们无论如何都会出错...)。

还要注意,如果我们还使用了符号位(第63位),在解码时必须在位移后应用掩码,因为向右位移“带入”符号位而不是0(在负数的情况下,符号位为1)。

由于我们为ID1和ID2都使用了28位,我们可以为两个ID使用相同的掩码:

使用以下简短的实用函数来完成任务:

const (
	maskSubj = 0xff
	maskId   = 0xfffffff
)

func encode(subj, id1, id2 int64) int64 {
	return subj&maskSubj | (id1&maskId)<<8 | (id2&maskId)<<36
}

func decode(key int64) (sub, id1, id2 int64) {
	return key & maskSubj, (key >> 8) & maskId, (key >> 36) & maskId
}

进行测试:

key := encode(0x01, 0x02, 0x04)
fmt.Printf("%016x\n", key)
fmt.Println(decode(key))

输出结果(在Go Playground上尝试):

0000004000000201
1 2 4

保持使用string

最初你探索将数字打包到int64中,因为fmt.Sprintf()很慢。请注意,Sprintf()使用一个格式字符串,并且需要时间来解析格式字符串并根据格式字符串中规定的“规则”格式化参数。

但在你的情况下,我们不需要这样做。我们可以通过以下方式简单地获得你最初想要的结果:

id2, id1, subj := 0x04, 0x02, 0x01
key := fmt.Sprint(id2, "_", id1, "_", subj)
fmt.Println(key)

输出结果:

4_2_1

这种方法会更快,因为它不需要处理格式字符串,只需连接参数即可。

我们甚至可以做得更好;如果相邻的两个参数都不是string值,那么会自动插入一个空格,所以只需列出数字即可:

key = fmt.Sprint(id2, id1, subj)
fmt.Println(key)

输出结果:

4 2 1

Go Playground上尝试这些代码。

利用fmt.AppendInt()

通过使用fmt.AppendInt(),我们可以进一步改进。该函数将整数的文本表示附加到字节切片中。我们可以使用基数16,这样表示会更紧凑,而且将一个数字转换为基数16的算法比转换为基数10的算法更快:

func encode(subj, id1, id2 int64) string {
	b := make([]byte, 0, 20)

	b = strconv.AppendInt(b, id2, 16)
	b = append(b, '_')
	b = strconv.AppendInt(b, id1, 16)
	b = append(b, '_')
	b = strconv.AppendInt(b, subj, 16)

	return string(b)
}

进行测试:

id2, id1, subj := int64(0x04), int64(0x02), int64(0x01)
key := encode(subj, id1, id2)
fmt.Println(key)

输出结果(在Go Playground上尝试):

4_2_1
英文:

Packing numbers to an int64

If we can make sure the numbers we want to pack are not negative and they fit into the bit range we're reserving for them, then yes, this is a safe and efficient way to pack them.

An int64 has 64 bits, that's how many we can assign to the parts we want to pack into it. Often the sign bit is not used to avoid confusion, or an unsigned version uint64 is used.

For example, if we reserve 8 bits for subject, that leaves 64-8=56 bits for the rest, 28 bits for each ID.

                   | ID2         | ID1         |SUB|
Encoded key bits:  |f f f f f f f|f f f f f f f|f f|

Note that when encoding, it's recommended to also use a bitmask with bitwise AND to make sure the numbers we pack do not overlap (arguable, because if the components are bigger, we're screwed anyway...).

Also note that if we're also using the sign bit (63<sup>th</sup>), we have to apply masking after the bitshift when decoding, as shifting right "brings in" the sign bit and not 0 (sign bit is 1 in case of negative numbers).

Since we used 28 bits for both ID1 and ID2, we can use the same mask for both IDs:

Use these short utility functions which get the job done:

const (
	maskSubj = 0xff
	maskId   = 0xfffffff
)

func encode(subj, id1, id2 int64) int64 {
	return subj&amp;maskSubj | (id1&amp;maskId)&lt;&lt;8 | (id2&amp;maskId)&lt;&lt;36
}

func decode(key int64) (sub, id1, id2 int64) {
	return key &amp; maskSubj, (key &gt;&gt; 8) &amp; maskId, (key &gt;&gt; 36) &amp; maskId
}

Testing it:

key := encode(0x01, 0x02, 0x04)
fmt.Printf(&quot;%016x\n&quot;, key)
fmt.Println(decode(key))

Output (try it on the Go Playground):

0000004000000201
1 2 4

Sticking to string

Originally you explored packing into an int64 because fmt.Sprintf() was slow. Note that Sprintf() uses a format string, and it takes time to parse the format string and format the arguments according to the "rules" laid out in the format string.

But in your case we don't need this. We can simply get what you originally wanted like this:

id2, id1, subj := 0x04, 0x02, 0x01
key := fmt.Sprint(id2, &quot;_&quot;, id1, &quot;_&quot;, subj)
fmt.Println(key)

Output:

4_2_1

This one will be significantly faster as it doesn't have to process a format string, it will just concatenate the arguments.

We can even do better; if none of 2 arguments being next to each other are string values, a space is automatically inserted, so it's really enough to just list the numbers:

key = fmt.Sprint(id2, id1, subj)
fmt.Println(key)

Output:

4 2 1

Try these on the Go Playground.

Utilizing fmt.AppendInt()

We can improve it further by using fmt.AppendInt(). This function appends the textual representation of an integer to a byte slice. We can use base 16 so we'll have more compact representation, and also because the algorithm to convert a number to base 16 is faster than to base 10:

func encode(subj, id1, id2 int64) string {
	b := make([]byte, 0, 20)

	b = strconv.AppendInt(b, id2, 16)
	b = append(b, &#39;_&#39;)
	b = strconv.AppendInt(b, id1, 16)
	b = append(b, &#39;_&#39;)
	b = strconv.AppendInt(b, subj, 16)

	return string(b)
}

Testing it:

id2, id1, subj := int64(0x04), int64(0x02), int64(0x01)
key := encode(subj, id1, id2)
fmt.Println(key)

Output (try it on the Go Playground):

4_2_1

答案2

得分: 0

似乎已经弄清楚了:

const CacheSubjectACL = 1
var userrightID int64 = 8
var userID int64 = 2
storeKey := CacheSubjectACL + (userrightID << 8) + (userID << 36)

fmt.Println("storeKey: ", storeKey)
fmt.Println("Range   : ", storeKey&0xff)
fmt.Println("ID1     : ", storeKey&0xfffffff00>>8)
fmt.Println("ID2     : ", storeKey&0x1ffffff000000000>>36)

输出结果为:

storeKey:  137438955521
Range   :  1
ID1     :  8
ID2     :  2

storeKey 构建了被掩码的 int64。通过与新的位移相反的掩码操作,可以从 int64 中提取出旧的值。

因为 storeKey&0x1ffffff000000000>>36 最终会到达最右边,所以 storeKey>>36 也足够了,因为在更左边没有位。

英文:

Seemed to have figured it out:

const CacheSubjectACL = 1
var userrightID int64 = 8
var userID int64 = 2
storeKey := CacheSubjectACL + (userrightID &lt;&lt; 8) + (userID &lt;&lt; 36)

fmt.Println(&quot;storeKey: &quot;, storeKey)
fmt.Println(&quot;Range   : &quot;, storeKey&amp;0xff)
fmt.Println(&quot;ID1     : &quot;, storeKey&amp;0xfffffff00&gt;&gt;8)
fmt.Println(&quot;ID2     : &quot;, storeKey&amp;0x1ffffff000000000&gt;&gt;36)

Gives:

 storeKey:  137438955521
 Range   :  1
 ID1     :  8
 ID2     :  2

storeKey builds the int64 masked. And the masking with a new shift the other way around fishes the old values out of the int64 again.

Because storeKey&amp;0x1ffffff000000000&gt;&gt;36 runs to the end anyway, storeKey&gt;&gt;36 will suffice too as there are no bits on the further left.

huangapple
  • 本文由 发表于 2017年1月22日 19:56:04
  • 转载请务必保留本文链接:https://go.coder-hub.com/41790574.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定