英文:
Are bitmask checks more efficient than comparing numbers?
问题
"Bitmask-based" check:
如果 (val & IMPORTANT_BIT) ...
Number comparison:
如果 (val == IMPORTANT_VAL) ...
在这两种情况下,我觉得都需要从内存中获取 val
。那么为什么位掩码会更高效呢?编译器是否对位掩码进行了一些特殊处理?
英文:
I've been told that bitmask-based checks are more efficient than comparing numbers. Is this true?
"Bitmask-based" check:
if (val & IMPORTANT_BIT)
...
Number comparison:
if (val == IMPORTANT_VAL)
...
In both cases I feel like it would have to fetch val
out of memory anyways. So why would a bitmask be more efficient? Does the compiler do something fancy for bitmasks?
答案1
得分: 4
以下是翻译好的部分:
所以位域有益的地方是当你有许多布尔标志可以打包进一个单独的字中。测试特定标志的代码是可比的,但是测试特定子模式的标志的代码可以更短,尽管你可能需要自己编写测试:
// 使用布尔值
bool a, b, c, d;
if (a && !b && c && !d) ...
// 希望编译器知道我们在做什么
enum { A= 0x01, B= 0x02, C= 0x04, D= 0x08};
if ((flags&A) && !(flags&B) && (flags&C) && !(flags&D)) ...
// 自己优化:
enum { A= 0x01, B= 0x02, C= 0x04, D= 0x08};
if ((flags & (A|B|C|D)) == (A|!B|C|!D)) ...
在第一种情况下,编译器必须加载每个位置,尽管它可以提前退出。在后面的情况下,至少它可以加载一次值并在内核中执行多个操作。一个好的优化器可以将第二种模式优化为第三种模式,这是2条指令,尽管优化器还可能意识到它可以将所有布尔值作为“长”加载以减少带宽,而且它们至少会都在同一个缓存行中,这在今天几乎和在核心中一样好。
但无论如何,第三种形式在简洁性方面都会胜过第一种形式,而且还可以节省一些存储空间。
请注意,您的两个示例没有测试相同的内容。
"A & 5"测试4位或1位已设置,但忽略2位和所有更高位。
"A == 5"确实测试了1位和4位是否已设置,但它还检查了所有其他位是否清除。
英文:
So the place where bitfields are beneficial is where you have a lot of bool flags that you can pack into a single word. The code for testing one particular flag is comparable, but the code for testing for a particular subpattern of flags can be much shorter, though you might need to write the test yourself:
// using bool
bool a,b,c,d;
if (a && !b && c && !d) ...
// hoping the compiler knows what we are doing
enum { A= 0x01, B= 0x02, C= 0x04, D= 0x08};
if ((flags&A) && !(flags&B) && (flags&C) && !(flags&D)) ...
// Optimise it ourselves:
enum { A= 0x01, B= 0x02, C= 0x04, D= 0x08};
if ((flags & (A|B|C|D)) == (A|!B|C|!D)) ...
In the first case, the compiler must load each of the locations, though it could early-out. In the latter cases, at minimum it can load the value once and do multiple operations in core. A good optimiser could optimise the second pattern to the third, which is 2 instructions, though the optimiser might also realise it can load all the bools as a "long" to reduce bandwidth, and they would at least all be in the same cache line, which is almost as good as being in core these days.
But in any case, the 3rd form will win vs. the first form for brevity, as well as saving a little storage.
Note that your two examples do not test the same thing.
A & 5
tests that either the 4-bit or the 1-bit are set, but ignores the 2-bit, and any higher bits completely.
A == 5
does test that the 1-bit and 4-bit are set, but it also checks that ALL OTHER bits are clear.
答案2
得分: 4
以下是翻译好的内容:
首先,我们通过假设 IMPORTANT_BIT == IMPORTANT_VAL
并且 val
的所有其他位都为 0
来限制问题,以便两个测试的结果相同。
简短回答:不用担心。
长篇回答:
这取决于目标架构、IMPORTANT_BIT
的值以及代码执行的上下文。
x86 架构具有方便设置 Z
标志的 AND
指令,因此可以在单条指令中对值进行掩码与测试。然后还有 CMP
指令,用于比较值。AND
和 CMP
指令的延迟和吞吐量均为 1。因此,在 x86 上几乎没有区别,除非编译器能够重新使用某些特定值并与上下文中的代码混合。性能差异将非常小。
尽管如此,&
和 ==
测试的语义确实有很大不同。
- 使用
if (val & ...
测试来掩码位。 - 使用
if (val == ...
测试来比较值。
我建议首先从代码可读性的角度来考虑。如果 val
不是位字段,请不要使用 &
。
英文:
First of all, we restrict the question by assuming IMPORTANT_BIT == IMPORTANT_VAL
and all other bits of val
are 0
, so that the outcome of both tests is the same.
Short answer: Don't worry about it.
Long answer:
It depends on the target architecture, the value of IMPORTANT_BIT
and the context in which the code is executed.
x86 has the AND
instruction which conveniently sets the Z
flag, and so can be used to mask-and-test a value against 0 in a single instruction. Then there's the CMP
instruction, which compares values. Both AND
and CMP
instructions have the same latency of 1 and throughput 4. So on x86 there would be no difference, unless the compiler can manage to re-use some specific value and blend with the code above/below. The performance difference would be marginal though.
Having said that, the semantics of &
and ==
tests is really different.
if (val & ...
test is used to mask bits.if (val == ...
test is used to compare values.
I suggest to approach it from a code readability perspective first of all. If val
is not a bit field, do yourself a favor and don't use &
against it.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论