2020年1月6日 23:45:34go评论79阅读模式

英文:

Are bitmask checks more efficient than comparing numbers?

问题

"Bitmask-based" check:

如果 (val & IMPORTANT_BIT) ...

Number comparison:

如果 (val == IMPORTANT_VAL) ...

在这两种情况下，我觉得都需要从内存中获取 val。那么为什么位掩码会更高效呢？编译器是否对位掩码进行了一些特殊处理？

英文:

I've been told that bitmask-based checks are more efficient than comparing numbers. Is this true?

"Bitmask-based" check:

if (val &amp; IMPORTANT_BIT)
...

Number comparison:

if (val == IMPORTANT_VAL)
...

In both cases I feel like it would have to fetch val out of memory anyways. So why would a bitmask be more efficient? Does the compiler do something fancy for bitmasks?

答案1

得分: 4

以下是翻译好的部分：

所以位域有益的地方是当你有许多布尔标志可以打包进一个单独的字中。测试特定标志的代码是可比的，但是测试特定子模式的标志的代码可以更短，尽管你可能需要自己编写测试：

// 使用布尔值
bool a, b, c, d;
if (a && !b && c && !d) ...

// 希望编译器知道我们在做什么
enum { A= 0x01, B= 0x02, C= 0x04, D= 0x08};
if ((flags&A) && !(flags&B) && (flags&C) && !(flags&D)) ...

// 自己优化：
enum { A= 0x01, B= 0x02, C= 0x04, D= 0x08};
if ((flags & (A|B|C|D)) == (A|!B|C|!D)) ...

在第一种情况下，编译器必须加载每个位置，尽管它可以提前退出。在后面的情况下，至少它可以加载一次值并在内核中执行多个操作。一个好的优化器可以将第二种模式优化为第三种模式，这是2条指令，尽管优化器还可能意识到它可以将所有布尔值作为“长”加载以减少带宽，而且它们至少会都在同一个缓存行中，这在今天几乎和在核心中一样好。

但无论如何，第三种形式在简洁性方面都会胜过第一种形式，而且还可以节省一些存储空间。

请注意，您的两个示例没有测试相同的内容。

"A & 5"测试4位或1位已设置，但忽略2位和所有更高位。

"A == 5"确实测试了1位和4位是否已设置，但它还检查了所有其他位是否清除。

英文:

So the place where bitfields are beneficial is where you have a lot of bool flags that you can pack into a single word. The code for testing one particular flag is comparable, but the code for testing for a particular subpattern of flags can be much shorter, though you might need to write the test yourself:

// using bool
bool a,b,c,d;
if (a &amp;&amp; !b &amp;&amp; c &amp;&amp; !d) ...
// hoping the compiler knows what we are doing
enum { A= 0x01, B= 0x02, C= 0x04, D= 0x08};
if ((flags&amp;A) &amp;&amp; !(flags&amp;B) &amp;&amp; (flags&amp;C) &amp;&amp; !(flags&amp;D)) ...
// Optimise it ourselves:
enum { A= 0x01, B= 0x02, C= 0x04, D= 0x08};
if ((flags &amp; (A|B|C|D)) == (A|!B|C|!D)) ...

In the first case, the compiler must load each of the locations, though it could early-out. In the latter cases, at minimum it can load the value once and do multiple operations in core. A good optimiser could optimise the second pattern to the third, which is 2 instructions, though the optimiser might also realise it can load all the bools as a "long" to reduce bandwidth, and they would at least all be in the same cache line, which is almost as good as being in core these days.

But in any case, the 3rd form will win vs. the first form for brevity, as well as saving a little storage.

Note that your two examples do not test the same thing.

A & 5 tests that either the 4-bit or the 1-bit are set, but ignores the 2-bit, and any higher bits completely.

A == 5 does test that the 1-bit and 4-bit are set, but it also checks that ALL OTHER bits are clear.

答案2

得分: 4

以下是翻译好的内容：

首先，我们通过假设 IMPORTANT_BIT == IMPORTANT_VAL 并且 val 的所有其他位都为 0 来限制问题，以便两个测试的结果相同。

简短回答：不用担心。

长篇回答：

这取决于目标架构、IMPORTANT_BIT 的值以及代码执行的上下文。

x86 架构具有方便设置 Z 标志的 AND 指令，因此可以在单条指令中对值进行掩码与测试。然后还有 CMP 指令，用于比较值。AND 和 CMP 指令的延迟和吞吐量均为 1。因此，在 x86 上几乎没有区别，除非编译器能够重新使用某些特定值并与上下文中的代码混合。性能差异将非常小。

尽管如此，& 和 == 测试的语义确实有很大不同。

使用 if (val & ... 测试来掩码位。
使用 if (val == ... 测试来比较值。

我建议首先从代码可读性的角度来考虑。如果 val 不是位字段，请不要使用 &。

英文:

First of all, we restrict the question by assuming IMPORTANT_BIT == IMPORTANT_VAL and all other bits of val are 0, so that the outcome of both tests is the same.

Short answer: Don't worry about it.

Long answer:

It depends on the target architecture, the value of IMPORTANT_BIT and the context in which the code is executed.

x86 has the AND instruction which conveniently sets the Z flag, and so can be used to mask-and-test a value against 0 in a single instruction. Then there's the CMP instruction, which compares values. Both AND and CMP instructions have the same latency of 1 and throughput 4. So on x86 there would be no difference, unless the compiler can manage to re-use some specific value and blend with the code above/below. The performance difference would be marginal though.

Having said that, the semantics of & and == tests is really different.

if (val & ... test is used to mask bits.
if (val == ... test is used to compare values.

I suggest to approach it from a code readability perspective first of all. If val is not a bit field, do yourself a favor and don't use & against it.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

比特掩码检查是否比比较数字更高效？

问题

答案1

答案2

嵌套的 switch case 中的情况不会运行。

识别特定节点的所有子节点，适用于非常大的数据。

如何调试使用cgo调用的C代码？

如何更高效地创建这样的矩阵？

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。