问题

My goal is to understand and improve the computation of CRC using clmul instructions in RISCV.
The way described in RISCV bitmanip documentation is following:

4.8 Cyclic redundancy checks (CRC)

There are special instructions for performing CRCs using the two most widespread 32-bit CRC
polynomials, CRC-32 and CRC-32C.
CRCs with other polynomials can be computed efficiently using CLMUL. The following examples
RISC-V Bitmanip Extension V0.93 83
are using CRC32Q.
The easiest way of implementing CRC32Q with clmul is using a Barrett reduction. On RV32:

uint32_t crc32q_simple(const uint32_t *data, int length) {
    uint32_t P = 0x814141AB;    // CRC polynomial (implicit x^32)
    uint32_t mu = 0xFEFF7F62;   // x^64 divided by CRC polynomial
    uint32_t mu1 = 0xFF7FBFB1;  // "mu" with leading 1, shifted right by 1 bit
    uint32_t crc = 0;
    for (int i = 0; i < length; i++) {
        crc ^= rev8(data[i]);
        crc = clmulr(crc, mu1);
        crc = clmul(crc, P);
    }
    return crc;
}

I already know how constants mu and mu1 are computed, but I still have questions:

Why can the original internal loop be replaced with clmul intrinsics? This is not stated in documentation. For reference, this is a generic CRC code sample for CRC-32/AIXM:

uint32_t crc32_bitwise_aixm(const uint8_t* data, size_t length) {   
  uint32_t crc = 0;
  for (int i = 0; i < length; i++) {
    uint8_t byte = data[i];
    for (int j = 0; j < 8; j++) {
      if ((crc ^ (byte << 24)) & 0x80000000)
        crc = (crc << 1) ^ 0x814141ab; // Here is the polynomial from which constants mu and mu1 are computed
      else
        crc = crc << 1;
      byte = byte << 1;
    }
  }
  return crc;
}

I know how these constants are computed, but the question is, WHY are they computed and why passing them to clmul makes a working CRC?
What is the 0x80000000 constant in the original CRC loop? I think it has something to do with masking a portion of number byte << 24.

英文:

My goal is to understand and improve the computation of CRC using clmul instructions in RISCV.
The way described in RISCV bitmanip documentation is following:

4.8 Cyclic redundency checks (CRC)

uint32_t crc32q_simple(const uint32_t *data, int length) {
    uint32_t P = 0x814141AB;    // CRC polynomial (implicit x^32)
    uint32_t mu = 0xFEFF7F62;   // x^64 divided by CRC polynomial
    uint32_t mu1 = 0xFF7FBFB1;  // &quot;mu&quot; with leading 1, shifted right by 1 bit
    uint32_t crc = 0;
    for (int i = 0; i &lt; length; i++) {
        crc ^= rev8(data[i]);
        crc = clmulr(crc, mu1);
        crc = clmul(crc, P);
    }
    return crc;
}

I already know how constants mu and mu1 are computed, but I still have questions:

Why can original internal loop be replaced with clmul intrinsics? This is not stated in documentation. For reference, this is a generic CRC code sample for CRC-32/AIXM:

uint32_t crc32_bitwise_aixm(const uint8_t* data, size_t length) {   
  uint32_t crc = 0;
  for (int i = 0; i &lt; length; i++) {
    uint8_t byte = data[i];
    for (int j = 0; j &lt; 8; j++) {
      if ((crc ^ (byte &lt;&lt; 24)) &amp; 0x80000000)
        crc = (crc &lt;&lt; 1) ^ 0x814141ab; // Here is the polynomial from which constants mu and mu1 are computed
      else
        crc = crc &lt;&lt; 1;
      byte = byte &lt;&lt; 1;
    }
  }
  return crc;
}

I know how this constants are computed, but the question is, WHY are they computed and why passing them to clmul makes working CRC?
What is 0x80000000 constant in original CRC loop? I think it has something to do with masking portion of number byte << 24.

答案1

得分: 1

有太多内容需要在这里解释。由于您甚至不理解0x80000000，您首先需要了解至少循环冗余校验的基本数学以及它们的实际实现。我建议参考Ross William的指南。仅回答您特定问题的简单答案是，它实现了一个线性反馈移位寄存器，向上移位，并通过异或掩码读取顶部位以反馈到其他位。但是，您还需要了解这样的移位寄存器是如何在GF(2)上实现多项式除法的，其中CRC是此类除法的余数。

一旦您理解了所有这些，那么您就准备好了解如何使用乘法而不是除法来执行CRC。您已经提到了这个方法，称为Barrett's Reduction，最初是用于整数除法，但在这里应用于多项式除法。阅读那篇文章。总的来说，您正在将多项式除法替换为多项式的倒数乘法。您的mu是倒数。无进位乘法是多项式乘法，其中寄存器中的每个位都是GF(2)上多项式的系数。

英文:

There is too much to explain in an answer here. Since you don't even understand the 0x80000000, you need to first understand at least the basic mathematics of cyclic redundancy check and importantly their practical implementations. I recommend Ross William's guide. The simple answer for just your specific question is that it is implementing a linear feedback shift register that shifts up, and is reading the top bit to feed back into the other bits using an exclusive-or mask. However you also need to understand how it is that such a shift register implements a polynomial division over GF(2), where a CRC is the remainder of such a division.

Once you understand all that, then you're ready for how to do a CRC using multiplication instead of division. You already named the approach, called Barrett's Reduction, originally for integer division but applied here to polynomial division. Read that post. In gross terms, you are replacing division by a polynomial with multiplication by the reciprocal of that polynomial. Your mu is the reciprocal. Carry-less multiplication is polynomial multiplication, where each bit in a register is the coefficient of a polynomial over GF(2).

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何使用 ‘clmul’ 内置函数来改善在 RISCV 中计算 CRC 的过程？

问题

答案1

内存限制超出子数组问题

Sympy yields `TypeError: unsupported operand type(s) for *: 'interval' and 'complex'` for complex rational expression

快速切换位序的方法是什么？

使用条件运算符进行评估后打印结果

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论