June 1, 2023 12:27:20

The fastest way to convert a UInt64 hex string to a UInt32 value preserving as many leading digits as possible, i.e. truncation

Question

I'm looking for the fastest way to parse a hex string representing a ulong into a uint keeping as many leading digits as a uint can handle and discarding the rest. For example,

string hex = "0xab54a9a1df8a0edb"; // 12345678991234567899
Should output: uint result = 1234567899;

I can do this by simply parsing the hex into a ulong, getting the digits using ToString and then just taking as many of them as would fit into uint without overflowing but I need something much faster. Thanks. C# code preferred but any would do.
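
For reference, the string-based approach described above might look roughly like the sketch below. The class and method names are hypothetical, and it assumes the input is an optional "0x" prefix followed by up to 16 hex digits. It is only a baseline to compare faster versions against.

```csharp
using System;
using System.Globalization;

public static class LeadingDigitsBaseline
{
    // Hypothetical helper: parse the hex text, print it in decimal, then keep
    // the longest prefix of decimal digits that still fits in a uint.
    public static uint TruncateViaString(string hex)
    {
        // NumberStyles.HexNumber does not accept a "0x" prefix, so strip it.
        string digits = hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase)
            ? hex.Substring(2)
            : hex;
        ulong value = ulong.Parse(digits, NumberStyles.HexNumber, CultureInfo.InvariantCulture);

        string dec = value.ToString(CultureInfo.InvariantCulture);
        if (dec.Length <= 9)
            return (uint)value; // any value with at most 9 digits fits in a uint

        // Try the first 10 digits; fall back to 9 if that exceeds uint.MaxValue.
        ulong ten = ulong.Parse(dec.Substring(0, 10), CultureInfo.InvariantCulture);
        return ten <= uint.MaxValue
            ? (uint)ten
            : uint.Parse(dec.Substring(0, 9), CultureInfo.InvariantCulture);
    }
}
```

With the example input, `LeadingDigitsBaseline.TruncateViaString("0xab54a9a1df8a0edb")` returns 1234567899.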

Answer 1

Score: 2

For decimal truncation, all the high bits of the hex digits affect the low 9 or 10 decimal digits, so you need to convert the whole thing. https://stackoverflow.com/questions/67054154/is-there-an-algorithm-to-convert-massive-hex-string-to-bytes-stream-quickly-asm/67169220#67169220 has C++ with SSE intrinsics. I commented there with some possible improvements to that, and to https://github.com/zbjornson/fast-hex . This could be especially good if you're using SIMD to find numeric literals in larger buffers, so you might have the hex string in a SIMD register already. (Not sure if SIMDJSON does that.)

Hex-string to 64-bit integer is something SIMD certainly can speed up: map each digit to a 0-15 integer, combine pairs of bytes to pack nibbles (e.g. with x86 pmaddubsw), then shuffle those 8-bit chunks to the bottom of a register (e.g. with packuswb or pshufb). x86 at least has an efficient SIMD to GP-integer move, movq rax, xmm0, although the ARM equivalent is slow on some ARM CPUs.

(Getting a speedup from SIMD for ASCII hex -> uint is much easier if your strings are fixed-length, and probably if you don't need to check for invalid characters that aren't hex digits.)
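
As a point of comparison, here is a minimal scalar sketch of the same steps (map each digit to 0-15, then pack the nibbles). It is not SIMD, and the names `HexParseSketch`, `Nibble` and `ParseHex64` are made up for illustration. It assumes an optional "0x" prefix and 1 to 16 hex digits, and could serve as a reference to check a vectorized version against.

```csharp
using System;

public static class HexParseSketch
{
    // Maps one ASCII hex digit to its value 0..15, or -1 if it is not a hex digit.
    private static int Nibble(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        c = (char)(c | 0x20); // fold 'A'..'F' onto 'a'..'f'
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    }

    public static ulong ParseHex64(string hex)
    {
        int start = hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase) ? 2 : 0;
        int len = hex.Length - start;
        if (len < 1 || len > 16)
            throw new FormatException("expected 1 to 16 hex digits");

        ulong result = 0;
        for (int i = start; i < hex.Length; i++)
        {
            int n = Nibble(hex[i]);
            if (n < 0)
                throw new FormatException("invalid hex digit");
            result = (result << 4) | (uint)n; // shift in one nibble at a time
        }
        return result;
    }
}
```

The validity check is the part a fixed-length, trusted-input version could drop.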

Decimal truncation of `u64` (C# `ulong`) to fit in `u32` (C# `uint`)

Modulo by a power of 10 truncates to some number of decimal digits: x % 10^n keeps the low n decimal digits of x.

(uint)(x % 10000000000) works for some numbers, but 10000000000 (1e10 = one followed by 10 zeros) is larger than 2^32-1. Consider an input like 0x2540be3ff (9999999999). We'd get (uint)9999999999 producing 1410065407 = 0x540be3ff (keeping the low 32 bits of that 34-bit number.)

So perhaps try modulo 1e10, but if it's too big for u32 then modulo 1e9.

  ulong tendigit = x % 10000000000UL;  // 1e10
  uint truncated = tendigit <= uint.MaxValue ? (uint)tendigit : (uint)(x % 1000000000UL);  // % 1e9 keeps 9 decimal digits

If this isn't correct C# syntax or the literals need some decoration to make them ulong (like C 10000000000uLL for good measure), please let me know.
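
Wrapped up as a function, the idea might look like the sketch below. The class, method and constant names are mine, not from any library. The `UL` suffix makes the literals `ulong`, and the explicit `(uint)` casts are needed because C# does not narrow `ulong` to `uint` implicitly.

```csharp
public static class DecimalTruncation
{
    private const ulong Pow10_10 = 10000000000UL; // 1e10
    private const ulong Pow10_9 = 1000000000UL;   // 1e9

    // Keeps the low 10 decimal digits of x if they fit in a uint,
    // otherwise the low 9 decimal digits (which always fit).
    public static uint TruncateDecimal(ulong x)
    {
        ulong tenDigit = x % Pow10_10;
        return tenDigit <= uint.MaxValue
            ? (uint)tenDigit
            : (uint)(x % Pow10_9);
    }
}
```

For example, `TruncateDecimal(9999999999UL)` returns 999999999 rather than the wrapped value 1410065407, and `TruncateDecimal(12345678991234567899UL)` returns 1234567899.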

It's probably at least as efficient to just take the original number modulo two different constants as to try to get the leading decimal digit of x % 1e10 and subtract it or whatever. The asm is going to need two 64-bit multiplicative-inverse constants, and starting from the original number again keeps critical-path latency shorter for out-of-order exec if branch prediction predicts that it needs to calculate the nine-digit truncation.
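
For comparison, the alternative that works from x % 1e10 instead of from x could be sketched like this (hypothetical names again). It gives the same result, because 1e9 divides 1e10, but the nine-digit path depends on the first modulo having finished.

```csharp
public static class DependentChainVariant
{
    private const ulong Pow10_10 = 10000000000UL; // 1e10
    private const ulong Pow10_9 = 1000000000UL;   // 1e9

    // Same result as taking x % 1e9 directly, but computed from x % 1e10.
    public static uint TruncateDecimalDependent(ulong x)
    {
        ulong tenDigit = x % Pow10_10;
        if (tenDigit <= uint.MaxValue)
            return (uint)tenDigit;

        // tenDigit has exactly 10 digits here; strip its leading decimal digit (4..9).
        ulong leading = tenDigit / Pow10_9;
        return (uint)(tenDigit - leading * Pow10_9);
    }
}
```

Whether this is actually slower depends on the compiler and CPU, so treat it as something to benchmark rather than a conclusion.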

Binary truncation

@Matthew Whited deleted his answer (due to a bug in the decimal truncation part), but his binary truncation part based on substrings of the original hex input could perhaps be more efficient in some cases than doing the full conversion and then casting to a narrower type or masking with AND.

> If you want the last 8 bytes of the hex string
>
> uint.Parse(hex[^8..],NumberStyles.HexNumber)
>
> If you want the first 8 bytes
>
> uint.Parse(hex[2..10], NumberStyles.HexNumber);
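
A small sketch of those two substring parses, assuming the input is exactly "0x" followed by 16 hex digits. Each one takes 8 hex digits, i.e. 32 bits. The helper names are hypothetical, and `Substring` is used instead of range syntax only so it also compiles on older C# versions.

```csharp
using System;
using System.Globalization;

public static class BinaryTruncation
{
    // Low 32 bits: the last 8 hex digits, same as hex[^8..].
    public static uint Low32(string hex) =>
        uint.Parse(hex.Substring(hex.Length - 8), NumberStyles.HexNumber, CultureInfo.InvariantCulture);

    // High 32 bits: the first 8 hex digits after "0x", same as hex[2..10].
    public static uint High32(string hex) =>
        uint.Parse(hex.Substring(2, 8), NumberStyles.HexNumber, CultureInfo.InvariantCulture);
}
```

For "0xab54a9a1df8a0edb", `Low32` gives 0xdf8a0edb (the same as casting the full `ulong` to `uint`) and `High32` gives 0xab54a9a1 (the same as shifting right by 32 first).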
