Arm64 SHA512 intrinsics fail to compile with GCC ("target specific option mismatch")
Question
Consider the following source file test-sha512.c, using SHA512 intrinsics on Arm64 (aarch64):
#include <arm_neon.h>
const uint64_t data[256] = {0,};
void test()
{
    uint64x2_t a = vld1q_u64(data);
    a = vsha512h2q_u64(a, a, a);
}
On Ubuntu 22.10 (a virtual machine on a MacBook M1), with gcc 12.2.0, I get the errors "inlining failed in call to ‘always_inline’ ‘vsha512h2q_u64’" and "target specific option mismatch":
$ gcc -c test-sha512.c -march=armv8-a+sha3
In file included from test-sha512.c:1:
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h: In function ‘test’:
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h:29671:1: error: inlining failed in call to ‘always_inline’ ‘vsha512h2q_u64’: target specific option mismatch
29671 | vsha512h2q_u64 (uint64x2_t __a, uint64x2_t __b, uint64x2_t __c)
      | ^~~~~~~~~~~~~~
test-sha512.c:7:9: note: called from here
    7 |     a = vsha512h2q_u64(a, a, a);
      |         ^~~~~~~~~~~~~~~~~~~~~~~
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h:29671:1: error: inlining failed in call to ‘always_inline’ ‘vsha512h2q_u64’: target specific option mismatch
29671 | vsha512h2q_u64 (uint64x2_t __a, uint64x2_t __b, uint64x2_t __c)
      | ^~~~~~~~~~~~~~
test-sha512.c:7:9: note: called from here
    7 |     a = vsha512h2q_u64(a, a, a);
      |         ^~~~~~~~~~~~~~~~~~~~~~~
With clang 15.0.6, the same file compiles correctly, and a full SHA512 implementation using Arm64 intrinsics works correctly when compiled with clang.
$ clang -c test-sha512.c -march=armv8-a+sha3
Note: the Arm architecture defines distinct features for SHA1, SHA256, SHA512 and SHA3. However, gcc and clang know only crypto, sha2 and sha3. The SHA512 instructions (cryptographically part of SHA2) are activated with sha3. Weird. Anyway...
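To check which of these features a given set of flags actually turns on, one option is to test the ACLE feature macros at compile time. The sketch below is only illustrative: the helper names (crypto_feature_mask, print_crypto_features) and the F_* bit flags are made up for this example, and it assumes the compiler defines __ARM_FEATURE_AES, __ARM_FEATURE_SHA2, __ARM_FEATURE_SHA512 and __ARM_FEATURE_SHA3 as described in ACLE.

```c
#include <stdio.h>

/* Hypothetical bit flags for this example only. */
enum { F_AES = 1, F_SHA2 = 2, F_SHA512 = 4, F_SHA3 = 8 };

/* Build a bit mask of the crypto features enabled for this translation
   unit, based on the ACLE feature-test macros the compiler defines. */
unsigned crypto_feature_mask(void)
{
    unsigned mask = 0;
#if defined(__ARM_FEATURE_AES)
    mask |= F_AES;
#endif
#if defined(__ARM_FEATURE_SHA2)
    mask |= F_SHA2;
#endif
#if defined(__ARM_FEATURE_SHA512)
    mask |= F_SHA512;
#endif
#if defined(__ARM_FEATURE_SHA3)
    mask |= F_SHA3;
#endif
    return mask;
}

/* Print one line per feature so different -march values can be compared. */
void print_crypto_features(void)
{
    unsigned mask = crypto_feature_mask();
    printf("AES:    %s\n", (mask & F_AES)    ? "yes" : "no");
    printf("SHA2:   %s\n", (mask & F_SHA2)   ? "yes" : "no");
    printf("SHA512: %s\n", (mask & F_SHA512) ? "yes" : "no");
    printf("SHA3:   %s\n", (mask & F_SHA3)   ? "yes" : "no");
}
```

Calling print_crypto_features() from a small main and rebuilding with different -march values shows which macros each value defines. The same information can be read without any source file by dumping the predefined macros, for example with gcc -march=armv8-a+sha3 -dM -E -x c /dev/null and searching the output for ARM_FEATURE.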
The similar Arm64 intrinsics for AES, SHA1 and SHA256 compile correctly with gcc. The problem is specific to SHA512.
Other tests, without success, same error:
- The error "target specific option mismatch" may suggest a misinterpretation of the -march option. I tried with all Armv8 options (-march=armv8-a+fp+simd+crypto+crc+lse+fp16+rcpc+rdma+dotprod+aes+sha2+sha3+sm4+fp16fml+sve+profile+rng+memtag+sb+ssbs+predres+sve2+sve2-sm4+sve2-aes+sve2-sha3+sve2-bitperm+tme+i8mm+f32mm+f64mm+bf16+flagm+pauth+ls64+mops) and with -march=armv9-a.
- Using -march=native, considering that the M1 supports SHA512.
- Using -mcpu=neoverse-v1 or -mcpu=neoverse-n2 or other known Arm cores which support SHA512.
- Various types of optimization options.
- Tried various (but not all) suggestions from gcc --target-help.
Is this a known error? I did not find any reference online about this error on Arm64 SHA512 intrinsics.
EDIT: SHA256 intrinsics also fail with the same error. Only SHA1 intrinsics work. My previous tests with SHA256 were made using clang, sorry.
EDIT 2: SHA256 intrinsics work with -march=armv8-a+sha2+crypto but not with -march=armv8-a+sha2. SHA512 intrinsics still don't work, even with all -march options.
Answer 1
Score: 1
The SHA3/SHA512 extension is documented by ARM only for ARMv8.2-A onward. As such, gcc requires you to use -march=armv8.2-a+sha3 (or v8.3-a, etc.).
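As a rough sketch of how the question's file could be made robust to this, the SHA512 call can be guarded with the ACLE macro __ARM_FEATURE_SHA512, so the file still compiles (without the intrinsic) when the flags do not enable the extension. The macro HAVE_SHA512_INTRINSICS and the helper sha512_intrinsics_enabled are hypothetical names introduced here, and the assumption is that the compiler defines __ARM_FEATURE_SHA512 only when the SHA512 instructions are really available.

```c
#include <stdint.h>

/* Enable the SHA512 path only when the compiler reports the feature. */
#if defined(__aarch64__) && defined(__ARM_FEATURE_SHA512)
#include <arm_neon.h>
#define HAVE_SHA512_INTRINSICS 1
#else
#define HAVE_SHA512_INTRINSICS 0
#endif

/* Returns 1 if this translation unit was compiled with the SHA512
   instructions enabled, 0 otherwise. */
int sha512_intrinsics_enabled(void)
{
    return HAVE_SHA512_INTRINSICS;
}

#if HAVE_SHA512_INTRINSICS
static const uint64_t data[256] = {0};

/* Same intrinsic call as in the question; lane 0 is returned so that
   the result is actually used. */
uint64_t test(void)
{
    uint64x2_t a = vld1q_u64(data);
    a = vsha512h2q_u64(a, a, a);
    return vgetq_lane_u64(a, 0);
}
#endif
```

Built with gcc -c test-sha512.c -march=armv8.2-a+sha3, the guarded function should be compiled in. With -march=armv8-a+sha3 it should be left out if gcc does not define the macro for that base architecture, which would avoid the inlining error.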