__m128i initializers and _mm_madd_epi16: What is the result?

Question

I tried the following code:

__m128i x = { 1,2,3,4,5,6,7,8 };
__m128i y = { 10,20,30,40,50,60,70,80};

__m128i z = _mm_madd_epi16(x, y);

The result is:
z = {6244, 201, -17692, 1006, 0,0,0,0}

But the first element should be 1*10 + 2*20 = 50.

Can you please explain the result I got?

Answer 1

Score: 3

The problem is in the initial values.

__m128i x = { 1,2,3,4,5,6,7,8 };

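Those __m128i initializers don't do what you think they do.
To begin with, they don't even compile on most compilers; MSVC is the exception. MSVC defines __m128i as a union whose first member is an array of 16 bytes, and a brace initializer for a union fills its first member, so each value lands in one byte. What happened here is equivalent to: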

__m128i x = _mm_setr_epi8( 1,2,3,4,5,6,7,8, 0,0,0,0,0,0,0,0 );

Which isn't what you meant.

The fix is simple: use the proper set intrinsic, in this case _mm_setr_epi16 for 16-bit elements. (Or _mm_set_epi16 if you prefer to list the highest element first, on the left, which is the side a left shift moves bits towards.)

Remember that C initializer lists for structs / unions / arrays can have fewer elements than the object being initialized, with the rest being implicit zeros. So the number of explicit elements can't imply which element width you meant. The intrinsics API uses _mm_set intrinsics instead of bare initializer lists because the same type can hold different numbers of elements.

You can check the elements of a __m128i with a debugger.

Posted by huangapple on 2023-06-01 06:12:31
When reposting, please keep the link to this article: https://go.coder-hub.com/76377614.html