Typecast float32 to int16 using ARM NEON intrinsics
Question
I'm new to ARM NEON intrinsics, and I would like to scale a float32 array by a scalar (2^13 = 8192) and convert it to an int16_t array.
I believe I need to perform the steps below:
- Load the float buffer array
- Multiply with the scalar (2^13 = 8192)
- Convert them to 32-bit integers
- Convert 32-bit to 16-bit integers
- Store them into 16-bit buffer
Could you please check and correct the code below:
// convert float to int16
uint32_t blk_cnt;
float32x4_t f32x4;
int32x4_t i32x4;
int16x4_t i16x4;
float32_t scale = 8192.0;
/* Compute 4 complex samples at a time */
blk_cnt = sz >> 2U;
while (blk_cnt > 0U) {
    f32x4 = vld1q_f32 ((float32_t *) inpout);
    f32x4 = vmulq_n_f32(f32x4, scale);
    i32x4 = vcvtq_s32_f32 (f32x4);
    i16x4 = vmovn_s32 (i32x4);
    vst1_s16 (out, i16x4);
    /* Increment pointers */
    out += 4;
    inpout += 4;
    /* Decrement the loop counter */
    blk_cnt--;
}
Answer 1
Score: 1
You are most probably dealing with q13 (13 fraction bits) fixed-point values.
You just need to convert the floats to q13 int32 (vcvt_n_s32_f32), then narrow them to int16 with vqmovn.
Link: https://developer.arm.com/documentation/dui0473/m/vfp-instructions/vcvt--between-floating-point-and-fixed-point-
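The conversion described above can be sketched roughly as follows. This is only an illustration under a few assumptions, not a drop-in fix. The function names float_to_q13 and float_to_q13_scalar are made up for the example. The four-lane variants vcvtq_n_s32_f32 and vqmovn_s32 are used because the question loads four floats at a time (vcvt_n_s32_f32 is the two-lane form). A scalar path handles leftover elements when the length is not a multiple of 4, and it also lets the file build on targets without NEON. The practical difference from the code in the question is that vqmovn_s32 saturates out-of-range values to the int16 limits, whereas vmovn_s32 just keeps the low 16 bits, so large inputs wrap around.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Scalar reference (hypothetical helper): scale by 2^13, truncate toward
 * zero, and saturate to the int16 range. NaN maps to 0. */
static int16_t float_to_q13_scalar(float x)
{
    float scaled = x * 8192.0f;          /* 2^13, exact for a power of two */
    if (scaled != scaled)    return 0;   /* NaN */
    if (scaled >= 32767.0f)  return INT16_MAX;
    if (scaled <= -32768.0f) return INT16_MIN;
    return (int16_t)scaled;              /* float-to-int truncates toward zero */
}

/* Convert n floats to q13 int16 values (hypothetical name). */
void float_to_q13(const float *in, int16_t *out, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Process four samples per iteration. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t f = vld1q_f32(in + i);     /* load 4 floats */
        int32x4_t   q = vcvtq_n_s32_f32(f, 13); /* float -> q13 int32 */
        int16x4_t   h = vqmovn_s32(q);          /* saturating narrow to int16 */
        vst1_s16(out + i, h);                   /* store 4 int16 values */
    }
#endif

    /* Remaining samples, or all of them when NEON is unavailable. */
    for (; i < n; ++i) {
        out[i] = float_to_q13_scalar(in[i]);
    }
}
```

Calling float_to_q13(in, out, sz) with the buffers from the question would replace the whole loop. The fixed-point form of the conversion folds the multiplication by 8192 into the convert step, so no separate vmulq_n_f32 is needed.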