C - Pointer offset unexpected

Question

I have a pointer to an array, and I want to use functions like memcpy with a specific offset. However, when I offset the pointer, the address changes by more than the offset, and I don't understand why. Can someone explain what is going on here?

#include <stdio.h>
#include <stdint.h>

int main()
{
    uint8_t *source[5];

    // Initial
    printf("%p\n", (void *)source); // 786796896

    // Offset by 2
    printf("%p\n", (void *)(source + 2)); // 786796912 (unexpected, more than 2)
}

Answer 1

Score: 0

The issue is that when you add 2 to source, the array decays to a pointer of type uint8_t **. Pointer arithmetic is scaled by the size of the pointed-to type, so the offset counts elements, not bytes. The byte offset of source + 2 from source is therefore 2 * sizeof(*source), which is 16 on a platform with 8-byte pointers.

To get a byte offset instead, cast source to char *, perform the addition, then cast back. Be warned, however, that a pointer produced this way may not be correctly aligned for the original type, and an unaligned access can be bad news.

Answer 2

Score: 0

Pointer arithmetic should be avoided where you can. For the code above, index the array and take the element's address instead:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t *source[5]; // array of 5 pointers of type uint8_t *

    printf("%p\n", (void *)&source[2]); // address of the 3rd element of source
}

An important point: adding 2 to an array of pointers such as source does not advance the address by 2 bytes but by 0x10 (16) bytes, because the array decays to a pointer to pointer (char ** below) and each step is sizeof(char *) bytes, not 1 byte as it would be for a char *.

// without casting (needs <stdlib.h> for malloc)
char *arr[5];
char *parr = malloc(sizeof(int));

printf("%p\t%p\n", (void *)arr, (void *)parr);
printf("%p\t%p\n", (void *)(arr + 2), (void *)(parr + 2));

0x7ffde2925fb0  0x55b519f252a0
 +0x10           +2
0x7ffde2925fc0  0x55b519f252a2

// with casting (char * used here: arithmetic on void * is a GNU extension)
char *arr[5];
char *parr = malloc(sizeof(int));

printf("%p\t%p\n", (void *)arr, (void *)parr);
printf("%p\t%p\n", (void *)((char *)arr + 2), (void *)(parr + 2));

0x7ffde2925fb0  0x55b519f252a0
 +2              +2
0x7ffde2925fb2  0x55b519f252a2

Answer 3

Score: 0

> I have a pointer to an array

    uint8_t *source[5];

Sorry, but this is not a pointer to an array. It is an uninitialized array of 5 pointers to uint8_t, so you should initialize it before using its elements.

This is a pointer to an array:

    uint8_t (*source)[5];

(The [] operator binds more tightly than * in declarators, hence the parentheses.)

> ...what is going on here?

Well, since the assumption is wrong, the result is bound to differ from what you expected.

    // Initial
    printf("%p\n", (void *)source); // 786796896

    // Offset by 2
    printf("%p\n", (void *)(source + 2)); // 786796912 (unexpected, more than 2)

source + 2 is the address of the third element of source, which is an array of pointers, so it lies two pointers past source itself (16 bytes on 64-bit architectures, which is the value you got, and 8 bytes on 32-bit architectures).

Just try the proposed declaration, and the offset will match the type you intended:

$ a.out
0x560bee9df060
0x560bee9df06a
$ _

(10 bytes, as expected for two arrays of 5 uint8_t elements)

But you can check this yourself.

If you apply the sizeof operator to source itself, you get the size of the array (5 pointers, or 40 bytes on a 64-bit architecture), whereas if it were a pointer to an array, as your question states, you would get the size of a single pointer (8 bytes on a 64-bit architecture). With the proposed declaration you get 8, as expected, and sizeof *source (the size of the object source points to) gives 5, the size of an array of five uint8_t values. Try it and see.


A final note about the alignment discussion in the comments on the question.

  1. The declared object is an array of pointers, not a pointer to an array of five uint8_t, so the value passed to printf is its address, which is aligned and valid (it is the address on the stack where the variable source was created). Using that address incurs no U.B., because the array contents themselves are never read; they simply remain uninitialized.
  2. If source is declared correctly as a pointer to an array, its uninitialized value is printed. The only unpredictable thing is the value itself; printing it does not require dereferencing the pointer or any particular alignment (which is not an issue, see the next point). Pointer arithmetic should be avoided on an uninitialized pointer variable, though, because there is no guarantee that either the original value or the result is a valid address, so it is U.B., although normally harmless in practice. The (void *) cast by itself normally causes no U.B.
  3. If source is declared as a pointer to an array, there is again no alignment problem: the pointed-to type is an array of uint8_t, which has an alignment of 1, so any address is acceptable. As the pointer is not dereferenced, it behaves like any other uninitialized variable and so causes no U.B.

huangapple
  • Published on May 17, 2023, 06:54:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76267570.html