在C语言中理解半动态数组的声明

huangapple go评论66阅读模式
英文:

Understanding declaration of semi-dynamic array in C

问题

我们研究了2D数组和右左规则以理解复杂的声明。

我有这个半动态数组的声明:

int * array[2];

根据右左规则,我将其理解为:array 是一个包含2个指向int的指针的数组。因此,当我绘制图表时,我得到了这个:
在C语言中理解半动态数组的声明

但在课堂上,他们向我们展示了以下图表:

在C语言中理解半动态数组的声明

我错在哪里?在涉及到2D数组的声明时,我如何不再犯错误?

英文:

We studied 2D arrays and right-left rule to understand complex declaration.

I have this declaration of semi-dynamic array:

int * array[2];

With the right-left rule I understand it as: array is an array of 2 pointers to int. And therefore when I draw a diagram I get this:
在C语言中理解半动态数组的声明

But in class they showed us the following diagram:

在C语言中理解半动态数组的声明

Where I'm wrong? How can I not be mistaken again when it comes to declaration of 2D arrays?

答案1

得分: 3

除了 @chqrlie 的回答外,请注意 "半动态" 的第二张图片相当令人困惑,而您顶部的图片在某种程度上更正确(无论如何都不是错误的) - 只需注意您需要执行 array[n] 来访问指针,但要首先获取正确的指针,然后解引用指针并访问实际项,即 *(array[n])*(array[n]) 等同于 array[n][0]

int* array[2] 是一个包含2个指针的数组。这个数组在堆栈上分配。每个指针可以分配为指向单个 int 项。如果您愿意,这个 int 可以在堆上分配。它可以是一个单独的项,也可以是 int 数组中的第一个 int,在这种情况下,您需要以某种方式跟踪这些数组的大小。

因此,在 int* array[2] 的情况下,array[0][1] 很可能是越界访问。或者至少访问该项与访问 array[0][42] 一样正确或不正确 - 这取决于上下文。

"半动态" 的图片更正确地说明了一些奇怪的类型,比如 int (*arr[2])[2]; - "一个指向大小为2的int数组的指针数组"。对于大多数目的来说,这不是一个非常有用的类型,而且很难阅读。

总的来说,我们应该始终尽量在可能的情况下使用真正的二维数组,甚至在堆上。唯一需要使用指针数组或指向指针数组的指针的情况是当您有一个专门的容器,它实际上不是多维数组,而是具有各自维度大小的内容(有时被称为 "不规则数组")。一个例子是在堆上分配的字符串数组 - 使用 char**malloc 来实现这一点是完全有道理的。

指针数组/指针查找表等的缺点是,我们得到了分段分配,这比真正的多维数组在访问和分配方面要慢得多。而且在某些情况下过于复杂。请查看 https://stackoverflow.com/questions/42094465/correctly-allocating-multi-dimensional-arrays 了解详情。

英文:

In addition to @chqrlie's answer, please note that the second picture of "semi-dynamic" is rather confusing and your picture at the top is arguably more correct (and in any case not wrong) - just note that you'd have to do array[n] to access the pointer but *(array[n]) to first get the correct pointer, then de-reference the pointer and access the actual item. *(array[n]) being equivalent to array[n][0].

int* array[2] is an array of 2 pointers. This array is allocated on the stack. Each of the pointers may be assigned to point at a single int item. This int may be allocated on the heap if you like. It could be a single item or it could be the first int in an array of int, in which case you would need to keep track of those array sizes somehow.

So in case of int* array[2] then array[0][1] may very well be an out-of-bounds access. Or at least it is not any more or less correct to access that item than say array[0][42] - it depends on context.

The picture of "semi dynamic" more correctly illustrates some strange abomination type such as

int (*arr[2])[2]; - "An array of pointers to arrays of int with size 2". Not a very helpful type for most purposes and kind of ridiculously hard to read.

In general we should always strive to use true 2D arrays whenever possible, even on the heap. The only time you would use an array of pointers or a pointer to an array of pointers, is when you have a specialized container which isn't really a multi-dimensional array, but something with individual dimension sizes (sometimes called "jagged array"). One example is an array of strings allocated on the heap - it makes perfect sense to use char** and malloc for implementing that.

The drawback of arrays of pointers/pointer look-up tables etc is that we get segmented allocation, which is much slower to access and allocate than real multi-dimensional arrays. And also needlessly complex in certain situations. Check out https://stackoverflow.com/questions/42094465/correctly-allocating-multi-dimensional-arrays for details.

答案2

得分: 2

你的理解是正确的,并且对应于在栈上定义的指向int的指针数组(自动存储),在类图中被描述为半动态数组。

在课堂上显示的图表描述了另外2种具有不同布局的数组类型:

  • 一个真正的2D数组,即:2个int数组的数组。对于这个数组不需要进一步的分配,两个维度在编译时固定。
  • 一个完全动态数组,其中你将array定义为指向int指针的指针,int **array;。要使用这种类型的存储,你需要分配指针数组和每个指针指向的数组。两个维度可以在运行时确定,但访问稍微昂贵一些,因为需要两次间接访问int值。根据实际的实现,编译器可能能够优化多次连续的访问,消除一些冗余代码。

请注意,上述对象具有不同的布局,虽然你可以使用相同的语法array[row][col]来访问元素,但你不能将这些数组作为参数传递给相同的函数。

英文:

Your understanding is correct, and does correspond to an array of pointers to int defined with automatic storage (on the stack), described in the class diagram as a semi-dynamic array.

The diagram shown in class describes 2 other types of arrays with different layouts:

  • a true 2D array, ie: an array of 2 arrays of 2 int. No further allocation is needed for this one and both dimensions are fixed at compile time.
  • a fully dynamic array, where you define array as a pointer to pointers to ints, int **array;. To use this type of storage, you need to allocate the array of pointers and the arrays each pointer points to. Both dimensions can be determined at runtime, but access is somewhat more expensive as 2 indirections are required to access the int values. Depending on the actual impementations, the compilers may be able to optimize several consecutive accesses, factoring out some redundant code.

Note that the above objects have a different layout and while you can access the elements using the same syntax array[row][col] for both reading and writing, you cannot pass these arrays as arguments to the same functions.

答案3

得分: 1

你被一些不寻常的术语误导了。一个2D数组始终被声明为

T arr[M][N];  // 对于某个任意类型T和大小表达式M和N
              // M和N

这个声明可以解读为“arr是一个M元素数组,每个元素都是N元素的T数组”。

你的“半动态”数组是一个1D数组,指向指针

int *arr[M];

每个元素可能指向另一个数组的第一个元素(无论是动态分配的还是不是)

for ( size_t i = 0; i < M; i++ )
  arr[i] = malloc( sizeof *arr[i] * N ); // 为N个整数分配空间

而你的“完全动态”数组

int **arr;

根本不是一个数组。它只是一个指针。它可能指向指针数组的第一个元素(动态分配的或不是),其中的每个元素可能指向数组的第一个元素(动态分配的或不是):

arr = malloc( sizeof *arr * M ); // 为M个指向int的指针分配空间
for ( size_t i = 0; i < M; i++ )
  arr[i] = malloc( sizeof *arr[i] * N );  // 为N个整数分配空间。

假设MN都是2,你的2D数组将布局如下

     +---+
arr: |   | arr[0][0]
     +---+
     |   | arr[0][1]
     +---+
     |   | arr[1][0]
     +---+
     |   | arr[1][1]
     +---+

而你的1D指针数组将布局如下

     +---+                                +---+
arr: |   | arr[0] ----------------------> |   | arr[0][0]
     +---+                                +---+
     |   | arr[1] -----------+            |   | arr[0][1]
     +---+                   |            +---+
                             |  
                             |            +---+
                             +----------> |   | arr[1][0]
                                          +---+
                                          |   | arr[1][1]
                                          +---+

而你的指向指针的版本将如下所示

     +---+     +---+                       +---+
arr: |   | --> |   | arr[0] -------------> |   | arr[0][0]
     +---+     +---+                       +---+
               |   | arr[1] -------+       |   | arr[0][1]
               +---+               |       +---+
                                   |
                                   |       +---+
                                   +-----> |   | arr[1][0]
                                           +---+
                                           |   | arr[1][1]
                                           +---+

再次强调,你的指针不一定要指向动态分配的内存。以下也是有效的(尽管我目前没有一个很好的用例):

int x[2];
int y[4];

int *arr[2] = {x, y};

这将给我们带来以下布局

     +---+                           +---+ 
arr: |   | arr[0] --------------> x: |   | x[0] (arr[0][0])
     +---+                           +---+
     |   | arr[1] -------+           |   | x[1] (arr[0][1])
     +---+               |           +---+
                         |
                         |           +---+
                         +-------> y: |   | y[0] (arr[1][0])
                                     +---+
                                     |   | y[1] (arr[1][1])
                                     +---+
                                     |   | y[2] (arr[1][2])
                                     +---+
                                     |   | y[3] (arr[1][3])
                                     +---+

xy不是动态分配的;它们与arr具有相同的存储类别。

英文:

You're being led astray by some unconventional terminology. A 2D array is always declared as

T arr[M][N];  // for some arbitrary type T and size expressions
              // M and N

and the declaration can be read as "arr is an M-element array of N-element arrays of T".

Your "semi-dynamic" array is a 1D array of pointers:

int *arr[M];

each element may point to the first element of another array (whether dynamically allocated or not)

for ( size_t i = 0; i < M; i++ )
  arr[i] = malloc( sizeof *arr[i] * N ); // allocates space for N ints

And your "fully dynamic" array

int **arr;

isn't an array at all. It's just a pointer. It may point to the first element of an array of pointers (dynamically allocated or not), each element of which may point to the first element of an array (dynamically allocated or not):

arr = malloc( sizeof *arr * M ); // Allocates space for M pointers to int
for ( size_t i = 0; i < M; i++ )
  arr[i] = malloc( sizeof *arr[i] * N );  // Allocates space for N ints.  

Assuming M and N are 2, your 2D array will be laid out as

     +---+
arr: |   | arr[0][0]
     +---+
     |   | arr[0][1]
     +---+
     |   | arr[1][0]
     +---+
     |   | arr[1][1]
     +---+

whereas your 1D array of pointers will be laid out as

     +---+                                +---+
arr: |   | arr[0] ----------------------> |   | arr[0][0]
     +---+                                +---+
     |   | arr[1] -----------+            |   | arr[0][1]
     +---+                   |            +---+
                             |  
                             |            +---+
                             +----------> |   | arr[1][0]
                                          +---+
                                          |   | arr[1][1]
                                          +---+

and your pointer-to-pointer version will look like

     +---+     +---+                       +---+
arr: |   | --> |   | arr[0] -------------> |   | arr[0][0]
     +---+     +---+                       +---+
               |   | arr[1] -------+       |   | arr[0][1]
               +---+               |       +---+
                                   |
                                   |       +---+
                                   +-----> |   | arr[1][0]
                                           +---+
                                           |   | arr[1][1]
                                           +---+

Again, there's no rule that your pointers have to point to dynamically-allocated memory. The following is also valid (although I don't have a good use case for it at the moment):

int x[2];
int y[4];

int *arr[2] = {x, y};

which gives us

     +---+                           +---+ 
arr: |   | arr[0] --------------> x: |   | x[0] (arr[0][0])
     +---+                           +---+
     |   | arr[1] -------+           |   | x[1] (arr[0][1])
     +---+               |           +---+
                         |
                         |           +---+
                         +------> y: |   | y[0] (arr[1][0])
                                     +---+
                                     |   | y[1] (arr[1][1])
                                     +---+
                                     |   | y[2] (arr[1][2])
                                     +---+
                                     |   | y[3] (arr[1][3])
                                     +---+

x and y are not dynamically allocated; they're the same storage class as arr.

huangapple
  • 本文由 发表于 2023年6月26日 15:47:33
  • 转载请务必保留本文链接:https://go.coder-hub.com/76554573.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定