Why does my C function return a string variable as a pointer instead of a value?
Question
In C, I have the following function:
//Get Readability: Coleman-Liau Index Formula: ((0.0588 * Average # of letters per 100 words) - (0.296 * Average # of sentences per 100 words) - 15.8)
string get_readability(float avg_letters, float avg_sentences)
{
    //Store as int so that it rounds the resulting index number
    int index = (0.0588 * avg_letters) - (0.296 * avg_sentences) - 15.8;
    //printf("Index: %i\n", index);//debug
    //Return the grade level
    if (index < 1)
    {
        return "Before Grade 1";
    }
    else if (index > 16)
    {
        return "Grade 16+";
    }
    else
    {
        //Create char array for result in order to concat the index number
        char result[8] = "Grade ";
        char grade[2];
        sprintf(grade, "%i", index);//Convert int to string
        strcat(result, grade);
        //Convert char array to string in order to return results
        string result_str = result;
        return result_str;
    }
}
When I call it and store it in a variable it's correct and I can print it right away with no issues:
//Get reading level
//string reading_level = get_readability(letter_avg, sentence_avg);
string reading_level = get_readability(464.29f, 28.57f);
//printf("%i letters\n%i words\n%i sentences\n%f letter avg\n%f sentence avg\n", letter_count, word_count, sentence_count, letter_avg, sentence_avg);//debug
printf("%s\n", reading_level);
But if I do multiple printf statements between setting the variable and printing it, then the variable changes to "" and just prints a blank line.
//Get reading level
//string reading_level = get_readability(letter_avg, sentence_avg);
string reading_level = get_readability(464.29f, 28.57f);
//printf("%i letters\n%i words\n%i sentences\n%f letter avg\n%f sentence avg\n", letter_count, word_count, sentence_count, letter_avg, sentence_avg);//debug
printf("\n");
printf("\n");
printf("\n");
printf("%s\n", reading_level);
Looking this up, it seems like the issue might be that when the function returns `result_str`, it's returning a pointer to that variable, and after calling printf a couple of times, it's getting overwritten. But I don't really understand how that would happen, as I'm not passing anything with `*`.
Answer 1
Score: 1
Here's your problem:
char result[8] = "Grade ";
...
string result_str = result;
return result_str;
First, the `string` type is unique to CS50; it's not part of standard C. C does not have an actual string data type.
Second, the CS50 `string` type is not an actual string data type, either - it's a typedef name (alias) for a pointer type:
typedef char *string;
A string in C is simply a sequence of character values including a zero-valued terminator - the string `"hello"` is represented as the sequence `{'h', 'e', 'l', 'l', 'o', 0}`. Strings are stored in arrays of character type, but character arrays can also store sequences that are not strings (they either don't have a 0-valued terminator, or have multiple 0-valued bytes).
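To make the string-versus-array distinction concrete, here is a minimal sketch. The names `greeting`, `not_a_string`, and `has_terminator` are made up for illustration and are not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* "hello" occupies six bytes: five letters plus the zero terminator. */
static const char greeting[] = "hello";

/* A character array that is NOT a string: exactly five bytes, no terminator.
 * Passing it to strlen or printf("%s") would read past the end. */
static const char not_a_string[5] = { 'h', 'e', 'l', 'l', 'o' };

/* Hypothetical helper: returns 1 if the first n bytes of buf contain a
 * zero-valued byte, i.e. if buf holds a string within those n bytes. */
int has_terminator(const char *buf, size_t n)
{
    return memchr(buf, 0, n) != NULL;
}
```

Here `sizeof greeting` is 6 while `strlen(greeting)` is 5, and `has_terminator` reports a terminator only for `greeting`.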
Unless it is the operand of the `sizeof`, `_Alignof`, or unary `&` operators, or is a string literal used to initialize a character array in a declaration, an expression of type "N-element array of `T`" will be converted, or "decay", to an expression of type "pointer to `T`" and the value of the expression will be the address of the first element of the array. When you pass an array expression as an argument to a function, like
char arr[N];
foo( arr );
the conversion rule above replaces `arr` with a pointer expression, such that the function call looks more like
foo( &arr[0] );
and what the function receives is a pointer, not an array. This is true for all array types, not just character arrays.
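One way to see the decay rule in action is to compare `sizeof` on each side of a call. This is only a sketch, and the function names are invented for the example.

```c
#include <stddef.h>

/* The callee only ever sees a pointer, so sizeof yields the pointer size. */
size_t size_seen_by_callee(char *arr)
{
    return sizeof arr;
}

/* Here arr is still a real array (it is the operand of sizeof, so it does
 * not decay), and sizeof yields the full 100 bytes. */
size_t size_seen_by_caller(void)
{
    char arr[100];
    return sizeof arr;
}

/* Returns the pointer it was handed, to show it is the address of the
 * array's first element. */
char *first_element(char *arr)
{
    return arr;
}
```

Calling `first_element(buf)` on a `char buf[100]` gives back `&buf[0]`, which is exactly what the `foo( &arr[0] )` rewrite describes.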
So when you write
string result_str = result;
you're storing a pointer to the first element of the `result` array to `result_str`, not its contents, and by extension returning that address to the caller.
Third, the `result` array is local to `get_readability` and ceases to exist once the function exits. The memory that the array occupied is still there, but now it's available for other functions to use. As long as you didn't call any other functions after `get_readability`, that memory remained unmodified and your code appeared to work as expected, but as soon as you called `printf( "\n" );` that memory was overwritten.
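This also explains why the `"Before Grade 1"` and `"Grade 16+"` branches never misbehave: string literals have static storage duration, so a pointer to one stays valid after the function returns. A small sketch of the contrast follows; the function names are hypothetical, and the unsafe variant is disabled with `#if 0` because using its result is undefined behavior.

```c
/* Safe: the literal lives for the whole program, so the returned
 * pointer is still valid after the call. */
const char *label_from_literal(void)
{
    return "Grade 16+";
}

#if 0
/* Unsafe: buf is an automatic (local) array. Its lifetime ends when the
 * function returns, so the caller receives a dangling pointer. */
const char *label_from_local(void)
{
    char buf[16] = "Grade 3";
    return buf;
}
#endif
```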
There are two ways to go here.
The first (and IMO preferred) method is to pass the target buffer as a parameter to `get_readability` (replacing any instance of `string` with `char *` since `string` won't work outside of the CS50 environment):
/**
 * We're returning a pointer to the target buffer just
 * for convenience's sake.
 */
char *get_readability( float avg_letters, float avg_sentences, char *result, size_t result_size )
{
    ...
    /**
     * Using snprintf to make sure we don't write past the
     * end of the result array.
     */
    snprintf( result, result_size, "Grade %d", index );
    return result;
}
And you'd call it as
char result[16]; // 8 is too small: "Grade 10" needs 9 bytes, "Before Grade 1" needs 15
...
printf( "reading level: %s\n", get_readability( 464.29f, 28.57f, result, sizeof result ) );
Again, we're returning the `result` pointer just for convenience; you could also do:
char result[16]; // 8 is too small: "Grade 10" needs 9 bytes, "Before Grade 1" needs 15
get_readability( 464.29f, 28.57f, result, sizeof result );
printf( "Reading level: %s\n", result );
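For reference, here is one way the elided parts of the caller-supplied-buffer version might be filled in. This is a sketch under assumptions: the branch handling is my guess at what the `...` stands for, reusing the labels and formula from the question.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the grade label into the caller's buffer and returns that buffer.
 * snprintf never writes more than result_size bytes, terminator included. */
char *get_readability(float avg_letters, float avg_sentences,
                      char *result, size_t result_size)
{
    /* Converting to int truncates toward zero; it does not round. */
    int index = (int)((0.0588 * avg_letters) - (0.296 * avg_sentences) - 15.8);

    if (index < 1)
    {
        snprintf(result, result_size, "Before Grade 1");
    }
    else if (index > 16)
    {
        snprintf(result, result_size, "Grade 16+");
    }
    else
    {
        snprintf(result, result_size, "Grade %d", index);
    }
    return result;
}
```

With a 16-byte buffer, the inputs from the question (`464.29f`, `28.57f`) produce `Grade 3`. A buffer that is too small just gets a truncated, still-terminated label instead of an overflow.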
The second (less-preferred) option is to dynamically allocate memory within `get_readability` and return a pointer to that:
char *get_readability( float avg_letters, float avg_sentences )
{
    ...
    // 16 bytes, because "Grade 10" through "Grade 16" need 9 including the terminator
    char *result = calloc( 16, sizeof *result );
    if ( result )
    {
        snprintf( result, 16, "Grade %d", index );
    }
    else
    {
        // memory allocation failure, handle as appropriate
    }
    return result;
}
and call it as
char *result = get_readability( 464.29f, 28.57f );
printf( "Reading level: %s\n", result );
...
free( result );
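A fuller sketch of the allocating version might look like the following. The 16-byte size and the choice to return `NULL` on allocation failure are assumptions on my part; the original leaves both the elided code and the failure handling open.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns a heap-allocated label that the caller must free(),
 * or NULL if the allocation fails. */
char *get_readability(float avg_letters, float avg_sentences)
{
    /* Assumed size: enough for "Before Grade 1" plus the terminator. */
    const size_t size = 16;

    /* Converting to int truncates toward zero; it does not round. */
    int index = (int)((0.0588 * avg_letters) - (0.296 * avg_sentences) - 15.8);

    char *result = calloc(size, sizeof *result);
    if (result == NULL)
    {
        return NULL; /* let the caller decide how to handle the failure */
    }

    if (index < 1)
    {
        snprintf(result, size, "Before Grade 1");
    }
    else if (index > 16)
    {
        snprintf(result, size, "Grade 16+");
    }
    else
    {
        snprintf(result, size, "Grade %d", index);
    }
    return result;
}
```

The caller should check the result for `NULL` before printing it, and call `free` on it exactly once when done.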
Drawbacks to this method:
- When you're done with it, you must explicitly free the allocated memory using the `free` library function;
- To be able to do that, you must save the returned pointer value to a variable somewhere - if you write something like
printf( "Reading level: %s\n", get_readability( 464.29f, 28.57f ) );
then you've lost any reference to that dynamically-allocated buffer and it hangs around until the program exits.
You *could* do something like
char *result;
printf( "Reading level: %s\n", (result = get_readability( 464.29f, 28.57f )) );
...
free( result );
but it doesn't buy you that much and you'd likely get murdered by anyone maintaining your code.