C code: Warning when converting strings to uppercase
# Question
I'm programming an STM32 device in C using STM32CubeIDE.
I want to convert the lowercase characters in a string to uppercase, leaving the string in its current place. I "stole" the code below from online; however, I get a warning: operation on '*String' may be undefined. The function works OK. How do I modify it to get rid of the warning?
My code is:

    void StrToUpperCase(char *String)
    {
        while (*String)
        {
            *String = (*String >= 'a' && *String <= 'z') ? *String = *String - 0x20 : *String;
            String++;
        }
    }

# Answer 1
**Score**: 3
The behavior of `*String = (*String >= 'a' && *String <= 'z') ? *String = *String - 0x20 : *String;` is not defined by the C standard because it contains two assignments to `*String` for which the updates of `*String` are not sequenced, violating C 2018 6.5 2:
> If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined…
To fix this, the unnecessary interior assignment should be removed:

    *String = (*String >= 'a' && *String <= 'z') ? *String - 0x20 : *String;

There remain other issues with the code, notably that `>= 'a'`, `<= 'z'`, and `0x20` are dependent on a particular character set encoding. At the very least, assuming there is some reason for not using the standard `toupper` function and that we can assume the uppercase and the lowercase letters are each contiguous and in the same order, the code is better written as:

    if (*String >= 'a' && *String <= 'z')
        *String += 'A' - 'a';
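Putting these pieces together, a minimal sketch of the whole function might look like the following. It assumes an ASCII-compatible character set in which `'a'..'z'` and `'A'..'Z'` are each contiguous; the names `str_to_upper_ascii` and `str_to_upper_ascii_example` are hypothetical and not from the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name. Assumes 'a'..'z' and 'A'..'Z' are each contiguous
   and in the same order (true for ASCII, not guaranteed by the C standard). */
void str_to_upper_ascii(char *s)
{
    for (; *s != '\0'; s++) {
        if (*s >= 'a' && *s <= 'z') {
            /* A single modification of *s per statement:
               no unsequenced side effects, so no warning. */
            *s += 'A' - 'a';
        }
    }
}

/* Small self-check documenting the expected behaviour. */
void str_to_upper_ascii_example(void)
{
    char buf[] = "stm32 Cube-ide";
    str_to_upper_ascii(buf);
    assert(strcmp(buf, "STM32 CUBE-IDE") == 0);
}
```

Digits, spaces, and punctuation fall outside the `'a'..'z'` range and are left untouched.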
# Answer 2
**Score**: 1
Do not call `StrToUpperCase()` with a _string literal_ like `StrToUpperCase("Hello");` as attempting to change a string literal is _undefined behavior_ (UB). [@CoffeeTableEspresso](https://stackoverflow.com/questions/76006676/c-code-warning-when-converting-strings-to-uppercase/76006991#comment134052889_76006676).
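When the text to convert does come from a literal, one option is to upper-case it into a separate writable buffer instead of in place. The following is a minimal sketch under that assumption; `upper_copy`, its parameters, and `upper_copy_example` are hypothetical names, not part of the question's code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes an upper-cased copy of src into dst.
   At most dst_size bytes are written, and dst is always NUL-terminated
   when dst_size > 0. src is only read, so it may be a string literal. */
char *upper_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return dst;
    }
    size_t i = 0;
    for (; i + 1 < dst_size && src[i] != '\0'; i++) {
        dst[i] = (char)toupper((unsigned char)src[i]);
    }
    dst[i] = '\0';
    return dst;
}

void upper_copy_example(void)
{
    char buf[16];
    upper_copy(buf, sizeof buf, "Hello");   /* the literal itself is never modified */
    assert(strcmp(buf, "HELLO") == 0);
}
```

Declaring the data as an array, for example `char text[] = "Hello";`, also works with the in-place function, because the array is a modifiable copy of the literal.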
---
The two assignments in the code are strange:

    //      v                                              v
    *String = (*String >= 'a' && *String <= 'z') ? *String = *String - 0x20 : *String;

One assignment would make more sense and may squash the warning:

    *String = (*String >= 'a' && *String <= 'z') ? *String - 0x20 : *String;

---
Better to-upper code (fewer magic numbers):

    *String = (*String >= 'a' && *String <= 'z') ? *String - 'a' + 'A' : *String;

---
Even better to-upper code:
```c
#include <ctype.h>

char *StrToUpperCase(char *str) {
    // toupper() is designed for unsigned char values (and EOF)
    unsigned char *ustr = (unsigned char *) str;
    while (*ustr) {
        *ustr = toupper(*ustr);
        ustr++;
    }
    return str;
}
```
# Answer 3
**Score**: 1
The side effect of the left-most `*String =` is unsequenced in relation to another side effect on the same variable, namely `*String = *String - 0x20`. So your code invokes undefined behavior.
The `?:` operator first evaluates its first operand (on the left); there is then a sequence point between that evaluation and the evaluation of the second or third operand.
However, the expression `*String = op1 ? op2 : op3;` is, by operator precedence, equivalent to `*String = (op1 ? op2 : op3);`, and if either `op2` or `op3` happens to modify `*String` as well, the behavior is undefined, since there is no sequence point between `op2`/`op3` and `*String =`.
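One way to keep the conditional operator while avoiding the problem is to let it only compute a value and to perform exactly one assignment. A minimal sketch, assuming an ASCII-compatible character set; `upper_if_lower` and `single_assignment_example` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper: the conditional operator only yields a value here;
   it does not modify anything. Assumes ASCII (0x20 between the cases). */
static char upper_if_lower(char c)
{
    return (c >= 'a' && c <= 'z') ? (char)(c - 0x20) : c;
}

void single_assignment_example(void)
{
    char text[] = "ab1";
    char *String = text;

    /* Undefined (do not write this):
       *String = cond ? *String = *String - 0x20 : *String;          */
    *String = upper_if_lower(*String);   /* exactly one side effect on *String */

    assert(text[0] == 'A' && text[1] == 'b' && text[2] == '1');
}
```

Only the first character is converted in this example, because the pointer is never advanced.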
Solution:
Contrary to popular belief, cramming the most operators onto a single line does not win a prize. Rather, clean, readable code wins a prize:

    for(size_t i=0; String[i] != '\0'; i++)
    {
        if(String[i] >= 'a' && String[i] <= 'z')
        {
            String[i] -= 0x20;
        }
    }

This is still problematic, because arithmetic on character values is only guaranteed to behave as expected for the digits `'0'` to `'9'`, which the standard requires to be contiguous. Rewriting the code again, we can fix that too:

    #include <ctype.h>

    for(size_t i=0; String[i] != '\0'; i++)
    {
        // cast: toupper() expects an unsigned char value (or EOF)
        String[i] = (char)toupper((unsigned char)String[i]);
    }

`toupper` is well-defined to change only the characters it recognizes as lowercase (ironically, it very likely does this internally by masking away 0x20, but there are no guarantees).
# Answer 4
**Score**: 1
This is an example of why saving keystrokes and writing "hacky" code is never a good decision. Write it on more lines, do not use magic numbers, and return the value so it can be used in other function calls.

    char *StrToUpperCase(char *String)
    {
        char *saved = String;
        if(String)
        {
            while (*String)
            {
                if(*String >= 'a' && *String <= 'z')
                {
                    *String += 'A' - 'a';
                }
                String++;
            }
        }
        return saved;
    }

It is much easier to read, debug, and maintain:
https://godbolt.org/z/edrodYeYx
But it is much better to use the standard library function `toupper`.
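A minimal sketch of that suggestion, keeping the same shape as this answer's function (NULL check, pointer returned) but delegating the conversion to `toupper`. The names `StrToUpperCaseStd` and `StrToUpperCaseStd_example` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant: converts in place with toupper() and returns
   the original pointer so the call can be nested in other calls. */
char *StrToUpperCaseStd(char *String)
{
    if (String != NULL) {
        for (char *p = String; *p != '\0'; p++) {
            /* toupper() expects an unsigned char value (or EOF). */
            *p = (char)toupper((unsigned char)*p);
        }
    }
    return String;
}

void StrToUpperCaseStd_example(void)
{
    char buf[] = "hello, stm32";
    assert(strcmp(StrToUpperCaseStd(buf), "HELLO, STM32") == 0);
    assert(StrToUpperCaseStd(NULL) == NULL);   /* NULL is passed through */
}
```

Because the pointer is returned, the call can be used directly as an argument, as the `strcmp` line shows.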
# Answer 5
**Score**: 0
Exclusive OR works fine too:

    #include <stdio.h>
    #include <stdlib.h>

    char *Upper(char *str) {
        unsigned char *p = (unsigned char *) str;
        while (*p) {
            if(*p>='a'&&*p<='z'){
                *p=*p^32;   // flip bit 0x20, the ASCII case bit
            }
            p++;
        }
        return str;
    }

    char *Lower(char *str) {
        unsigned char *p = (unsigned char *) str;
        while (*p) {
            if(*p>='A'&&*p<='Z'){
                *p=*p^32;
            }
            p++;
        }
        return str;
    }

    char *SwitchLowerUpper(char *str) {
        unsigned char *p = (unsigned char *) str;
        while (*p) {
            if((*p>='A'&&*p<='Z') || (*p>='a'&&*p<='z')){
                *p=*p^32;
            }
            p++;
        }
        return str;
    }

    int main() {
        char test[]="aBcDef";
        printf("%s\n",Upper(test));
        printf("%s\n",Lower(test));
        char test2[]="aBcDef";
        printf("%s\n",SwitchLowerUpper(test2));
        return 0;
    }
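The XOR trick depends on upper- and lowercase letters differing only in bit 0x20, which holds for ASCII but is not guaranteed by the C standard. A sketch of how that assumption could be checked at compile time, using C11 `_Static_assert`; `flip_ascii_case` and `flip_ascii_case_example` are hypothetical names.

```c
#include <assert.h>

/* Compile-time check of the assumption the XOR trick relies on (C11). */
_Static_assert(('a' ^ 'A') == 0x20 && ('z' ^ 'Z') == 0x20,
               "XOR 32 case flip assumes an ASCII-compatible character set");

/* Hypothetical helper: flips the case of an ASCII letter and
   leaves every other character unchanged. */
static char flip_ascii_case(char c)
{
    int is_letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    return is_letter ? (char)(c ^ 0x20) : c;
}

void flip_ascii_case_example(void)
{
    assert(flip_ascii_case('a') == 'A');
    assert(flip_ascii_case('Q') == 'q');
    assert(flip_ascii_case('@') == '@');   /* 0x40 is not a letter, so it must not become '`' */
}
```

The range check matters: without it, XOR with 32 would also alter non-letters such as `'@'` and `'['`.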