英文:
Can someone explain to me the performance difference of using pointers vs indices for accessing an array?
问题
Pointers (指针):
使用指针进行回文检查:
```c
bool isPalindrome(char *s) {
size_t length = strlen(s);
char *left = s;
char *right = s + length - 1;
while (left < right) {
while (left < right && !isalnum(*left))
++left;
if (left == right)
return true;
while (right > left && !isalnum(*right))
--right;
if (right == left)
return true;
if (tolower(*left) != tolower(*right))
return false;
++left;
--right;
}
return true;
}
Indices (索引):
使用索引进行回文检查:
```c
bool isPalindrome(char *s) {
size_t length = strlen(s);
size_t left = 0;
size_t right = length - 1;
while (left < right) {
while (left < right && !isalnum(s[left]))
++left;
if (s[left] == '使用索引进行回文检查:
```c
bool isPalindrome(char *s) {
size_t length = strlen(s);
size_t left = 0;
size_t right = length - 1;
while (left < right) {
while (left < right && !isalnum(s[left]))
++left;
if (s[left] == '\0')
return true;
while (right > left && !isalnum(s[right]))
--right;
if (right == 0)
return true;
if (tolower(s[left]) != tolower(s[right]))
return false;
++left;
--right;
}
return true;
}
')
return true;
while (right > left && !isalnum(s[right]))
--right;
if (right == 0)
return true;
if (tolower(s[left]) != tolower(s[right]))
return false;
++left;
--right;
}
return true;
}
英文:
So I am learning C and to get practice I do some "LeetCode" challenges. For a palindrome checker I have tried two approaches, one using indices and one using pointers.
- Where does the large performance difference come from?
And as a follow up question:
- What method is usually preferred and for what reasons?
For reference here are the code blocks.
Pointers:
bool isPalindrome(char *s) {
size_t length = strlen(s);
char *left = s;
char *right = s + length - 1;
while (left < right) {
while (left < right && !isalnum(*left))
++left;
if (left == right)
return true;
while (right > left && !isalnum(*right))
--right;
if (right == left)
return true;
if (tolower(*left) != tolower(*right))
return false;
++left;
--right;
}
return true;
}
Indices:
bool isPalindrome(char *s) {
size_t length = strlen(s);
size_t left = 0;
size_t right = length - 1;
while (left < right) {
while (left < right && !isalnum(s[left]))
++left;
if (s[left] == 'bool isPalindrome(char *s) {
size_t length = strlen(s);
size_t left = 0;
size_t right = length - 1;
while (left < right) {
while (left < right && !isalnum(s[left]))
++left;
if (s[left] == '\0')
return true;
while (right > left && !isalnum(s[right]))
--right;
if (right == 0)
return true;
if (tolower(s[left]) != tolower(s[right]))
return false;
++left;
--right;
}
return true;
}
')
return true;
while (right > left && !isalnum(s[right]))
--right;
if (right == 0)
return true;
if (tolower(s[left]) != tolower(s[right]))
return false;
++left;
--right;
}
return true;
}
答案1
得分: 3
以下是要翻译的内容:
Where does the large performance difference come from?
大部分性能差异可能来自哪里?但了解您如何编译代码、如何测量性能以及使用什么输入将会有所帮助。以下是一些提示:
-
所发布的这两个函数没有实现相同的语义:在指针版本中,您会在
if (left == right)
条件下退出循环,而在索引版本中的等效测试是if (s[left] == '\0')
,在大多数情况下if (left == right)
是不成立的。如果函数行为不同,那么它们性能不同也就不足为奇了。 -
是否启用了优化?没有启用优化,
s[left]
可能会生成比*left
更多的代码,但启用优化后,大多数编译器将生成性能相似的代码,要么将索引方法转换为指针方法,要么使用使用多个寄存器的寻址模式,有或无比例,没有性能惩罚。
What method is usually preferred and for what reasons?
通常情况下,两种方法如果正确实现的话都是等效的,并且在现代编译器下具有等效的性能。
以下是使用指针的原因:
- 如果本地规则鼓励使用指针,那么使用指针可能更合适。
- 指针在 C(和 C++)中相当特殊,这会使代码对那些已经有几十年 C 编程经验的程序员更加熟悉。
以下是使用带有类型 size_t
的索引变量的原因:
- 使用索引在大多数编程语言中更通用,在 C++ 中可能更符合惯例(对于某些情况)。
- 对于来自各种背景的程序员来说,代码更易读。
- 代码更容易移植到其他编程语言中。
在关注性能之前,请确保正确性!
- 这两个函数都在空字符串(一个平凡的回文串)上具有未定义行为。
- 您有多个冗余的测试。
- 来自
<ctype.h>
的宏isalnum()
和tolower()
仅对一定范围的int
值定义:所有char
类型的值以及特殊的负值EOF
展开。在char
类型默认为有符号的平台上,传递负的char
值具有未定义行为。您应该使用(unsigned char)
进行强制转换。
以下是修改后的版本:
bool isPalindrome_idx(const char *s) {
size_t len = strlen(s);
if (len > 1) {
size_t left = 0;
size_t right = len - 1;
while (left < right) {
while (!isalnum((unsigned char)s[left])) {
if (++left >= right)
return true;
}
while (!isalnum((unsigned char)s[right])) {
if (left >= --right)
return true;
}
if (tolower((unsigned char)s[left]) != tolower((unsigned char)s[right])) {
return false;
} else {
left++;
right--;
}
}
}
return true;
}
bool isPalindrome_ptr(const char *s) {
size_t len = strlen(s);
if (len > 1) {
const char *left = s;
const char *right = s + len - 1;
while (left < right) {
while (!isalnum((unsigned char)*left)) {
if (++left >= right)
return true;
}
while (!isalnum((unsigned char)*right)) {
if (left >= --right)
return true;
}
if (tolower((unsigned char)*left) != tolower((unsigned char)*right)) {
return false;
} else {
left++;
right--;
}
}
}
return true;
}
英文:
> Where does the large performance difference come from?
There are many potential reasons for a performance difference, but it would be helpful to understand how you compile the code, how you measure performance and with what input. Here are a few hints:
-
the 2 functions posted do not implement the same semantics: in the pointer version, you exit the loop
if (left == right)
whereas the equivalent test in the index version isif (s[left] == '\0')
which is not true in most cases whereif (left == right)
is true. If the functions behave differently, no surprise they perform differently. -
did you enable optimisations? without optimisations,
s[left]
may generate more code than*left
, but with optimisations enabled, most compilers will generate code with similar performance, either converting the index approach to a pointer approach, or using addressing modes that use multiple registers, with or without scaling, without a penalty.
> What method is usually preferred and for what reasons?
Both methods are equivalent if implemented properly, and compile to equivalent performance with modern compilers.
Here are reasons to use pointers:
- Using pointers may be more appropriate if local rules encourage it
- Pointers are quite specific to C (and C++), it makes the code more familiar to die hard C programmers in their 60s
Reasons to use index variables with type size_t
:
- using indices is more general, works in most languages and probably more idiomatic in C++ language (for .
- code is more readable for programmers from various backgrounds
- code is easier to port to other languages.
Before you focus on performance, focus on correctness!
- both functions have undefined behavior on the empty string, a trivial palindrome.
- you have multiple redundant tests.
- the macros
isalnum()
andtolower()
from<ctype.h>
are only defined for a restricted range ofint
values: all values of typeunsigned char
and the special negative valueEOF
expands to. Passing negativechar
values has undefined behavior on platforms where typechar
is signed by default. You should use a cast to(unsigned char)
.
Here are modified versions:
bool isPalindrome_idx(const char *s) {
size_t len = strlen(s);
if (len > 1) {
size_t left = 0;
size_t right = len - 1;
while (left < right) {
while (!isalnum((unsigned char)s[left])) {
if (++left >= right)
return true;
}
while (!isalnum((unsigned char)s[right])) {
if (left >= --right)
return true;
}
if (tolower((unsigned char)s[left]) != tolower((unsigned char)s[right])) {
return false;
} else {
left++;
right--;
}
}
}
return true;
}
bool isPalindrome_ptr(const char *s) {
size_t len = strlen(s);
if (len > 1) {
const char *left = s;
const char *right = s + len - 1;
while (left < right) {
while (!isalnum((unsigned char)*left)) {
if (++left >= right)
return true;
}
while (!isalnum((unsigned char)*right)) {
if (left >= --right)
return true;
}
if (tolower((unsigned char)*left) != tolower((unsigned char)*right)) {
return false;
} else {
left++;
right--;
}
}
}
return true;
}
答案2
得分: 1
将索引示例中的 if (right == 0)
更改为 if (right == left)
,使它们相同。然后它们的性能应该非常相似。
英文:
Change if (right == 0)
in the indexed example to if (right == left)
, to make them the same. Then their performance should be very similar.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论