使用指针与使用索引访问数组的性能差异能否有人解释一下?

huangapple go评论113阅读模式
英文:

Can someone explain to me the performance difference of using pointers vs indices for accessing an array?

问题

Pointers (指针):

使用指针进行回文检查:

```c
bool isPalindrome(char *s) {

    size_t length = strlen(s);
    char *left = s;
    char *right = s + length - 1;

    while (left < right) {
        
        while (left < right && !isalnum(*left))
            ++left;
        
        if (left == right)
            return true;

        while (right > left && !isalnum(*right))
            --right;

        if (right == left)
            return true;
        
        if (tolower(*left) != tolower(*right))
            return false;

        ++left;
        --right;
    }

    return true;
}

Indices (索引):

使用索引进行回文检查:

```c
bool isPalindrome(char *s) {

    size_t length = strlen(s);
    size_t left = 0;
    size_t right = length - 1;

    while (left < right) {
        
        while (left < right && !isalnum(s[left]))
            ++left;
        
        if (s[left] == '
使用索引进行回文检查:

```c
bool isPalindrome(char *s) {

    size_t length = strlen(s);
    size_t left = 0;
    size_t right = length - 1;

    while (left < right) {
        
        while (left < right && !isalnum(s[left]))
            ++left;
        
        if (s[left] == '\0')
            return true;

        while (right > left && !isalnum(s[right]))
            --right;

        if (right == 0)
            return true;
        
        if (tolower(s[left]) != tolower(s[right]))
            return false;

        ++left;
        --right;
    }

    return true;
}
')
return true; while (right > left && !isalnum(s[right])) --right; if (right == 0) return true; if (tolower(s[left]) != tolower(s[right])) return false; ++left; --right; } return true; }
英文:

So I am learning C and to get practice I do some "LeetCode" challenges. For a palindrome checker I have tried two approaches, one using indices and one using pointers.

  • Where does the large performance difference come from?

And as a follow up question:

  • What method is usually preferred and for what reasons?

使用指针与使用索引访问数组的性能差异能否有人解释一下?
使用指针与使用索引访问数组的性能差异能否有人解释一下?

For reference here are the code blocks.

Pointers:

bool isPalindrome(char *s) {

    size_t length = strlen(s);
    char *left = s;
    char *right = s + length - 1;

    while (left &lt; right) {
        
        while (left &lt; right &amp;&amp; !isalnum(*left))
            ++left;
        
        if (left == right)
            return true;

        while (right &gt; left &amp;&amp; !isalnum(*right))
            --right;

        if (right == left)
            return true;
        
        if (tolower(*left) != tolower(*right))
            return false;

        ++left;
        --right;
    }

    return true;
}

Indices:

bool isPalindrome(char *s) {

    size_t length = strlen(s);
    size_t left = 0;
    size_t right = length - 1;

    while (left &lt; right) {
        
        while (left &lt; right &amp;&amp; !isalnum(s[left]))
            ++left;
        
        if (s[left] == &#39;
bool isPalindrome(char *s) {
size_t length = strlen(s);
size_t left = 0;
size_t right = length - 1;
while (left &lt; right) {
while (left &lt; right &amp;&amp; !isalnum(s[left]))
++left;
if (s[left] == &#39;\0&#39;)
return true;
while (right &gt; left &amp;&amp; !isalnum(s[right]))
--right;
if (right == 0)
return true;
if (tolower(s[left]) != tolower(s[right]))
return false;
++left;
--right;
}
return true;
}
&#39;) return true; while (right &gt; left &amp;&amp; !isalnum(s[right])) --right; if (right == 0) return true; if (tolower(s[left]) != tolower(s[right])) return false; ++left; --right; } return true; }

答案1

得分: 3

以下是要翻译的内容:

Where does the large performance difference come from?

大部分性能差异可能来自哪里?但了解您如何编译代码、如何测量性能以及使用什么输入将会有所帮助。以下是一些提示:

  • 所发布的这两个函数没有实现相同的语义:在指针版本中,您会在 if (left == right) 条件下退出循环,而在索引版本中的等效测试是 if (s[left] == '\0'),在大多数情况下 if (left == right) 是不成立的。如果函数行为不同,那么它们性能不同也就不足为奇了。

  • 是否启用了优化?没有启用优化,s[left] 可能会生成比 *left 更多的代码,但启用优化后,大多数编译器将生成性能相似的代码,要么将索引方法转换为指针方法,要么使用使用多个寄存器的寻址模式,有或无比例,没有性能惩罚。

What method is usually preferred and for what reasons?

通常情况下,两种方法如果正确实现的话都是等效的,并且在现代编译器下具有等效的性能。

以下是使用指针的原因:

  • 如果本地规则鼓励使用指针,那么使用指针可能更合适。
  • 指针在 C(和 C++)中相当特殊,这会使代码对那些已经有几十年 C 编程经验的程序员更加熟悉。

以下是使用带有类型 size_t 的索引变量的原因:

  • 使用索引在大多数编程语言中更通用,在 C++ 中可能更符合惯例(对于某些情况)。
  • 对于来自各种背景的程序员来说,代码更易读。
  • 代码更容易移植到其他编程语言中。

在关注性能之前,请确保正确性!

  • 这两个函数都在空字符串(一个平凡的回文串)上具有未定义行为。
  • 您有多个冗余的测试。
  • 来自 <ctype.h> 的宏 isalnum()tolower() 仅对一定范围的 int 值定义:所有 char 类型的值以及特殊的负值 EOF 展开。在 char 类型默认为有符号的平台上,传递负的 char 值具有未定义行为。您应该使用 (unsigned char) 进行强制转换。

以下是修改后的版本:

bool isPalindrome_idx(const char *s) {
    size_t len = strlen(s);
    if (len > 1) {
        size_t left = 0;
        size_t right = len - 1;
        while (left < right) {
            while (!isalnum((unsigned char)s[left])) {
                if (++left >= right)
                    return true;
            }
            while (!isalnum((unsigned char)s[right])) {
                if (left >= --right)
                    return true;
            }
            if (tolower((unsigned char)s[left]) != tolower((unsigned char)s[right])) {
                return false;
            } else {
                left++;
                right--;
            }
        }
    }
    return true;
}

bool isPalindrome_ptr(const char *s) {
    size_t len = strlen(s);
    if (len > 1) {
        const char *left = s;
        const char *right = s + len - 1;
        while (left < right) {
            while (!isalnum((unsigned char)*left)) {
                if (++left >= right)
                    return true;
            }
            while (!isalnum((unsigned char)*right)) {
                if (left >= --right)
                    return true;
            }
            if (tolower((unsigned char)*left) != tolower((unsigned char)*right)) {
                return false;
            } else {
                left++;
                right--;
            }
        }
    }
    return true;
}
英文:

> Where does the large performance difference come from?

There are many potential reasons for a performance difference, but it would be helpful to understand how you compile the code, how you measure performance and with what input. Here are a few hints:

  • the 2 functions posted do not implement the same semantics: in the pointer version, you exit the loop if (left == right) whereas the equivalent test in the index version is if (s[left] == &#39;\0&#39;) which is not true in most cases where if (left == right) is true. If the functions behave differently, no surprise they perform differently.

  • did you enable optimisations? without optimisations, s[left] may generate more code than *left, but with optimisations enabled, most compilers will generate code with similar performance, either converting the index approach to a pointer approach, or using addressing modes that use multiple registers, with or without scaling, without a penalty.

> What method is usually preferred and for what reasons?

Both methods are equivalent if implemented properly, and compile to equivalent performance with modern compilers.

Here are reasons to use pointers:

  • Using pointers may be more appropriate if local rules encourage it
  • Pointers are quite specific to C (and C++), it makes the code more familiar to die hard C programmers in their 60s

Reasons to use index variables with type size_t:

  • using indices is more general, works in most languages and probably more idiomatic in C++ language (for .
  • code is more readable for programmers from various backgrounds
  • code is easier to port to other languages.

Before you focus on performance, focus on correctness!

  • both functions have undefined behavior on the empty string, a trivial palindrome.
  • you have multiple redundant tests.
  • the macros isalnum() and tolower() from &lt;ctype.h&gt; are only defined for a restricted range of int values: all values of type unsigned char and the special negative value EOF expands to. Passing negative char values has undefined behavior on platforms where type char is signed by default. You should use a cast to (unsigned char).

Here are modified versions:

bool isPalindrome_idx(const char *s) {
    size_t len = strlen(s);
    if (len &gt; 1) {
        size_t left = 0;
        size_t right = len - 1;
        while (left &lt; right) {
            while (!isalnum((unsigned char)s[left])) {
                if (++left &gt;= right)
                    return true;
            }
            while (!isalnum((unsigned char)s[right])) {
                if (left &gt;= --right)
                    return true;
            }
            if (tolower((unsigned char)s[left]) != tolower((unsigned char)s[right])) {
                return false;
            } else {
                left++;
                right--;
            }
        }
    }
    return true;
}

bool isPalindrome_ptr(const char *s) {
    size_t len = strlen(s);
    if (len &gt; 1) {
        const char *left = s;
        const char *right = s + len - 1;
        while (left &lt; right) {
            while (!isalnum((unsigned char)*left)) {
                if (++left &gt;= right)
                    return true;
            }
            while (!isalnum((unsigned char)*right)) {
                if (left &gt;= --right)
                    return true;
            }
            if (tolower((unsigned char)*left) != tolower((unsigned char)*right)) {
                return false;
            } else {
                left++;
                right--;
            }
        }
    }
    return true;
}

答案2

得分: 1

将索引示例中的 if (right == 0) 更改为 if (right == left),使它们相同。然后它们的性能应该非常相似。

英文:

Change if (right == 0) in the indexed example to if (right == left), to make them the same. Then their performance should be very similar.

huangapple
  • 本文由 发表于 2023年7月3日 03:00:00
  • 转载请务必保留本文链接:https://go.coder-hub.com/76600385.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定