‘if key == arr[low]: return low’ 在插值搜索中是做什么的?

huangapple go评论67阅读模式
英文:

What is the 'if key == arr[low]: return low' doing in Interpolation search?

问题

I've seen a few implementations of Interpolation search in Python that uses

if key == arr[low]:
    return low

After the while-loop, like this:

def interpolation_search(arr, key):

    low = 0
    high = len(arr) - 1

    while arr[high] != arr[low] and arr[low] <= key <= arr[high]:
        mid = low + ((key - arr[low]) * (high - low) // (arr[high] - arr[low]))

        if key == arr[mid]:
            return mid
        elif key < arr[mid]:
            high = mid - 1
        else:
            low = mid + 1

    if key == arr[low]:
        return low

    return -1

What is that doing? I've run a bunch of tests with various kinds of lists (even distribution, uneven, etc) and searching for every item in the list but I haven't been able to see a difference regardless of if that if-statement is there or not.

An example of the algorithm with the if-statement: Link

And without the if-statement: Link

I've run the code on arr = range(200) as well as a sorted array with length 200 containing random integers from 0 - 1000, with a few duplicates. The if-statement doesn't change the outcome.

英文:

I've seen a few implementations of Interpolation search in Python that uses

if key == arr[low]:
    return low

After the while-loop, like this:

def interpolation_search(arr, key):

    low = 0
    high = len(arr) - 1

    while arr[high] != arr[low] and arr[low] &lt;= key &lt;= arr[high]:
        mid = low + ((key - arr[low]) * (high - low) // (arr[high] - arr[low]))

        if key == arr[mid]:
            return mid
        elif key &lt; arr[mid]:
            high = mid - 1
        else:
            low = mid + 1

    if key == arr[low]:
        return low

    return -1

What is that doing? I've run a bunch of tests with various kinds of lists (even distribution, uneven, etc) and searching for every item in the list but I haven't been able to see a difference regardless of if that if-statement is there or not.

An example of the algorithm with the if-statement: https://www.techiedelight.com/interpolation-search/

And without the if-statement: https://www.baeldung.com/java-interpolation-search

I've run the code on arr = range(200) as well as a sorted array with length 200 containing random integers from 0 - 1000, with a few duplicates. The if-statement doesn't change the outcome.

答案1

得分: 0

以下是翻译好的部分:

"while" 循环下面的 "if" 语句可能是用来捕捉输入数组仅由相同值组成的情况,例如 [9, 9, 9, 9],甚至只有一个值 [9] 的情况。在这种情况下,"while" 循环永远不会进入,"if" 语句变得必要:

interpolation_search([9, 9, 9, 9], 9)

interpolation_search([9], 9)

都会进入 "if" 语句体。

此外注意:

  • 当值不存在时,应该引发异常,而不是返回字符串。在我看来,当值未找到时返回一个字符串是一个不好的选择,因为仍然返回一个单一的值,而且函数似乎已成功运行。但稍后,当该值用作数组的索引时,会出现一个由这个函数引起的异常,但异常不会显示出来。

  • 在 Python 中,你可以直接使用 arr.index(key),这会替代这整个函数(可能更加优化)。可以说,你在这里使用的是一种 C 风格的函数,恰好是用 Python 编写的,但实际上不应该在 Python 中使用。此外,这忽略了 NumPy,对于大型数组而言,NumPy 更加适用(变量名 arr 更适合表示 NumPy 数组,而不是像 lstmy_list 这样的名字,后者更倾向于表示 Python 列表)。

英文:

The if-statement below the while loop may be there to catch the case where the input array consists of only the same values, like [9, 9, 9, 9]; or even of one value, [9]. In that case, the while loop is never entered, and the if statement becomes necessary:

interpolation_search([9, 9, 9, 9], 9)

and

interpolation_search([9], 9)

will both enter the if-statement body.


Side notes:

  • When the value is not present, an exception should be raised, instead of a string returned.
    The return value of a string when the value is not found is a bad choice, in my opinion, since there is still a single value returned, and it appeared the function ran successfully. But when later on, that value is used as an index into an array, there'll be an exception that has its roots in this function, but the exception won't show that.

  • in Python, you would just do arr.index(key), which replaces this whole function (and may be more optimised). Arguably, what you have here, is a C-style function, which happens to be coded in Python, but really shouldn't be used in Python. Also, this ignores NumPy, which would be more suitable for large arrays (the variable name arr is more indicative of NumPy arrays though, instead of something like lst or my_list, the latter which tends to point to Python lists).

huangapple
  • 本文由 发表于 2023年2月6日 18:12:19
  • 转载请务必保留本文链接:https://go.coder-hub.com/75359917.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定