在二分查找中,最大关键比较次数为(1+log(n)),而不是log(n)。

huangapple go评论73阅读模式
英文:

How is maximum key compare in binary search (1+log(n)) not log(n)?

问题

我重新浏览了一个更通用的二分搜索解释。其中提到,在二分搜索中,最大比较次数的上限是(1+log(n))。我试图形成一个关于为什么会出现这种情况的直觉。我理解二分搜索的运行时间是(log(n))。我还猜测,在搜索不在数组中的元素时,最糟糕的情况/最大的关键查找次数会发生。但是,每次我对一个假设的数组进行二分搜索时,最终的比较/步骤次数都不会超过(log(n))。以下是用于形成这个前提的代码。

public static int binarySearch(int[] a, int key){
     int lo = 0, hi = a.length-1;
     while (lo <= hi){
         int mid = lo + (hi - lo) / 2;

         if (key < a[mid]) 
            hi = mid - 1;
         else if (key > a[mid]) 
            lo = mid + 1;
         else 
            return mid;
     }
     return -1;
 }

也许我的猜测是错误的,或者我漏掉了一些要点。如果您能解释一下为什么最大可能的关键比较次数会是(1+log(n)),那将非常好,谢谢。

英文:

I was re going through a more versatile explanation of BS. Which has a preposition that maximum number of key compare in a binary search is (1+log(n)). I tried to form an intuition as to how is that possible. I understand the ruining time of BS is (log(n)). Also i speculate that the worse time/maximum key lookups will happen in scenario when we search for a element which is not present in the array. But each time i go over a hypothetical array doing a BS i end up with (log(n)) compares/steps never more than that. Here is the code which was used to form that preposition.

public static int binarySearch(int[] a, int key){
     int lo = 0, hi = a.length-1;
     while (lo &lt;= hi){
         int mid = lo + (hi - lo) / 2;

         if (key &lt; a[mid]) 
            hi = mid - 1;
         else if (key &gt; a[mid]) 
            lo = mid + 1;
         else 
            return mid;
     }
     return -1;
 }

Probably my speculation is wrong or i am missing some point. If you could explain how maximum possible key compares would be (1+log(n)) it would be great thanks.

答案1

得分: 3

不要忘记,即使只有1个元素,您仍然必须猜测,因为目标可能不在数组中。因此,当我们只剩下最后一个元素时,我们需要在最后一次猜测上加上+1。

如果您考虑 n=1 的情况,可能会更清楚。我们仍然需要1次猜测,但是 log_2(1) = 0。因此,我们需要添加一个 +1 来修正公式。

n 不是2的幂时,我们可以直接上升到下一个较高的2的幂次。对于长度为1000的数组,下一个较高的2的幂次是1024,即10次。因此,对于一个包含1000个元素的数组,二分查找最多需要11次猜测(10 + 1)。

> 为什么呢?

在最坏的情况下,二分查找需要10步来分离剩余的数字,然后再进行1步最终检查,以确定仅剩下的一个数字是否是您想要的,或者它不在数组中。

英文:

Don't forget that even when you have only 1 element, you still have to guess it, because it is possible that the target is not in the array. So we need to add on the +1 for the last guess when we are down to the final remaining element.

It may be clearer if you think about when n=1. We still need 1 guess, but log_2(1) = 0. So we need to add a +1 to fix up the formula.

When n is not a power of 2, we can just go up to the next higher power of 2. For an array whose length is 1000, the next higher power of 2 is 1024, which is 10. Therefore, for a 1000-element array, binary search would require at most 11 (10 + 1) guesses.

> Why?

In the worst case, binary search would need 10 steps to separate the remaining numbers and 1 final step to check whether the only one number left is what you want, or it's not in the array.

答案2

得分: 1

这里有一种不同的思考方式,来理解你正在做的事情。与其把二分查找看作是在数组元素上进行搜索,不如将二分查找视为在数组元素之间的分隔符上进行搜索。具体来说,想象将数组编号如下:

   +-----+-----+-----+-----+-----+
   |  0  |  1  |  2  | ... | n-1 |
   +-----+-----+-----+-----+-----+

现在,给分隔符编号:

   +-----+-----+-----+-----+-----+
   |  0  |  1  |  2  | ... | n-1 |
   +-----+-----+-----+-----+-----+
   0     1     2     3 .. n-1    n

注意,总共有 n+1 个分隔符,一个在每个元素之前,一个在最后一个元素之后。

无论何时进行二分查找,你都在探测中间的分隔符的索引(你看出为什么了吗?),并且舍弃一半的分隔符。在将一组 k 个项的一半舍弃 log2 k 次之后,你就会剩下一个单独的元素。这意味着所需的探测次数将是 ⌈log2 (n+1)⌉,事实上,

> log2 n < &lceil;log2 (n+1)⌉ ≤ log2 n + 1,

所以 "1 + log n" 这一部分更多地来自于 "舍弃一半的分隔符" 而不是其他来源。

希望这对你有所帮助!

英文:

Here's a different way to think about what you're doing. Rather than thinking of binary search as searching over the elements of the array, think of binary search as searching over the separators between the elements in the array. Specifically, imagine numbering the array like this:

   +-----+-----+-----+-----+-----+
   |  0  |  1  |  2  | ... | n-1 |
   +-----+-----+-----+-----+-----+

Now, number the separators:

   +-----+-----+-----+-----+-----+
   |  0  |  1  |  2  | ... | n-1 |
   +-----+-----+-----+-----+-----+
   0     1     2     3 .. n-1    n

Notice that there are n+1 total separators, one before each element and one after the very last element.

Whenever you do a binary search, you're probing the index of the middle separator (do you see why?) and throwing half of the separators away. You can only throw half of a collection of k items away log<sub>2</sub> k times before you're down to a single remaining element. This means that the number of probes needed will be &lceil;log<sub>2</sub> (n+1)&rceil;, and it happens to be the case that

> log<sub>2</sub> n < &lceil;log<sub>2</sub> (n+1)&rceil; &le; log<sub>2</sub> n + 1,

so the "1 + log n" bit ends up arising more from "throw away half the separators" than from other sources.

Hope this helps!

答案3

得分: 0

假设有一个大小为8的数组。

l=0,h=7,mid=0 + (7-0)/2 = 3 向右走
l=4,h=7,mid=4 + (7-4)/2 = 5 向右走
l=6,h=7,mid=6 + (7-6)/2 = 6 向右走
l=7,h=7,mid=7 + (7-7)/2 = 7 向左走
l=7,h=6 ====> 终止

总比较次数 = 1 + Log 8 = 4

编辑1:想象一下这个数组,并使用纸和笔来追踪上述步骤。搜索值为13。

索引: 0 1 2 3 4 5 6 7
------------------------------
元素: 1 3 5 6 7 9 11 15

英文:

Imagine an array of size 8.

l=0, h = 7, mid =  0 + (7-0)/2 = 3   go right
l=4, h = 7, mid =  4 + (7-4)/2 = 5   go right
l=6, h = 7, mid =  6 + (7-6)/2 = 6   go right
l=7, h = 7, mid =  7 + (7-7)/2 = 7   go left  
l=7, h=6 ====&gt; terminates

> Total comparisons = 1 + Log 8 = 4

EDIT1: Imagine this array and use a pen and paper and trace out the above steps. Search for value 13.

index:      0  1  2  3  4  5  6   7   
            ------------------------------
element:    1  3  5  6  7  9  11  15  

huangapple
  • 本文由 发表于 2020年5月3日 15:15:19
  • 转载请务必保留本文链接:https://go.coder-hub.com/61570880.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定