Is big O notation used to denote something other than the asymptote of worst-case performance?
Question
I've come across this notation recently and am confused by what I see. My previous understanding was that big O is supposed to be used to denote an upper bound, whereas lower and tight bounds use entirely different notations (Omega and Theta, respectively). This understanding, however, doesn't help me understand what people mean by "best/worst/average case O(...)", because it seems like O(...) should always refer to the worst case.
As an example, I see claims to the effect that "hash maps are on average O(1)". As another, insertion with cuckoo hashing is sometimes said to be "expected O(1)" because the probability of kicking out other keys is bounded.
In both of these cases, big O notation seems to be used to describe something other than the asymptote of the worst-case complexity. For example, in the worst case we should expect every insertion into a normal hash map to be a collision; "average worst case" doesn't really make sense to me. "Expected worst case" seems similarly strange. I seem to be missing something important. What am I misunderstanding?
I've tried to check whether I misunderstood big O notation by looking up e.g. Wikipedia, Khan Academy, and similar resources. They seem to confirm that big O is used to denote the worst-case scenario only.
Answer 1
Score: 4
This is a common source of confusion when learning complexity analysis, and many resources do a poor job of explaining it. The answer is that big-O notation is not specific to worst-case complexity. Big-O, big-Omega and big-Theta are all tools for analyzing mathematical functions, no matter what those functions describe (they don't actually need to describe anything at all).
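For reference, here is a minimal sketch of the usual textbook definitions, stated for functions f, g mapping the natural numbers to the non-negative reals (the exact quantifiers and constants vary slightly between sources, so treat this as an illustration rather than the canonical form):

```latex
% Big-O: f eventually grows no faster than g, up to a constant factor.
f(n) \in O(g(n)) \iff \exists\, c > 0,\ n_0 : \forall n \ge n_0,\ f(n) \le c \cdot g(n)

% Big-Omega: f eventually grows at least as fast as g.
f(n) \in \Omega(g(n)) \iff \exists\, c > 0,\ n_0 : \forall n \ge n_0,\ f(n) \ge c \cdot g(n)

% Big-Theta: f grows at the same rate as g (both bounds hold).
f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n)) \text{ and } f(n) \in \Omega(g(n))
```

Nothing in these definitions mentions "worst case"; they are statements about functions, and it is up to us which function (worst-case cost, average-case cost, and so on) we plug in.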
In computer science, it just so happens that we most often apply big-O notation to functions that describe the worst-case runtime complexity of an algorithm - hence the confusion. But as you have discovered, it is also valid to apply it to functions that describe the average case, the best case, or anything else. However, in most situations big-O is the most useful notation to apply to the worst case, because the resulting expression also holds for the algorithm's complexity in all cases: if the worst case of some algorithm is O(n<sup>2</sup>), that means it can never be worse than that, so the algorithm's complexity in all cases is O(n<sup>2</sup>) - that is, no matter the input, the complexity will be described by some function that does not grow faster than n<sup>2</sup>.
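As a rough illustration of applying the same notation to different case functions, take the hash-map example from the question: a separately chained table with n keys in m buckets, assuming simple uniform hashing and a load factor α = n/m kept below a constant by resizing (the symbols T_avg, T_worst and the constant-load-factor assumption are introduced here just for illustration):

```latex
% Expected (average-case) number of keys examined by an unsuccessful lookup,
% under simple uniform hashing with chaining and a bounded load factor:
T_{\text{avg}}(n) = 1 + \alpha = 1 + \tfrac{n}{m} \in O(1)

% Worst-case number of keys examined (every key hashes to the same bucket):
T_{\text{worst}}(n) = n \in O(n) \quad (\text{and in fact } \Theta(n))

% Both statements use asymptotic notation, but they bound two different
% functions: the average-case cost and the worst-case cost of one operation.
```

So "hash maps are on average O(1)" is a big-O statement about the average-case cost function, not a claim about the worst case.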
Even so, it is perfectly valid to apply the other notations to the worst case - for example, an algorithm whose complexity function in the worst case is exactly 3n<sup>2</sup> + 4n + 8 can also be said to be Omega(n<sup>2</sup>) in the worst case. The reason this is not usually done is that it produces a less useful statement, since the statement allows for the possibility that the complexity is arbitrarily high, e.g. 2<sup>n</sup>. There is, however, a very interesting application of big-Omega to worst-case complexity: to express that comparison-based sorting is Omega(n lg n) in the worst case! This means that it is impossible to create a sorting algorithm whose central operation is comparing pairs of numbers and that is always faster than n lg n - there will always be some inputs for which the algorithm requires n lg n time or more. The algorithm might be faster in some cases, so it is not true that the best case for comparison-based sorting is Omega(n lg n), but the worst case is.
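To make those last two claims concrete, here is a rough sketch; the constants 15 and 3 are chosen for convenience, and the sorting bound is the standard decision-tree argument:

```latex
% The example worst-case cost function from above:
f(n) = 3n^2 + 4n + 8

% Upper bound: for n >= 1, 4n <= 4n^2 and 8 <= 8n^2, so f(n) <= 15 n^2,
% hence f(n) \in O(n^2).
% Lower bound: f(n) >= 3n^2 for all n >= 1, hence f(n) \in \Omega(n^2).
% Together: f(n) \in \Theta(n^2).

% Comparison-sorting lower bound (decision-tree argument): a correct algorithm
% must distinguish all n! orderings of its input, and a binary decision tree of
% height h has at most 2^h leaves, so
2^h \ge n! \;\Rightarrow\; h \ge \log_2(n!) = \Theta(n \lg n)
% i.e. the worst-case number of comparisons is \Omega(n \lg n).
```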
Comments