Confusion in the concept of constant time/space complexity

Question

I am confused with the concept of constant time/space complexity.

For example:

public void recurse(int x) {
	if(x==0) return;
	else recurse(x/10);
}

where 1 < x <= 2147483647

If we want to express the space complexity for this function in terms of big O notation and count the stack space for recursion, what will be the space complexity?

I am confused between:

  1. O(1) - The maximum value of an int in Java is 2147483647, so it will recurse at most 10 times.
  2. O(log x) - The number of recursive calls depends on the number of digits in x, so there will be at most ~log10(x) recursions.

If we say it is O(1), then wouldn't any algorithm with a finite input domain have its time/space complexity bounded by some constant? So let's take the case of insertion sort on an array of numbers in Java. The largest array you can have in Java has size 2147483647, so does that mean T(n) = O(2147483647^2) = O(1)?

Or should I just look at it as O(1) being a loose bound, while O(log x) is a tighter bound?

Here is the definition I found on Wikipedia:

An algorithm is said to be constant time (also written as O(1) time) if the value of T(n) is bounded by a value that does not depend on the size of the input. For example, accessing any single element in an array takes constant time as only one operation has to be performed to locate it. In a similar manner, finding the minimal value in an array sorted in ascending order; it is the first element. However, finding the minimal value in an unordered array is not a constant time operation as scanning over each element in the array is needed in order to determine the minimal value. Hence it is a linear time operation, taking O(n) time. If the number of elements is known in advance and does not change, however, such an algorithm can still be said to run in constant time.

Answer 1

Score: 3

When analysing the time and space complexity of algorithms, we have to ignore some limitations of physical computers; the complexity is a function of the "input size" n, which in big O notation is an asymptotic upper bound as n tends to infinity, but of course a physical computer cannot run the algorithm for arbitrarily large n because it has a finite amount of memory and other storage.

So to do the analysis in a meaningful way, we analyse the algorithm on an imaginary kind of computer where there is no limit on an array's length, where integers can be "sufficiently large" for the algorithm to work, and so on. Your Java code is a concrete implementation of the algorithm, but the algorithm exists as an abstract idea beyond the boundary of what is possible in Java on a real computer. So running this abstract algorithm on an imaginary computer with no such limits, the space complexity is O(log n).

This kind of "imaginary computer" might sound a bit vague, but it is something that can be mathematically formalised in order to do the analysis rigorously; it is called a model of computation. In practice, unless you are doing academic research then you don't need to analyse an algorithm that rigorously, so it's more useful to get comfortable with the vaguer notion that you should ignore any limits which would prevent the algorithm running on an arbitrarily large input.
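
To make the O(log n) growth concrete, here is a rough sketch of my own (not part of the original question): the same algorithm written with Java's BigInteger, so that x is no longer capped at 2147483647. Each call strips one decimal digit, so the recursion depth, and hence the stack space, grows with the number of digits of x.

import java.math.BigInteger;

public class RecurseSketch {
    // Same structure as recurse(int x), but with no fixed upper bound on x:
    // the recursion depth is proportional to the number of decimal digits.
    public static void recurse(BigInteger x) {
        if (x.equals(BigInteger.ZERO)) return;
        recurse(x.divide(BigInteger.TEN));
    }

    public static void main(String[] args) {
        // A 40-digit number: about 40 stack frames, far beyond what int allows.
        recurse(new BigInteger("1234567890123456789012345678901234567890"));
    }
}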

Answer 2

Score: 3

It really depends on why you are using the big-O notation.

You are correct in saying that, technically, any algorithm is O(1) if it only works for a finite number of possible inputs. For example, this would be an O(1) sorting algorithm: "Read the first 10^6 bits of input. If there are more bits left in the input, output "error". Otherwise, bubblesort."
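
As a minimal Java sketch of that bounded-input trick (my own illustration; the cap is arbitrary and stands in for the "first 10^6 bits" from the description above):

public class TechnicallyConstantSort {
    static final int CAP = 1_000_000; // fixed cap on the input size

    public static void sort(int[] a) {
        if (a.length > CAP) {
            throw new IllegalArgumentException("error: input exceeds the fixed cap");
        }
        // Plain bubble sort: at most CAP * CAP comparisons in the worst case,
        // which is a (huge) constant, so the method is "technically O(1)".
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j + 1 < a.length - i; j++) {
                if (a[j] > a[j + 1]) {
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                }
            }
        }
    }
}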

But the benefit of the notation lies in the fact that it usually approximates the actual running time of a program well. While an O(n) algorithm could, in principle, perform 10^100 * n operations, this is usually not the case, and this is why we use the big-O notation at all. Exceptions from this rule are known as galactic algorithms, the most famous one being the Coppersmith–Winograd matrix multiplication algorithm.

To sum up, if you want to be technical and win an argument with a friend, you could say that your algorithm is O(1). If you want to actually use the bound to approximate how fast it is, what you should do is imagine it works for arbitrarily large numbers and just call it O(log(n)).

Side note: Calling this algorithm O(log(n)) is a bit informal, as, technically, the complexity would need to be expressed in terms of size of input, not its magnitude, thus making it O(n). The rule of thumb is: if you're working with small numbers, express the complexity in terms of the magnitude - everyone will understand. If you're working with numbers with potentially millions of digits, then express the complexity in terms of the length. In this case, the cost of "simple" operations such as multiplication (which, for small numbers, is often considered to be O(1)) also needs to be factored in.
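
To spell out the relationship between the two measures (an added note, not from the original answer): a positive integer x has

    d = floor(log10(x)) + 1

decimal digits, so the recursion depth is roughly log10(x) = Θ(d). The bound O(log x) stated in terms of the magnitude and the bound O(d) stated in terms of the input length (what the paragraph above calls O(n)) are the same bound up to constant factors.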

Answer 3

Score: 0

Constant time or space means that the time and space used by the algorithm don't depend on the size of the input.

A constant time (hence O(1)) algorithm would be

public int square(int x){
    return x * x;
}

because for any input it performs the same single multiplication and returns.

On the other hand, to sum all elements of an array

public int sum(int[] array){
    int sum = 0;
    for(int i : array) sum += i;
    return sum;
}

takes O(n) time, where n is the size of the array. It depends directly on the size of the input.

The space complexity behaves the same way.
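
For example (a sketch of my own, not from the original answer): the sum method above needs only one extra int, so it uses O(1) extra space, whereas building a reversed copy of the array needs a second array of the same length, so it uses O(n) extra space.

public int[] reversed(int[] array) {
    int[] copy = new int[array.length]; // extra space grows with the input: O(n)
    for (int i = 0; i < array.length; i++) {
        copy[i] = array[array.length - 1 - i];
    }
    return copy;
}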

Anything that doesn't depend on the size of the input is considered constant.

Answer 4

Score: 0

When your algorithm doesn't depend on the size of the input, it is said to have constant time complexity. For example:

public void print(int input) {
    // 10 fixed println statements here; the count does not depend on 'input'
}

Here, no matter what you pass in as 'input', the statements in the method body always run 10 times. If you pass 'input' as 10, 10 statements run; if you pass 'input' as 20, still 10 statements run.

Now, on the other hand, consider this:

public void print(int input) {
    // This loop runs 'input' times
    for (int i = 0; i < input; i++) {
        System.out.println(i);
    }
}

This algorithm's running time depends on the size of the input. If you pass 'input' as 10, the loop runs 10 times; if you pass 'input' as 20, it runs 20 times. The work grows at the same pace as 'input' grows, so in this case the time complexity is said to be O(n).

Answer 5

Score: 0

Applying asymptotic complexity to the real world is tricky, as you have discovered.

Asymptotic complexity deals with the abstract situation where the input size N has no upper limit, and you're only interested in what happens as the input size becomes arbitrarily large.

In the real world, in the practical applications you're interested in, the input size often has an upper limit. That upper limit may come from the fact that you don't have infinite resources (time/money) to collect data, or it may be imposed by technical limitations, like the fixed size of the int datatype in Java.

Since asymptotic complexity analysis does not account for real-world limitations, the asymptotic complexity of recurse(x) is O(log x), even though we know that x can only grow up to 2^31 - 1.
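
One way to see this (a sketch of my own, not part of the original answer) is to make the recursion depth explicit; it equals the number of decimal digits of x, the int limit just never lets it exceed 10 frames:

public static int depth(int x) {
    if (x == 0) return 0;
    return 1 + depth(x / 10);
}

// depth(7) == 1, depth(1000) == 4, depth(2147483647) == 10:
// the stack usage tracks log10(x); the int range merely caps it at 10.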
