打印二叉搜索树所有路径的时间复杂度是多少?

huangapple go评论78阅读模式
英文:

What is the time complexity of printing all paths of a BST

问题

以下是翻译好的部分:

我有这段代码,与该站点上相同问题的某些其他代码略有不同:

public void printAllRootToLeafPaths(Node node, ArrayList path) {
    if (node == null) {
        return;
    }
    path.add(node.data);

    if (node.left == null && node.right == null) {
        System.out.println(path);
        return;
    } else {
        printAllRootToLeafPaths(node.left, new ArrayList(path));
        printAllRootToLeafPaths(node.right, new ArrayList(path));
    }
}

有人可以解释一下为什么最佳情况下的时间复杂度是O(nlogn)吗?
最坏的情况下,树分解为链表,数组被复制了1+2+3+...+n-1+n次,这相当于n^2,因此时间复杂度是O(n^2)。
最佳情况下,数组沿着路径每个节点都被复制。所以复制看起来像是1+2+3+...+logn。一共N/2次。那么为什么不是求和(logn)^2,如果1+2+3+...+n-1+n是n^2的话?这不是最佳情况的时间复杂度是O(n(logn)^2)吗?

英文:

I have this code, which is slightly different than the some of the other code with the same question on this site:

public void printAllRootToLeafPaths(Node node,ArrayList path) {
    if(node==null){
        return;
    }
    path.add(node.data);

    if(node.left==null && node.right==null)
    {
        System.out.println(path);
        return;
    }
    else {
        printAllRootToLeafPaths(node.left, new ArrayList(path));
        printAllRootToLeafPaths(node.right,new ArrayList(path));
    }      
}

Can someone explain WHY best case is O(nlogn)?
Worst case, the tree breaks down to a linked list and the array is copied 1+2+3+...+n-1+n times which is equivalent to n^2 so the time complexity is O(n^2).
Best case array is being copied along every node in the path. So copying it looks like 1+2+3+...+logn. N/2 times. So why isn't the summation (logn)^2 if 1+2+3+...+n-1+n is n^2? Making the best case O(n(logn)^2)?

答案1

得分: 1

是的,所有这些复制都需要考虑在内。

这些副本是不必要的;很容易编写以下函数:

  • 使用(单向)链表代替 ArrayList,或者
  • 使用单个 ArrayList,在不再需要路径时覆盖先前的路径。

对于非复制算法,算法的成本是打印路径的成本,即从根节点到叶子节点的所有路径长度之和。在每次调用时复制数组,会使成本变为从根节点到每个节点的路径长度之和,而不仅仅是每个叶子节点。 (数组被复制两次的事实实际上与复杂性分析无关;它只意味着乘以一个常数因子,可以忽略。)

使用非复制算法,最好的情况是线性树,每个节点只有一个子节点。然后只有一个具有路径长度为N的叶子节点,因此总成本为O(N)。但如果在每个节点都复制,那么这就是最坏情况的输入;节点路径长度是连续的整数,并且节点路径长度的总和是二次的。

对于您的算法,最好的情况是完全平衡的满树。在满树中,叶子节点的数量比非叶节点多一个;换句话说,大约一半的节点是叶子节点。在完全平衡树中,每个节点从根节点到达的最大步数为log N。因此,到每个节点的路径长度之和为O(N log N)。 (有些节点更近,但为了计算大O,我们可以忽略这个事实。即使我们要考虑它,我们会发现它不会改变渐近行为,因为每个连续级别的深度级别上的节点数量都会增加一倍。)因为一半的节点是叶子节点,所以对于非复制算法,这个输入的成本也是O(N log N)。

两种算法都表现出最坏情况下的二次复杂性。我们已经看到了复制算法的最坏情况输入:只有一个叶子的线性树。对于非复制算法,我们使用类似的情况:由左链组成的主干树,其中每个右链都是叶子:

                     根节点
                      /\
                     /  \
                    /\   7
                   /  \
                  /\   6
                 /  \
                /\   5
               /  \
              /\   4
             /  \
            /\   3
           /  \
           1  2

由于这棵树是完全平衡的,一半的节点是叶子节点,它们的距离逐渐增加,因此这也是二次的(尽管常数乘数较小)。

英文:

Yes, all that copying needs to be accounted for.

Those copies are not necessary; it's easy to write functions which either:

  • use a (singly) linked list instead of ArrayList, or
  • use a single ArrayList, overwriting previous paths when they're not needed any more.

With non-copying algorithms, the cost of the algorithm is the cost of printing the paths, which is the sum of the lengths of all paths from the root to a leaf. Copy the array on every call, makes the cost become the sum of the paths from the root to every node, rather than every leaf. (The fact that the array is copied twice is really not relevant to complexity analysis; it just means multiplying by a constant factor, which can be ignored.)

With the non-copying algorithm, the best case is a linear tree, with only one child per node. Then there is just one leaf with a path length of N, so the total cost is O(N). But that's a worst-case input if you're copying at every node; the node path lengths are successive integers, and the sum of node path lengths is quadratic.

For your algorithm, the best case is a perfectly-balanced fully-occupied tree. In a fully-occupied tree, there is one more leaf than non-leaf nodes; in other words, approximately half the nodes are leaves. In a perfectly-balanced tree, every node can be reached from the root in a maximum of log N steps. So the sum of path lengths to every node is O(N log N). (Some nodes are closer, but for computing big O we can ignore that fact. Even if we were to take it into account, though, we'd find that it doesn't change the asymptotic behaviour because the number of nodes at each depth level doubles for each successive level.) Because half the nodes are leaves, the cost of this input is O(N log N) with the non-copying algorithm as well.

Both algorithms exhibit worst-case quadratic complexity. We've already seen the worst-case input for the copying algorithm: a linear tree with just one leaf. For the non-copying algorithm, we use something very similar: a tree consisting of a left-linked backbone, where every right link is a leaf:

                         root
                          /\
                         /  \
                        /\   7
                       /  \
                      /\   6
                     /  \
                    /\   5
                   /  \
                  /\   4
                 /  \
                /\   3
               /  \
               1  2

Since that tree is fully-occupied, half its nodes are leaves, and they are at successively increasing distances, so that is again quadratic (although the constant multiplier is smaller.)

答案2

得分: 0

我认为您忽略了将整个数组复制到新数组的时间复杂度。


在根节点:增加一个元素,但创建了2个新数组

在根节点的左子节点:增加一个元素,但创建了2个具有2个元素的新数组

.
.
.

在倒数第二层:增加一个元素,但创建了大小为以2为底的log n的2个新数组

在最后一层:增加一个元素,并打印log n个元素。


因此,在每个节点上,有一个添加元素到列表的操作和一个添加大小为(log h)的列表的操作 - 其中h是树中节点的高度。

由于我们只遍历所有元素一次,总操作数为:n次添加 + 求和(log h1 + log h2 + log h3 + .... log hn)

其中h1、h2、h3...hn是每个节点的高度。

这大约是 n + O(n log n) ~ O(n log n)

英文:

I think you're missing the time complexity of copying the entire array to a new one.


At root node: one element is added, but 2 new arrays are created.

At left child node of root node: one element is added, but 2 new arrays are created with 2 elements.

.
.
.

At last but one level: one element is added, but 2 new arrays are created of size log n to base 2.

At last level: one element is added, and the log n elements are printed.


So at each node there is one operation of adding an element to the list and one operation of either printing or duplicating a list of size (log h) - where h is the height of the node in the tree.

Since we are traversing through all the elements only once, total number of operations are: n additions + Sum (log h1 + log h2 + log h3 + .... log hn)

Where h1, h2, h3...hn are heights of each node.

Which is approximately n + O(n log n) ~ O(n log n).

huangapple
  • 本文由 发表于 2020年7月24日 05:19:34
  • 转载请务必保留本文链接:https://go.coder-hub.com/63063279.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定