英文:
How to find time complexity of connected components graph algorith,
问题
我正在尝试找出以下算法的时间复杂度。到目前为止,我有两个选项:O(VE)和O(V + E)(其中V是顶点数,E是边数)。但我还是搞不清楚。
以下是算法:
对于图中的每个节点:
将其组件设置为其值; // O(n)
对于图中的每条边: // O(v)
如果其端点在不同的组件中
将每个组件中的所有节点设置为相同的组件 // O(n)
我已经得到了特定部分的最坏情况时间复杂度(如上所示),但不知道整体的时间复杂度如何计算。谢谢!
我尝试计算算法各个部分的复杂度,但不知道如何从中得到整体复杂度。
我有另一种算法。对于邻接表,我猜测它的时间复杂度为O(EV + V^2)。对于这个算法,我比第一个更有信心,但如果有人能确认一下就好了!
components(G):
对于图中的每个顶点v:
componentOf[v] = -1
compID = 0 // 组件ID
对于图中的每个顶点v:
如果componentOf[v] == -1:
dfsComponent(G, v, compID)
compID = compID + 1
dfsComponent(G, v, id):
componentOf[v] = id
对于每个(v,w) ∈ edges(G):
如果componentOf[w] == -1:
dfsComponent(G, w, id)
英文:
I am trying to find the time complexity of the following algorithm. So far I have two options: O(VE) and O(V + E) (where V is the number of vertices and E the number of edges). I just can't figure it out though.
Here is the alorithm:
for each node in the graph:
set its component to its value; // O(n)
for every edge in the graph: // O(v)
if its end vertices are in different components
set all the nodes in each of their components to the same component // O(n)
I have the worst case time complexities of specific parts (indicated above) but not the whole thing. How do I get the overall time complexity from this?
thanks!
I tried getting the complexity of individual sections of the algorithm. I don't know how to get the overall complexity from this.
I have this alternative algorithm. For an adjacency list, I am guessing this would be O(EV + V^2). I am more confident about this one than the first one, but if someone could confirm that would be great!
components(G):
for all vertices v ∈ G:
componentOf[v] = -1
compID = 0 // component ID
for all vertices v ∈ G:
if componentOf[v] == -1:
dfsComponent(G, v, compID)
compID = compID + 1
dfsComponent(G, v, id):
componentOf[v] = id
for each (v,w) ∈ edges(G):
if componentOf[w] == -1:
dfsComponent(G, w, id)
答案1
得分: 0
由于输入的变量是𝑉和𝐸,在你的分析中不应该有类似O(𝑛)的内容。此外,第二个(外部)循环的迭代次数取决于𝐸而不是𝑉。
其次,“将每个组件中的所有节点设置为相同的组件”这一步可以用不同的方式实现:
- 选择一个新的组件编号,并更新两个组件中的节点。
- 读取第一个组件中使用的组件编号,并将第二个组件中的节点更新为该组件编号。
- 检查两个组件中哪个最小(通过维护每个组件编号的大小),并更新该组件中的节点以获取另一个组件的组件编号(并更新所选组件编号的大小)。
变体1
在第一个变体中,最坏情况发生在我们访问的每条边都连接一个单独的组件与一个迄今为止仍然是单独组件的节点。这意味着我们必须更新逐渐增加数量的节点的组件编号:2 + 3 + 4 + ... + 𝑉。这比三角数少1,因此表示为O(𝑉²)。此外,我们还必须访问所有边,包括连接已知属于同一组件的节点的边,因此时间复杂度为O(𝐸 + 𝑉²)。
变体2
在最坏情况下,第二个变体与第一个变体具有相同的复杂度,因为我们可能不幸地总是使用单独组件的节点的组件编号来更新大组件的节点。
变体3
第三个变体对此进行了改进。
如果我们在第一个算法中进行了修正,那么在之前的变体中的最坏情况(将一个迄今为止孤立的节点添加到一个增长的组件中)将成为这里的最佳情况:每个添加的边将表示O(1)的工作(仅更新一个节点的组件编号)。这对于这个变体来说不是最坏情况。现在,如果两个节点属于大小相似的组件,那么最坏情况发生。我们可以想象大小为1 + 1 = 2,2 + 2 = 4,...等。我们可以看到特定节点最多会在log𝑉次合并中成为一部分,因为它所属的组件在每次合并时大小会加倍。这导致时间复杂度为O(𝐸 + 𝑉log𝑉)。
DFS算法
你在最后提出的DFS算法再次进行了改进。如果我们将图视为无向图,那么一条边最多会被遍历两次:一次是正向遍历,一次是反向遍历,并且第二次不会有递归调用,因为终点节点已经被访问过(并且有一个组件编号)。我假设边的列表(v,w)
可以从邻接表中检索出来,其时间与节点v
的边数成线性关系(当我们有邻接矩阵时,情况就不同了)。
就节点而言,我们可以看到在主循环中至少需要访问每个节点一次。
因此,它的时间复杂度为O(𝐸 + 𝑉)。
英文:
As the variables of the input are 𝑉 and 𝐸, there shouldn't be something like O(𝑛) in your analysis. Also, the second (outer) loop will have a number of iterations that depends on 𝐸, not 𝑉.
Secondly, the step "set all the nodes in each of their components to the same component" can be implemented in different ways:
- Choose a new component number and update the nodes in both components
- Read the component number used in the first component, and update the nodes in the second component to that component number
- Check which of the two components is the smallest (by maintaining a size per component number), and update the nodes in that component to get the component number of the other component (and update the size of the chosen component number).
Variant 1
In the first variant, the worst case occurs when each edge we visit, connects a single component with a node that is so far a component on its own. That means we have to update the component number of a steadily increasing number of nodes: 2 + 3 + 4 + ... + 𝑉. This is one less than a triangular number, and so represents O(𝑉²). Add to this that we have to visit all edges, also those that connect nodes that are already known to belong to the same component, and we end up with O(𝐸 + 𝑉²).
Variant 2
In the worst case, the second variant has the same complexity as the first, as we may be unlucky and always update the nodes of the large component with the component number of the node that is a component on its own.
Variant 3
The third variant improves on this.
If we correct that in the first algorithm, then what would be a worst case in the previous variants (adding a so-far isolated node to a growing component), becomes a best case here: each added edge would then represent O(1) work (updating the component number of one node only). This is not the worst case for this variant. Now it is worst case if the two nodes belong to components that are about the same size. We can imagine sizes 1 + 1 = 2, and 2 + 2 = 4, ...etc. We can see that a specific node will be part of a merge at most log𝑉 times, because the component it belongs to will double in size at each merge. This leads to a time complexity of O(𝐸 + 𝑉log𝑉).
DFS algorithm
The DFS algorithm you presented at the end is again an improvement. If we consider the graph to be undirected, then an edge will be traversed at most two times: once in each direction, and the second time no recursive call will follow, as the end node will already have been visited (and have a component number). I assume the list of edges (v,w)
can be retrieved from an adjacency list in a time that is linear to the number of edges the node v
has (which would not be the case when we have an adjacency matrix).
In terms of nodes we can see we will have to visit each node at least once in the main loop.
So its time complexity is O(𝐸 + 𝑉).
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论