Get shortest distance from src to all nodes optimization

# Question

My input is `int[][] edges`, where every row has exactly two elements. Each element is a vertex, and each row describes a bidirectional edge between the two, so the 2D array as a whole describes an undirected graph.

I am trying to find the shortest distance from one particular node to all other nodes, in this case from node 0 to every node that exists.

I have code that works, but I think I am re-computing things, which I want to avoid. I call `computeDistPerNode(m, 0, key)` again and again, and I am sure it recomputes the distance from 0 to nodes it has already seen in prior calls. I am unable to optimize it and leverage the past computation. How do I do it?

Here is the working code without optimization:

```java
public Map<Integer, List<Integer>> createUnDirectedGraph(int[][] edges) {
    Map<Integer, List<Integer>> m = new HashMap<>();
    for (var i = 0; i < edges.length; i++) {
        m.put(edges[i][0], new ArrayList<>());
        m.put(edges[i][1], new ArrayList<>());
    }
    for (var edge : edges) {
        var v1 = edge[0];
        var v2 = edge[1];
        m.get(v1).add(v2);
        m.get(v2).add(v1);
    }
    return m;
}

public int[] getShortestDistances(Map<Integer, List<Integer>> m) {
    int distance[] = new int[m.size()];
    for (Integer key : m.keySet()) {
        var d = computeDistPerNode(m, 0, key);
        distance[key] = d;
    }
    return distance;
}

public int computeDistPerNode(Map<Integer, List<Integer>> m, int src, int dest) {
    Queue<Integer> q = new LinkedList<>();
    Integer dist[] = new Integer[m.size()];
    Set<Integer> visited = new HashSet<>();
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;
    q.add(src);
    while (!q.isEmpty()) {
        var currNode = q.poll();
        if (visited.contains(currNode)) continue;
        visited.add(currNode);
        if (currNode == dest) {
            return dist[dest];
        }

        for (var child : m.get(currNode)) {
            if (visited.contains(child)) {
                continue;
            }

            q.offer(child);
            var newDist = 1 + dist[currNode];
            if (newDist < dist[child]) {
                dist[child] = newDist;
            }
        }
    }
    return -1;
}

public int[][] getsample() {
    int[][] edges = {
            {0, 1},
            {0, 2},
            {1, 4},
            {2, 3},
            {4, 3},
            {0, 4},
    };
    return edges;
}
```

# Answer 1

**Score**: 3

<details>
<summary>English:</summary>

You can calculate the distance from the source node to all the other nodes in one go.

The method `int computeDistPerNode(Map<Integer, List<Integer>> m, int src, int dest)` returns as soon as it reaches the destination node. Change it to return the whole `dist` array once the queue is empty. Here is your modified method:

```java
public Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
    Set<Integer> visited = new HashSet<>();

    Integer[] dist = new Integer[m.size()];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;

    Queue<Integer> q = new LinkedList<>();
    q.add(src);

    while (!q.isEmpty()) {
        var currNode = q.poll();
        if (visited.contains(currNode)) continue;
        visited.add(currNode);

        for (var child : m.get(currNode)) {
            if (visited.contains(child)) continue;

            q.offer(child);
            var newDist = 1 + dist[currNode];
            if (newDist < dist[child]) {
                dist[child] = newDist;
            }
        }
    }

    return dist;
}
```


## Improvements
If you rearrange the lines a little, you can replace the three `if` checks with a single one. This makes the code cleaner and more readable.

```java
public Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
    Set<Integer> visited = new HashSet<>();

    Integer[] dist = new Integer[m.size()];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;

    Queue<Integer> q = new LinkedList<>();
    visited.add(src);   // mark source visited here
    q.add(src);

    while (!q.isEmpty()) {
        var currNode = q.poll();

        for (var child : m.get(currNode)) {
            if (!visited.contains(child)) {
                visited.add(child);
                q.offer(child);
                dist[child] = 1 + dist[currNode];
            }
        }
    }

    return dist;
}
```
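
As a quick sanity check, the improved method can be run on the sample graph from the question. The sketch below is self-contained; the class name `BfsDemo` and the helper `buildGraph` are made up for this example, and it assumes node ids are exactly 0..n-1 as in the sample (otherwise `dist[child]` would index out of bounds).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical wrapper class, only here so the snippet compiles on its own.
class BfsDemo {

    // Builds an undirected adjacency map from a list of [u, v] pairs.
    static Map<Integer, List<Integer>> buildGraph(int[][] edges) {
        Map<Integer, List<Integer>> m = new HashMap<>();
        for (int[] e : edges) {
            m.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            m.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }
        return m;
    }

    // Same logic as the improved method: one BFS, nodes marked on discovery.
    static Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
        Set<Integer> visited = new HashSet<>();
        Integer[] dist = new Integer[m.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;

        Queue<Integer> q = new LinkedList<>();
        visited.add(src);
        q.add(src);

        while (!q.isEmpty()) {
            int currNode = q.poll();
            for (int child : m.get(currNode)) {
                if (!visited.contains(child)) {
                    visited.add(child);
                    q.offer(child);
                    dist[child] = 1 + dist[currNode];
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        Integer[] dist = computeDistFromSource(buildGraph(edges), 0);
        System.out.println(Arrays.toString(dist)); // [0, 1, 1, 2, 1]
    }
}
```

Nodes that cannot be reached from the source keep the value `Integer.MAX_VALUE`, so a caller that wants `-1` for "unreachable" (as the original method returned) would have to map that value itself.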


## Analysis
The algorithm employed is [Breadth-first search](https://en.wikipedia.org/wiki/Breadth-first_search). According to [Wikipedia](https://en.wikipedia.org/wiki/Breadth-first_search#Time_and_space_complexity):

> The time complexity can be expressed as `O(|V| + |E|)`, since every vertex and every edge will be explored in the worst case. `|V|` is the number of vertices and `|E|` is the number of edges in the graph. Note that `O(|E|)` may vary between `O(1)` and `O(|V|^2)`, depending on how sparse the input graph is.
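
To make the difference concrete: the original approach runs one BFS per destination, so in the worst case it does roughly `|V|` times the work of a single BFS. The rough sketch below counts how many adjacency-list entries each variant scans on the sample graph. The class `BfsCost`, its method names, and the counter are assumptions made for illustration and are not part of the original code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical helper class that measures BFS work on a small graph.
class BfsCost {

    // Builds an undirected adjacency list for nodes 0..n-1.
    static List<List<Integer>> adjacency(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Runs a BFS from src and returns the number of adjacency entries scanned.
    // If dest >= 0, the search stops as soon as dest is polled (early exit).
    static int scannedEntries(List<List<Integer>> adj, int src, int dest) {
        boolean[] visited = new boolean[adj.size()];
        Queue<Integer> q = new ArrayDeque<>();
        visited[src] = true;
        q.add(src);
        int scanned = 0;
        while (!q.isEmpty()) {
            int curr = q.poll();
            if (curr == dest) break;
            for (int child : adj.get(curr)) {
                scanned++;
                if (!visited[child]) {
                    visited[child] = true;
                    q.add(child);
                }
            }
        }
        return scanned;
    }

    // One BFS that visits the whole graph.
    static int singlePass(List<List<Integer>> adj, int src) {
        return scannedEntries(adj, src, -1);
    }

    // One early-exit BFS per destination, as in the question's outer loop.
    static int perDestination(List<List<Integer>> adj, int src) {
        int total = 0;
        for (int dest = 0; dest < adj.size(); dest++) {
            total += scannedEntries(adj, src, dest);
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        List<List<Integer>> adj = adjacency(5, edges);
        System.out.println("single pass:     " + singlePass(adj, 0));
        System.out.println("per destination: " + perDestination(adj, 0));
    }
}
```

On a connected graph the single pass scans every adjacency entry exactly once, which is `2|E|` entries (12 for the six undirected sample edges). The per-destination total grows with both the number of nodes and how far each one is from the source.
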
## Question
> Can you help me understand how a larger value of newDist might not get written in current dist[child] without that check? I think the reason is that a child due to the nature of BFS/using queue will be visited first when an unvisited node is pulled out and hence the check is not required?

The `if (newDist < dist[child])` condition is necessary for your code to work correctly. It is not required in the optimized code. The reason is the placement of the `visited.add(...)` call. In your code, a node is marked as visited only after it is polled from the queue. In the optimized code, a node is marked as visited as soon as it is discovered. This makes a big difference.

Consider your input graph:

```
0 ------- 1
|\        |
|  \      |
|    \    |
|      \  |
|        \|
|         4
|         |
|         |
|         |
2 ------- 3
```


##### Working of your code
The source vertex is 0. Before the loop `while (!q.isEmpty())` begins, we add it to the queue.

In the while loop, we remove 0 and mark it as visited. We explore its neighbors 1, 2 and 4 in that order. We update their distance to 1 and add all of them to the queue. *However, none of them has been marked as visited yet.*

Now we go back to the start of the while loop, poll 1, mark it as visited and explore its neighbors 0 and 4. We do not update the distance of 0 since it is visited. *We add 4 to the queue again even though it is already in the queue.* Adding the same node to the queue twice is not a good thing in itself. *Notice that without the `if (newDist < dist[child])` condition, the distance of 4 would be updated to 2, which is wrong.*
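
The effect described above can be reproduced with a small experiment. The sketch below keeps the "mark as visited when polled" structure of the question's code and makes the `newDist < dist[child]` check optional through a flag. The class `LateMarkingBfs` and the `keepMinCheck` parameter are hypothetical names introduced only for this demonstration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical demo class: BFS that marks nodes only when they are polled.
class LateMarkingBfs {

    static int[] bfs(int n, int[][] edges, int src, boolean keepMinCheck) {
        // Undirected adjacency list for nodes 0..n-1.
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }

        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        boolean[] visited = new boolean[n];
        Queue<Integer> q = new ArrayDeque<>();
        q.add(src);

        while (!q.isEmpty()) {
            int curr = q.poll();
            if (visited[curr]) continue;
            visited[curr] = true;          // late marking, as in the question
            for (int child : adj.get(curr)) {
                if (visited[child]) continue;
                q.add(child);              // may enqueue the same node twice
                int newDist = 1 + dist[curr];
                if (!keepMinCheck || newDist < dist[child]) {
                    dist[child] = newDist; // without the check this can overwrite a shorter distance
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        System.out.println(Arrays.toString(bfs(5, edges, 0, true)));  // [0, 1, 1, 2, 1]
        System.out.println(Arrays.toString(bfs(5, edges, 0, false))); // [0, 1, 1, 3, 2]
    }
}
```

With the check removed, node 4 ends up at distance 2 (overwritten while processing node 1), and the error then propagates to node 3.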
##### Working of the optimized code
The source vertex is 0. Before the loop `while (!q.isEmpty())` begins, we add it to the queue and mark it as visited right away.

In the while loop, we remove 0. We explore its neighbors 1, 2 and 4 in that order. We update their distance to 1, add all of them to the queue and mark all of them as visited. *Hence their distance can never be updated again.*

Now we go back to the start of the while loop, poll 1 and explore its neighbors 0 and 4. We do not update the distance of 0 or 4 since both of them are already visited. Node 4 is also not added to the queue a second time.
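
The same contrast can be measured by counting queue insertions. The sketch below is only an illustration; `EnqueueCount`, `markOnPoll` and `markOnDiscovery` are invented names, and node ids are assumed to be 0..n-1.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical demo class comparing the two marking strategies.
class EnqueueCount {

    static List<List<Integer>> adjacency(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Marks a node only when it is polled (as in the question's code).
    static int markOnPoll(List<List<Integer>> adj, int src) {
        boolean[] visited = new boolean[adj.size()];
        Queue<Integer> q = new ArrayDeque<>();
        q.add(src);
        int enqueues = 1;
        while (!q.isEmpty()) {
            int curr = q.poll();
            if (visited[curr]) continue;
            visited[curr] = true;
            for (int child : adj.get(curr)) {
                if (visited[child]) continue;
                q.add(child);
                enqueues++;
            }
        }
        return enqueues;
    }

    // Marks a node as soon as it is discovered (as in the optimized code).
    static int markOnDiscovery(List<List<Integer>> adj, int src) {
        boolean[] visited = new boolean[adj.size()];
        Queue<Integer> q = new ArrayDeque<>();
        visited[src] = true;
        q.add(src);
        int enqueues = 1;
        while (!q.isEmpty()) {
            int curr = q.poll();
            for (int child : adj.get(curr)) {
                if (!visited[child]) {
                    visited[child] = true;
                    q.add(child);
                    enqueues++;
                }
            }
        }
        return enqueues;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        List<List<Integer>> adj = adjacency(5, edges);
        System.out.println(markOnPoll(adj, 0));      // 7: nodes 4 and 3 are enqueued twice
        System.out.println(markOnDiscovery(adj, 0)); // 5: every node is enqueued exactly once
    }
}
```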
</details>
# Answer 2
**Score**: 1

If you use a min-priority-queue (min-heap), you end up with Dijkstra's algorithm, which runs in `O((|V| + |E|) log |V|)` with a binary heap. Note that with the improvements from @AKSingh's [answer](https://stackoverflow.com/a/76298775/1202808), the BFS is already `O(|V| + |E|)`, which cannot be beaten on an unweighted graph, so the priority queue only becomes necessary once the edges carry different non-negative weights.

Wikipedia has a good description of [Dijkstra's algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm), which is the standard technique for solving the shortest-path problem with a min-priority-queue. [Here](https://takeuforward.org/data-structure/dijkstras-algorithm-using-priority-queue-g-32/) is a more tutorial-oriented description with a lot of figures to visualize the algorithm.

The following is some sample code that implements the algorithm. I apologize that it is not in Java, but the translation should be straightforward.

Sample Code

```c++
#include <algorithm>
#include <iostream>
#include <limits>
#include <map>
#include <queue>
#include <utility>
#include <vector>

using NodePair = std::pair<int, int>;
using NodePairs = std::vector<NodePair>;

using DistanceVertex = std::pair<int, int>;  // (distance, vertex)
using MinQueue = std::priority_queue<DistanceVertex,
                                     std::vector<DistanceVertex>,
                                     std::greater<DistanceVertex>>;

int main() {
    // The sample problem: a list of undirected edges.
    const NodePairs input {
        { 0, 1 },
        { 0, 2 },
        { 1, 4 },
        { 2, 3 },
        { 4, 3 },
        { 0, 4 }
    };

    // Store the graph as an adjacency list using a multimap. Each
    // undirected edge is inserted in both directions.
    std::multimap<int, int> edges;
    int max_vertex{};
    for (auto [a, b] : input) {
        edges.emplace(a, b);
        edges.emplace(b, a);
        max_vertex = std::max({max_vertex, a, b});
    }
    int number_vertices = max_vertex + 1;

    // Initialize the distance from source to each vertex as MAX_INT.
    int source{};
    std::vector<int> distance(number_vertices, std::numeric_limits<int>::max());

    // Initialize distance to source and priority queue
    MinQueue pq;
    distance[source] = 0;
    pq.emplace(0, source);

    while (!pq.empty()) {
        auto [udist, udx] = pq.top();
        pq.pop();

        // Skip stale queue entries that were superseded by a shorter path
        if (udist > distance[udx]) continue;

        // Iterate over all neighbors of udx
        auto [begin, end] = edges.equal_range(udx);
        for (auto iter = begin; iter != end; ++iter) {
            int vdx = iter->second;
            int vdist = 1;  // the graph is unweighted, so every edge costs 1

            // If there is a shorter path, record it
            if (udist + vdist < distance[vdx]) {
                distance[vdx] = udist + vdist;
                pq.push({distance[vdx], vdx});
            }
        }
    }

    // distance now contains the shortest distance between source and each node
    for (auto i = 0; i < number_vertices; ++i)
        std::cout << distance[i] << std::endl;

    return 0;
}
```
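
Since the question is in Java, here is a rough Java rendering of the same idea using `java.util.PriorityQueue`. It is a sketch, not a drop-in replacement: the class name `DijkstraSketch`, the `int[]{distance, vertex}` queue entries, and the assumptions that every edge has weight 1 and that node ids are 0..n-1 are mine, not the answer's.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical class name; Dijkstra's algorithm with a binary-heap priority queue.
class DijkstraSketch {

    static int[] shortestDistances(int n, int[][] edges, int source) {
        // Undirected adjacency list: each edge is stored in both directions.
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }

        int[] distance = new int[n];
        Arrays.fill(distance, Integer.MAX_VALUE);
        distance[source] = 0;

        // Queue entries are {distance, vertex}, ordered by distance.
        PriorityQueue<int[]> pq =
                new PriorityQueue<>(Comparator.comparingInt((int[] a) -> a[0]));
        pq.add(new int[] {0, source});

        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            int udist = top[0];
            int u = top[1];

            // Skip stale entries that were superseded by a shorter path.
            if (udist > distance[u]) continue;

            for (int v : adj.get(u)) {
                int candidate = udist + 1; // assumed edge weight of 1
                if (candidate < distance[v]) {
                    distance[v] = candidate;
                    pq.add(new int[] {candidate, v});
                }
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        System.out.println(Arrays.toString(shortestDistances(5, edges, 0))); // [0, 1, 1, 2, 1]
    }
}
```

For weighted graphs, the adjacency list would need to store a weight per edge and `candidate` would add that weight instead of the constant 1.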

huangapple
  • Published on May 21, 2023 at 10:48:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76298093.html