Get shortest distance from src to all nodes optimization
Question
I have an input `int[][] edges`. The column length of the array is 2, so each row of the 2D array has 2 elements. Each element is a vertex, and every edge is bidirectional, so the 2D array describes an undirected graph.
I am trying to find the shortest distance from one particular node to all nodes, in this case from node `0` to every node that exists.
I have code that works, but I think I am recomputing things, which I want to avoid. I call the function `computeDistPerNode(m, 0, key);` again and again, and I am sure it recomputes the distance from `0` to nodes it has already seen in prior calls. I am unable to optimize it and leverage the past computation. How do I do it?
Here is the working code without optimization:

    public Map<Integer, List<Integer>> createUnDirectedGraph(int [][]edges) {
        Map<Integer, List<Integer>> m = new HashMap<>();
        for(var i = 0; i<edges.length; i++) {
            m.put(edges[i][0], new ArrayList<>());
            m.put(edges[i][1], new ArrayList<>());
        }
        for(var edge:edges) {
            var v1 = edge[0];
            var v2 = edge[1];
            m.get(v1).add(v2);
            m.get(v2).add(v1);
        }
        return m;
    }

    public int[] getShortestDistances(Map<Integer, List<Integer>> m) {
        int distance[] = new int[m.size()];
        for(Integer key:m.keySet()) {
            var d = computeDistPerNode(m,0,key);
            distance[key] = d;
        }
        return distance;
    }

    public int computeDistPerNode(Map<Integer, List<Integer>> m, int src, int dest) {
        Queue<Integer> q = new LinkedList<>();
        Integer dist[] = new Integer[m.size()];
        Set<Integer> visited = new HashSet<>();
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        q.add(src);
        while(!q.isEmpty()) {
            var currNode = q.poll();
            if(visited.contains(currNode)) continue;
            visited.add(currNode);
            if(currNode == dest) {
                return dist[dest];
            }
            for(var child: m.get(currNode)) {
                if (visited.contains(child)) {
                    continue;
                }
                q.offer(child);
                var newDist = 1 + dist[currNode];
                if(newDist<dist[child]) {
                    dist[child] = newDist;
                }
            }
        }
        return -1;
    }

    public int[][] getsample() {
        int [][] edges = {
            {0,1},
            {0,2},
            {1,4},
            {2,3},
            {4,3},
            {0,4},
        };
        return edges;
    }

Answer 1
Score: 3
<details>
<summary>English:</summary>
You can calculate the distance from the source node to all the other nodes in one go.
The method `int computeDistPerNode(Map<Integer, List<Integer>> m, int src, int dest)` returns as soon as it reaches the destination node. Change it to return the `dist` array once the queue is empty. Here is your modified method:

    public Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
        Set<Integer> visited = new HashSet<>();
        Integer[] dist = new Integer[m.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        Queue<Integer> q = new LinkedList<>();
        q.add(src);
        while(!q.isEmpty()) {
            var currNode = q.poll();
            if(visited.contains(currNode)) continue;
            visited.add(currNode);
            for(var child: m.get(currNode)) {
                if (visited.contains(child)) continue;
                q.offer(child);
                var newDist = 1 + dist[currNode];
                if(newDist < dist[child]) {
                    dist[child] = newDist;
                }
            }
        }
        return dist;
    }

## Improvements
If you reposition your lines a little, you can avoid three `if` checks. This results in cleaner, more readable code.

    public Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
        Set<Integer> visited = new HashSet<>();
        Integer[] dist = new Integer[m.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        Queue<Integer> q = new LinkedList<>();
        visited.add(src); // mark source visited here
        q.add(src);
        while(!q.isEmpty()) {
            var currNode = q.poll();
            for(var child: m.get(currNode)) {
                if (!visited.contains(child)) {
                    visited.add(child);
                    q.offer(child);
                    dist[child] = 1 + dist[currNode];
                }
            }
        }
        return dist;
    }

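To see the single-pass version replace the repeated `computeDistPerNode` calls, here is a minimal, self-contained sketch that runs it on the sample edges from the question. The class name `SingleSourceBfsDemo` and the helper `buildGraph` are hypothetical names chosen for illustration, and the sketch assumes the vertices are labelled `0..n-1`, as the question's `distance[key]` indexing already does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class SingleSourceBfsDemo {

    // Hypothetical helper: builds an undirected adjacency map from an edge list.
    static Map<Integer, List<Integer>> buildGraph(int[][] edges) {
        Map<Integer, List<Integer>> m = new HashMap<>();
        for (int[] edge : edges) {
            m.computeIfAbsent(edge[0], k -> new ArrayList<>()).add(edge[1]);
            m.computeIfAbsent(edge[1], k -> new ArrayList<>()).add(edge[0]);
        }
        return m;
    }

    // The optimized method from the answer: one BFS fills every distance.
    static Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
        Set<Integer> visited = new HashSet<>();
        Integer[] dist = new Integer[m.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        Queue<Integer> q = new LinkedList<>();
        visited.add(src); // mark the source as visited when it is enqueued
        q.add(src);
        while (!q.isEmpty()) {
            var currNode = q.poll();
            for (var child : m.get(currNode)) {
                if (!visited.contains(child)) {
                    visited.add(child);
                    q.offer(child);
                    dist[child] = 1 + dist[currNode];
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        // A single call replaces the loop over computeDistPerNode(m, 0, key).
        Integer[] dist = computeDistFromSource(buildGraph(edges), 0);
        System.out.println(Arrays.toString(dist)); // prints [0, 1, 1, 2, 1]
    }
}
```

Nodes that cannot be reached from the source keep the value `Integer.MAX_VALUE`, so a caller may want to map that to `-1` to match the original method's convention.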
## Analysis
The algorithm employed is [Breadth-first search](https://en.wikipedia.org/wiki/Breadth-first_search). According to [Wikipedia](https://en.wikipedia.org/wiki/Breadth-first_search#Time_and_space_complexity):
> The time complexity can be expressed as `O(|V| + |E|)`, since every vertex and every edge will be explored in the worst case. `|V|` is the number of vertices and `|E|` is the number of edges in the graph. Note that `O(|E|)` may vary between `O(1)` and `O(|V|^2)`, depending on how sparse the input graph is.
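The bound can be checked empirically on the sample graph. The following sketch (the class `BfsWorkCounter` and its methods are hypothetical names used only for this illustration) counts how many vertices are dequeued and how many adjacency entries are inspected during one BFS. For an undirected graph stored as adjacency lists, each edge appears twice, so the second counter should come out as `2 * |E|`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class BfsWorkCounter {

    // Hypothetical helper: builds an undirected adjacency map from an edge list.
    static Map<Integer, List<Integer>> buildGraph(int[][] edges) {
        Map<Integer, List<Integer>> m = new HashMap<>();
        for (int[] edge : edges) {
            m.computeIfAbsent(edge[0], k -> new ArrayList<>()).add(edge[1]);
            m.computeIfAbsent(edge[1], k -> new ArrayList<>()).add(edge[0]);
        }
        return m;
    }

    // Returns {vertices dequeued, adjacency entries inspected} for one BFS.
    static int[] countWork(Map<Integer, List<Integer>> graph, int src) {
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> queue = new ArrayDeque<>();
        visited.add(src);
        queue.add(src);
        int dequeued = 0;
        int inspected = 0;
        while (!queue.isEmpty()) {
            int current = queue.poll();
            dequeued++;
            for (int child : graph.get(current)) {
                inspected++;
                // Set.add returns true only the first time a vertex is seen.
                if (visited.add(child)) {
                    queue.add(child);
                }
            }
        }
        return new int[] {dequeued, inspected};
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        // 5 vertices and 6 edges: expect 5 dequeues and 2 * 6 = 12 inspections.
        System.out.println(Arrays.toString(countWork(buildGraph(edges), 0))); // prints [5, 12]
    }
}
```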
## Question
> Can you help me understand how a larger value of newDist might not get written in current dist[child] without that check? I think the reason is that a child due to the nature of BFS/using queue will be visited first when an univisited node is pulled out and hence the check is not required?
The `if(newDist < dist[child])` condition is necessary for your code to work correctly, but it is not required in the optimized code. The reason is the point at which a node gets marked as visited. In your code, a node is marked as visited only after it is polled from the queue. In the optimized code, `visited.add(child)` runs immediately after a node is discovered. This makes a big difference.
Consider your input graph

    0 ------ 1
    |\       |
    | \      |
    |  \     |
    |   \    |
    |    \   |
    |     \  |
    |      \ |
    |       \|
    |        4
    |        |
    2 ------ 3

##### Working of your code
The source vertex is 0. Before the loop `while (!q.isEmpty())` begins, we add it to the queue.
In the while loop, we remove 0 and mark it as visited. We explore its neighbors 1, 2 and 4 in that order. We update their distances to 1 and add all of them to the queue. *However, none of them has been marked as visited yet.*
Now we go back to the start of the while loop, poll 1, mark it as visited and explore its neighbors 0 and 4. We do not update the distance of 0 since it is visited. *We add 4 to the queue again even though it is already in the queue.* Adding the same node to the queue twice is wasteful in itself. *Notice that without the `if(newDist < dist[child])` condition, the distance of 4 would be overwritten with 2, which is wrong.*
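The effect of dropping the check can be reproduced with a small experiment. This is a sketch, not the asker's exact method: it keeps the late marking (a node is marked as visited only when polled) but runs the BFS to completion and takes a hypothetical `keepCheck` flag that switches the `newDist < dist[child]` guard on or off. The class and method names are made up for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class LateMarkingDemo {

    // Hypothetical helper: builds an undirected adjacency map from an edge list.
    static Map<Integer, List<Integer>> buildGraph(int[][] edges) {
        Map<Integer, List<Integer>> m = new HashMap<>();
        for (int[] edge : edges) {
            m.computeIfAbsent(edge[0], k -> new ArrayList<>()).add(edge[1]);
            m.computeIfAbsent(edge[1], k -> new ArrayList<>()).add(edge[0]);
        }
        return m;
    }

    // BFS that marks a node as visited only when it is polled.
    // keepCheck toggles the "newDist < dist[child]" guard.
    static int[] lateMarkingBfs(Map<Integer, List<Integer>> graph, int src, boolean keepCheck) {
        int[] dist = new int[graph.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(src);
        while (!queue.isEmpty()) {
            int current = queue.poll();
            if (!visited.add(current)) {
                continue; // duplicate queue entry, already processed
            }
            for (int child : graph.get(current)) {
                if (visited.contains(child)) {
                    continue;
                }
                queue.add(child); // may enqueue a node that is already queued
                int newDist = 1 + dist[current];
                if (!keepCheck || newDist < dist[child]) {
                    dist[child] = newDist;
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        Map<Integer, List<Integer>> graph = buildGraph(edges);
        System.out.println(Arrays.toString(lateMarkingBfs(graph, 0, true)));  // prints [0, 1, 1, 2, 1]
        System.out.println(Arrays.toString(lateMarkingBfs(graph, 0, false))); // prints [0, 1, 1, 3, 2]
    }
}
```

Without the guard, node 4 is overwritten with 2 when node 1 is processed, and that wrong value then propagates to node 3.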
##### Working of the optimized code
The source vertex is 0. Before the loop `while (!q.isEmpty())` begins, we add it to the queue and mark it as visited right there.
In the while loop, we remove 0. We explore its neighbors 1, 2 and 4 in that order. We update their distances to 1, add all of them to the queue and mark all of them as visited. *Hence their distances can never be updated again.*
Now we go back to the start of the while loop, poll 1 and explore its neighbors 0 and 4. We do not update the distance of 0 or 4 since both of them are already visited. Node 4 is also not added to the queue a second time.
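The difference in queue traffic between the two variants can also be counted. The sketch below (the class `EnqueueCounter` and its methods are hypothetical, written only for this comparison) counts how many times any node is added to the queue when marking happens at poll time versus at discovery time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class EnqueueCounter {

    // Hypothetical helper: builds an undirected adjacency map from an edge list.
    static Map<Integer, List<Integer>> buildGraph(int[][] edges) {
        Map<Integer, List<Integer>> m = new HashMap<>();
        for (int[] edge : edges) {
            m.computeIfAbsent(edge[0], k -> new ArrayList<>()).add(edge[1]);
            m.computeIfAbsent(edge[1], k -> new ArrayList<>()).add(edge[0]);
        }
        return m;
    }

    // Marks a node as visited when it is polled (the question's approach).
    static int enqueuesWithLateMarking(Map<Integer, List<Integer>> graph, int src) {
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(src);
        int enqueues = 1;
        while (!queue.isEmpty()) {
            int current = queue.poll();
            if (!visited.add(current)) {
                continue; // duplicate entry
            }
            for (int child : graph.get(current)) {
                if (!visited.contains(child)) {
                    queue.add(child);
                    enqueues++;
                }
            }
        }
        return enqueues;
    }

    // Marks a node as visited when it is discovered (the optimized approach).
    static int enqueuesWithEarlyMarking(Map<Integer, List<Integer>> graph, int src) {
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> queue = new ArrayDeque<>();
        visited.add(src);
        queue.add(src);
        int enqueues = 1;
        while (!queue.isEmpty()) {
            int current = queue.poll();
            for (int child : graph.get(current)) {
                if (visited.add(child)) {
                    queue.add(child);
                    enqueues++;
                }
            }
        }
        return enqueues;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        Map<Integer, List<Integer>> graph = buildGraph(edges);
        System.out.println(enqueuesWithLateMarking(graph, 0));  // prints 7
        System.out.println(enqueuesWithEarlyMarking(graph, 0)); // prints 5
    }
}
```

With early marking, every reachable node enters the queue exactly once, so the count equals the number of vertices; with late marking, nodes 4 and 3 are each queued twice on this graph.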
</details>
# Answer 2
**Score**: 1
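If you use a `min-priority-queue` or `min-heap`, you get Dijkstra's algorithm, which runs in `O((|V| + |E|) log |V|)` with a binary heap. For an unweighted graph like yours, the BFS from @AKSingh's [answer](https://stackoverflow.com/a/76298775/1202808) is already `O(|V| + |E|)`, so a priority queue pays off mainly once the edges carry different weights.
Wikipedia has a good description of [Dijkstra's algorithm](https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm), which is the standard technique for solving the shortest-path problem with a `min-priority-queue`. [Here](https://takeuforward.org/data-structure/dijkstras-algorithm-using-priority-queue-g-32/) is a more tutorial-oriented description with a lot of figures to visualize the algorithm.
The following is some sample code that implements the algorithm. I apologize that it is not in `Java`, but the translation should be straightforward.
Sample Code
-----------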
```c++
#include <algorithm>
#include <iostream>
#include <limits>
#include <map>
#include <queue>
#include <utility>
#include <vector>

using NodePair = std::pair<int, int>;
using NodePairs = std::vector<NodePair>;
using DistanceVertex = std::pair<int, int>;
using MinQueue = std::priority_queue<DistanceVertex,
                                     std::vector<DistanceVertex>,
                                     std::greater<DistanceVertex>>;

int main() {
    // The sample problem, given as a list of undirected edges.
    const NodePairs input {
        { 0, 1 },
        { 0, 2 },
        { 1, 4 },
        { 2, 3 },
        { 4, 3 },
        { 0, 4 }
    };

    // Store the graph as an adjacency list using a multimap. Each
    // undirected edge is inserted in both directions. Also find out
    // how many vertices there are.
    std::multimap<int, int> edges;
    int max_vertex{};
    for (auto [a, b] : input) {
        edges.emplace(a, b);
        edges.emplace(b, a);
        max_vertex = std::max({max_vertex, a, b});
    }
    int number_vertices = max_vertex + 1;

    // Initialize the distance from source to each vertex as MAX_INT.
    int source{};
    std::vector<int> distance(number_vertices, std::numeric_limits<int>::max());

    // Initialize distance to source and priority queue
    MinQueue pq;
    distance[source] = 0;
    pq.emplace(0, source);

    while (!pq.empty()) {
        auto [udist, udx] = pq.top();
        pq.pop();

        // Skip stale entries that were superseded by a shorter path.
        if (udist > distance[udx]) continue;

        // Iterate over all neighbors of udx. Every edge has weight 1.
        auto [begin, end] = edges.equal_range(udx);
        for (auto iter = begin; iter != end; ++iter) {
            int vdx = iter->second;
            int vdist = 1;
            // If there is a shorter path, record it
            if (udist + vdist < distance[vdx]) {
                distance[vdx] = udist + vdist;
                pq.emplace(distance[vdx], vdx);
            }
        }
    }

    // distance now contains the shortest distance between source and each node
    for (auto i = 0; i < number_vertices; ++i)
        std::cout << distance[i] << std::endl;
    return 0;
}
```
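Since the question is in Java, here is one possible translation of the same idea, offered as a sketch rather than a definitive port. It assumes every edge has weight 1 and that vertices are labelled `0..vertexCount-1`; the class name `DijkstraUnitWeights` and the method `shortestDistances` are hypothetical. It uses `java.util.PriorityQueue` with queue entries of the form `{distance, vertex}`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class DijkstraUnitWeights {

    // Dijkstra's algorithm on an undirected graph where each edge costs 1.
    static int[] shortestDistances(int[][] edges, int vertexCount, int source) {
        // Adjacency list; each undirected edge is stored in both directions.
        List<List<Integer>> adjacency = new ArrayList<>();
        for (int i = 0; i < vertexCount; i++) {
            adjacency.add(new ArrayList<>());
        }
        for (int[] edge : edges) {
            adjacency.get(edge[0]).add(edge[1]);
            adjacency.get(edge[1]).add(edge[0]);
        }

        int[] distance = new int[vertexCount];
        Arrays.fill(distance, Integer.MAX_VALUE);
        distance[source] = 0;

        // Min-heap ordered by the tentative distance stored at index 0.
        PriorityQueue<int[]> pq =
                new PriorityQueue<>(Comparator.comparingInt((int[] entry) -> entry[0]));
        pq.add(new int[] {0, source});

        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            int udist = top[0];
            int u = top[1];
            if (udist > distance[u]) {
                continue; // stale entry: a shorter path to u was already found
            }
            for (int v : adjacency.get(u)) {
                int candidate = udist + 1; // unit edge weight (assumption)
                if (candidate < distance[v]) {
                    distance[v] = candidate;
                    pq.add(new int[] {candidate, v});
                }
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        System.out.println(Arrays.toString(shortestDistances(edges, 5, 0))); // prints [0, 1, 1, 2, 1]
    }
}
```

To support weighted edges, the adjacency list would need to carry a weight per neighbor and `candidate` would add that weight instead of 1.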