英文:
Is there a way to find semi-connected( unilaterally connected ) components in a directed graph with networkX?
问题
在标题中提到的,是否有一种方法可以在使用networkX的有向图中找到半连接(单向连接)的组件?我已经阅读了networkX的文档,但只找到了is_semiconnected来检查有向图是否连接,我没有看到其他选项(我有遗漏什么吗?)。我想要的是类似nx.connected_components的功能。例如,我有这样的图:
我希望结果是:[A,B,C], [A,E,F], [X] 和 [Y]。输出数据应该按顺序排列,因为在我的情况下,A->B表示A在B之前发生,B->C表示B在C之前发生,所以我希望保持图中的顺序不变。是否有一种方法可以做到这一点?如果一个节点与任何其他节点都没有连接,它也应该被视为半连接(对于我的情况是这样)。就像在示例中,如果我有另一个名为X的独立节点,那么X也应该包含在最终答案中。我知道这在有向图中有点奇怪,但我需要它来处理我的数据和输出。这有点像查找最大的半连接组件。
另外,我可以确保我的图中不会有循环。
英文:
I don't know if this is a weird question. Like in the title, is there a way to find semi-connected( unilaterally connected ) components in a directed graph with networkX? I do read the document of networkX but it just has is_semiconnected to check if a di-graph is connected or not, i don't see any other option( am I missing anything?). What I want is something like nx.connected_components. For example I have a graph like this :
I want the result to be : [A,B,C], [A,E,F], [X] and [Y]. The output data should be in the order, because in my case, A->B mean A happen before B,B->C mean B happen before C, so I want to keep the order as it is in the graph. Is there a way to do this? If a node is not connected to any other node, it should be considered semi-connected too( for my case that i'm working ). Like in the example, if I have another standalone node called X, then X should be in the final answer too. I know this is kinda weird for directed graph, but need it for my data and output. It's like finding maximal semi-connected components.
Also i can make sure that my graph won't have cycles in it.
答案1
得分: 2
以下是已经翻译好的部分:
这是一个有效利用 NetworkX 的算法。
- 计算 G 的强连通分量并将它们合并,形成一个有向无环图。
- 对DAG进行传递性减少。
- 枚举DAG中的所有最大路径。(这一步的技术措辞使前一步变得多余,但在传递性减少之后,我们只需枚举从源到汇的所有路径。)
- 对每个路径,将其扩展回到G中对应的顶点集。
Python 实现:
import itertools
import networkx
def semiconnected_components(G):
G0 = networkx.quotient_graph(G, list(networkx.strongly_connected_components(G)))
G1 = networkx.transitive_reduction(G0)
sources = {v for v in G1.nodes() if G1.in_degree(v) == 0}
sinks = {v for v in G1.nodes() if G1.out_degree(v) == 0}
for source in sources:
for path in networkx.all_simple_paths(G1, source, sinks):
yield list(itertools.chain(*map(sorted, path)))
# Work around a bug in all_simple_paths (does not return length-0 paths, see
# https://github.com/networkx/networkx/issues/6690).
yield from map(list, sources & sinks)
def main():
G = networkx.DiGraph()
G.add_nodes_from({"X"})
G.add_edges_from({"AB", "BC", "AE", "EF"})
for S in semiconnected_components(G):
print(S)
print()
G.add_edges_from({"CD", "DC", "CF"})
for S in semiconnected_components(G):
print(S)
if __name__ == "__main__":
main()
示例输出:
['A', 'B', 'C']
['A', 'E', 'F']
['X']
['A', 'B', 'C', 'D', 'F']
['A', 'E', 'F']
['X']
<details>
<summary>英文:</summary>
Here’s an algorithm that makes effective use of NetworkX.
1. Compute the strong components of G and contract them, forming a directed acyclic graph.
2. Transitively reduce the DAG.
3. Enumerate all maximal paths in the DAG. (The technical wording of this step makes the previous step redundant, but after the transitive reduction, we can just enumerate all paths from a source to a sink.)
4. For each path, expand it back to the corresponding set of vertices in G.
Python implementation:
``` python
import itertools
import networkx
def semiconnected_components(G):
G0 = networkx.quotient_graph(G, list(networkx.strongly_connected_components(G)))
G1 = networkx.transitive_reduction(G0)
sources = {v for v in G1.nodes() if G1.in_degree(v) == 0}
sinks = {v for v in G1.nodes() if G1.out_degree(v) == 0}
for source in sources:
for path in networkx.all_simple_paths(G1, source, sinks):
yield list(itertools.chain(*map(sorted, path)))
# Work around a bug in all_simple_paths (does not return length-0 paths, see
# https://github.com/networkx/networkx/issues/6690).
yield from map(list, sources & sinks)
def main():
G = networkx.DiGraph()
G.add_nodes_from({"X"})
G.add_edges_from({"AB", "BC", "AE", "EF"})
for S in semiconnected_components(G):
print(S)
print()
G.add_edges_from({"CD", "DC", "CF"})
for S in semiconnected_components(G):
print(S)
if __name__ == "__main__":
main()
Sample output:
['A', 'B', 'C']
['A', 'E', 'F']
['X']
['A', 'B', 'C', 'D', 'F']
['A', 'E', 'F']
['X']
答案2
得分: 1
这是一个枚举所有最大半连通子图的暴力函数。
这里使用一个名为 maximal_subsets(seq, pred)
的函数来枚举一个序列 seq
中满足谓词 pred
的所有最大子集。
你可以在这个相关问题的答案中找到对该函数的解释:
from itertools import chain, combinations
from collections import defaultdict
from networkx import is_semiconnected
def powerset_decreasing(iterable):
# "powerset_decreasing([1,2,3]) --> (1,2,3) (1,2) (1,3) (2,3) (1,) (2,) (3,)"
s = list(iterable)
return chain.from_iterable(combinations(s, r) for r in range(len(s), 0, -1))
def maximal_subsets(seq, pred):
cache = defaultdict(set)
yielded_something = False
for subset in powerset_decreasing(seq):
supersets = set.intersection(*(cache[x] for x in subset))
if (not supersets) and pred(subset):
yield subset
yielded_something = True
for x in subset:
cache[x].add(subset)
if (not yielded_something) and pred(()):
yield ()
def maximal_semiconnected_subgraphs(G):
def pred(node_subset):
return is_semiconnected(G.subgraph(node_subset))
yield from maximal_subsets(G.nodes, pred)
测试:
## A -> B \ /> F -> G
## > E
## C -> D / \> H -> I
from networkx import DiGraph
G = DiGraph()
G.add_edges_from('AB BE CD DE EF FG EH HI'.split())
components = [''.join(h) for h in maximal_semiconnected_subgraphs(G)]
print("COMPONENTS: ", components)
# COMPONENTS: ['ABEFG', 'ABEHI', 'ECDFG', 'ECDHI']
英文:
Here is a bruteforce function which will yield all maximally semiconnected subgraphs.
This uses a function maximal_subsets(seq, pred)
to yield all maximal subsets of a sequence seq
for a predicate pred
.
You can find an explanation for that function in an answer to this related question:
from itertools import chain, combinations
from collections import defaultdict
from networkx import is_semiconnected
def powerset_decreasing(iterable):
"powerset_decreasing([1,2,3]) --> (1,2,3) (1,2) (1,3) (2,3) (1,) (2,) (3,)"
s = list(iterable)
return chain.from_iterable(combinations(s, r) for r in range(len(s), 0, -1))
def maximal_subsets(seq, pred):
cache = defaultdict(set)
yielded_something = False
for subset in powerset_decreasing(seq):
supersets = set.intersection(*(cache[x] for x in subset))
if (not supersets) and pred(subset):
yield subset
yielded_something = True
for x in subset:
cache[x].add(subset)
if (not yielded_something) and pred(()):
yield ()
def maximal_semiconnected_subgraphs(G):
def pred(node_subset):
return is_semiconnected(G.subgraph(node_subset))
yield from maximal_subsets(G.nodes, pred)
Testing:
## A -> B \ /> F -> G
## > E
## C -> D / \> H -> I
from networkx import DiGraph
G = DiGraph()
G.add_edges_from('AB BE CD DE EF FG EH HI'.split())
components = [''.join(h) for h in maximal_semiconnected_subgraphs(G)]
print("COMPONENTS: ", components)
# COMPONENTS: ['ABEFG', 'ABEHI', 'ECDFG', 'ECDHI']
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论