Contract a dataframe of an edge list by summing the contracted edge weights from/to two nodes

Question

I have a dataframe df that contains edge weights between pairs of nodes:

    df <- data.frame(c("A","A","B","B","C","C"),
                     c("B","C","A","C","A","B"),
                     c(2,3,6,4,9,1))
    colnames(df) <- c("node_from", "node_to", "weight")
    print(df)
    # Output:
    #   node_from node_to weight
    # 1         A       B      2
    # 2         A       C      3
    # 3         B       A      6
    # 4         B       C      4
    # 5         C       A      9
    # 6         C       B      1

I would like to contract this dataframe by merging nodes A and B and summing all edge weights to and from these nodes with any other node, in this case C only. The result should be an edge list where the edges between A and B have disappeared and AB is now one node:

    # some code to merge nodes A and B
    print(df_contracted)
    # Output:
    #   node_from node_to weight
    # 1        AB       C      7
    # 3         C      AB     10

Is there a way to do this efficiently for larger dataframes?

I could convert the dataframe to an actual graph using graph_from_data_frame from the igraph package and then the contract function, but given that I have to do this operation multiple times I'd rather not have to convert it then reconvert it back every time.

Answer 1

Score: 4

base R approach

With base R, we can combine aggregate and subset as follows:

    aggregate(
      weight ~ .,
      subset(
        transform(
          df,
          node_from = gsub("A|B", "AB", node_from),
          node_to = gsub("A|B", "AB", node_to)
        ),
        node_from != node_to
      ),
      sum
    )

which gives

      node_from node_to weight
    1         C      AB     10
    2        AB       C      7
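
One caveat with the regular expression above: `gsub("A|B", "AB", ...)` rewrites every "A" or "B" character, so an unrelated node named, say, "BA1" would be changed as well. Below is a minimal sketch of a reusable variant that relabels by exact match instead. `contract_nodes` and its arguments are hypothetical names, not part of the answer, and the sketch assumes the columns are called node_from, node_to and weight as in the question.

```r
# Hypothetical helper (not from the original answer): merge the nodes in
# `to_merge` into one node and sum the weights of the resulting parallel edges.
contract_nodes <- function(edges, to_merge,
                           merged_name = paste(to_merge, collapse = "")) {
  # Coerce to character so the relabeling also works on factor columns
  from <- as.character(edges$node_from)
  to   <- as.character(edges$node_to)
  # Relabel by exact match, leaving all other node names untouched
  edges$node_from <- ifelse(from %in% to_merge, merged_name, from)
  edges$node_to   <- ifelse(to %in% to_merge, merged_name, to)
  # Drop self-loops, which include the former edges between merged nodes
  edges <- edges[edges$node_from != edges$node_to, , drop = FALSE]
  # Sum the weights for each (node_from, node_to) pair
  aggregate(weight ~ node_from + node_to, data = edges, FUN = sum)
}

df <- data.frame(
  node_from = c("A", "A", "B", "B", "C", "C"),
  node_to   = c("B", "C", "A", "C", "A", "B"),
  weight    = c(2, 3, 6, 4, 9, 1)
)
df_contracted <- contract_nodes(df, c("A", "B"))
print(df_contracted)
```

Because the result has the same three columns as the input, it can be passed straight back into the helper for a further contraction, without converting to a graph object in between.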

igraph approach

Here is an option using contract from igraph

    library(igraph)  # also provides the %>% pipe

    df %>%
      graph_from_data_frame() %>%
      contract(c(1, 1, 2), function(v) paste0(v, collapse = "")) %>%
      simplify() %>%
      get.data.frame()

which gives

      from to weight
    1   AB  C      7
    2    C AB     10
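
The mapping `c(1, 1, 2)` is hard-coded for the vertex order A, B, C. Below is a sketch of how the mapping could be derived from the vertex names instead, assuming the igraph package is installed. `to_merge`, `group_key` and `mapping` are illustrative names, and `as_data_frame` is used here in place of `get.data.frame`.

```r
library(igraph)

df <- data.frame(
  node_from = c("A", "A", "B", "B", "C", "C"),
  node_to   = c("B", "C", "A", "C", "A", "B"),
  weight    = c(2, 3, 6, 4, 9, 1)
)
g <- graph_from_data_frame(df)

to_merge <- c("A", "B")
# Grouping key: vertices in `to_merge` share one key, all others keep their name.
# The key is only used for grouping; the merged vertex name comes from paste0 below.
group_key <- ifelse(V(g)$name %in% to_merge,
                    paste(to_merge, collapse = ""),
                    V(g)$name)
# contract() expects, for each vertex, the id of the vertex it is merged into
mapping <- match(group_key, unique(group_key))

g_contracted <- contract(g, mapping,
                         vertex.attr.comb = function(v) paste0(v, collapse = ""))
# Remove self-loops and merge parallel edges, summing their weights
g_contracted <- simplify(g_contracted, edge.attr.comb = list(weight = "sum"))

df_contracted <- as_data_frame(g_contracted, what = "edges")
print(df_contracted)
```

If several contractions are needed, keeping the graph object between steps and converting back to a data frame only once at the end would avoid the repeated round trips the question is concerned about.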

Answer 2

Score: 2

Here's a dplyr solution:

    library(dplyr)
    to.merge <- c('A', 'B')
    merged.name <- paste(to.merge, collapse='')
    df %>%
      mutate(across(c(node_from, node_to),
                    ~ if_else(.x %in% to.merge, merged.name, .x))) %>%
      group_by(node_from, node_to) %>%
      summarise(weight = sum(weight), .groups = "drop") %>%
      filter(node_from != node_to)
    # # A tibble: 2 × 3
    #   node_from node_to weight
    #   <chr>     <chr>    <dbl>
    # 1 AB        C            7
    # 2 C         AB          10

It renames every node_from and node_to value that is "A" or "B" to "AB", groups rows with the same combination of node_from and node_to, sums the weights within each group, and finally removes the AB<->AB self-loop.
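
Since the operation has to be repeated, the same rename, group, sum and filter steps can also cover several merges in one pass. Below is a minimal base R sketch, assuming a named lookup vector from old node names to merged names. `node_map` and `relabel` are hypothetical names, and the extra nodes D and E and their weights are made up for illustration.

```r
# Made-up edge list with five nodes, for illustration only
edges <- data.frame(
  node_from = c("A", "B", "C", "D", "E", "A", "B"),
  node_to   = c("B", "C", "D", "E", "A", "D", "E"),
  weight    = c(1, 2, 3, 4, 5, 6, 7)
)

# Lookup vector: names are the old node names, values the merged names
node_map <- c(A = "AB", B = "AB", D = "DE", E = "DE")

# Replace every name found in the lookup vector, keep all others
relabel <- function(x, map) {
  x <- as.character(x)
  hit <- x %in% names(map)
  x[hit] <- map[x[hit]]
  x
}

edges$node_from <- relabel(edges$node_from, node_map)
edges$node_to   <- relabel(edges$node_to, node_map)

# Drop self-loops, then sum the weights of parallel edges
edges <- edges[edges$node_from != edges$node_to, ]
contracted <- aggregate(weight ~ node_from + node_to, data = edges, FUN = sum)
print(contracted)
```

Applying all merges at once relabels and aggregates the data frame a single time, which may be preferable to contracting one pair after another on a large edge list.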

Answer 3

Score: 2

You may subset away the A-B and B-A rows, then summarize by a grepl on 'C', and rbind the results.

    subset(df, rowSums(sapply(df[1:2], grepl, pat='A|B')) != 2) |>
      {\(.) by(., grepl('C', .$node_from), \(x) {
        data.frame(t(sapply(x[1:2], \(z) paste(unique(z), collapse=''))), weight=sum(x$weight))
      })}() |> unname() |> do.call(what='rbind')
    #   node_from node_to weight
    # 1        AB       C      7
    # 2         C      AB     10

Data:

    df <- structure(list(node_from = c("A", "A", "B", "B", "C", "C"), node_to = c("B",
    "C", "A", "C", "A", "B"), weight = c(2, 3, 6, 4, 9, 1)), class = "data.frame", row.names = c(NA,
    -6L))

huangapple
  • Published on April 13, 2023 at 23:52:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76007448.html