Set sankey flow.fill to last node

Question

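In my example sankey diagram, flow.fill and flow.color are set by the previous node (e.g., all flows leaving "yellow" at time_0 have a yellow fill). I would like to color the flows by the final node instead. For instance, all flows going into "yellow" at time_1 (yellow-yellow, red-yellow) should have a yellow fill, rather than what you see below (red-yellow is red).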

    library(tidyverse)
    library(ggsankey)
    set.seed(2)

    # standard sankey
    df <- tibble(
      id = seq_len(22168),
      time_0 = c(rep("red", 13309), rep("yellow", 8699), rep("green", 160)),
      time_1 = c(rep("red", 1110), rep("yellow", 3771), rep("green", 8428),
                 rep("red", 321), rep("yellow", 1940), rep("green", 6438),
                 rep("red", 4), rep("yellow", 26), rep("green", 130))
    ) %>%
      {. ->> df2} %>%   # side effect: saves the unfactored tibble as df2
      mutate(across(starts_with("time"),
                    ~ factor(.x, levels = c("green", "yellow", "red"))))

    # long format: one row per (x, node) with its next_x / next_node
    df_sankey <- df %>%
      ggsankey::make_long(time_0, time_1)

    # number of observations per node, used in the labels
    df_sankey_t <- df_sankey %>%
      dplyr::group_by(x, node) %>%
      tally()

    df_sankey <- df_sankey %>%
      left_join(df_sankey_t, by = c("x", "node"))

    ggplot(df_sankey,
           aes(x = x, next_x = next_x,
               node = node, next_node = next_node,
               fill = factor(node),
               label = paste0(node, " n=", n))) +
      geom_sankey(flow.alpha = 0.6, node.color = "gray30") +
      geom_sankey_label(size = 3, color = "white", fill = "gray40") +
      # named values so each level gets its own color regardless of level order
      scale_fill_manual(values = c(green = "green", yellow = "yellow", red = "red")) +
      theme_sankey(base_size = 18) +
      theme(legend.position = "none",
            plot.title.position = "plot",
            plot.title = element_text(face = "bold", size = 20),
            plot.subtitle = element_text(size = 15)) +
      labs(title = "Example sankey diagram",
           subtitle = "Would like to color flow.fill and flow.color to be based on last node",
           x = NULL)

Example sankey diagram (image): https://i.stack.imgur.com/fwz2K.png
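To make the target concrete, the flows can be tabulated with the current and the desired fill side by side. This is only a base R sketch of the goal, not a plotting solution; the column names `fill_now` and `fill_wanted` are made up for illustration and do not come from ggsankey.

```r
# Rebuild the two time columns from the question
time_0 <- c(rep("red", 13309), rep("yellow", 8699), rep("green", 160))
time_1 <- c(rep("red", 1110), rep("yellow", 3771), rep("green", 8428),
            rep("red", 321), rep("yellow", 1940), rep("green", 6438),
            rep("red", 4), rep("yellow", 26), rep("green", 130))

# One row per flow (from -> to) with its size
flows <- as.data.frame(table(from = time_0, to = time_1),
                       responseName = "n", stringsAsFactors = FALSE)

# Hypothetical helper columns: what the plot does now vs. what is wanted
flows$fill_now    <- flows$from  # fill follows the previous node
flows$fill_wanted <- flows$to    # fill should follow the final node

# All flows ending in "yellow" should share the yellow fill
flows[flows$to == "yellow", ]
```

Any solution then only needs to map the fill to the `to` side of each flow rather than the `from` side.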
Answer 1

Score: 1

You can get part of the way with PantaRhei, and all the way by adding edits from grid:

    library(PantaRhei)
    library(dplyr)
    library(tibble)

    df1 <- tibble(
      id = seq_len(22168),
      time_0 = c(rep("red", 13309), rep("yellow", 8699), rep("green", 160)),
      time_1 = c(rep("red", 1110), rep("yellow", 3771), rep("green", 8428),
                 rep("red", 321), rep("yellow", 1940), rep("green", 6438),
                 rep("red", 4), rep("yellow", 26), rep("green", 130))
    ) |>
      mutate(across(starts_with("time"),
                    ~ factor(.x, levels = c("green", "yellow", "red"))))

    # Summarise data for flows.
    # The column names are the ones PantaRhei expects when processing the data.
    # The PantaRhei documentation uses the 'substance' variable to name the
    # 'substance' (or name) of the flow; here we use it to determine the fill,
    # so setting it to time_1 colors each flow by its destination.
    # There may be more efficient ways to define unique 'from' and 'to'
    # variables depending on your data.
    flows <-
      df1 |>
      group_by(time_0, time_1) |>
      summarise(quantity = n(), .groups = "drop") |>
      mutate(substance = time_1,
             from = case_when(time_0 == "yellow" ~ "B",
                              time_0 == "red" ~ "A",
                              time_0 == "green" ~ "C"),
             to = case_when(time_1 == "yellow" ~ "D",
                            time_1 == "red" ~ "E",
                            time_1 == "green" ~ "F"))

    # Build up a nodes data frame to set labels and positions of nodes
    nodes <-
      data.frame(ID = c(unique(flows$from), unique(flows$to)),
                 label = c(unique(flows$from), unique(flows$to)),
                 x = c(rep(1, 3), rep(2, 3)),
                 y = c("1", "1.25", "1.5", "C", "B", "A"),
                 label_pos = rep(c("left", "right"), each = 3))

    # One color per substance, i.e. per destination node
    colors <- tribble(
      ~substance, ~color,
      "yellow", "yellow",
      "red", "red",
      "green", "green"
    )

    sankey(nodes, flows, colors, legend = FALSE)

    # PantaRhei limitations
    # Although you can change the colour of the nodes, there does not seem to be
    # a way to colour nodes individually.
    # While individual nodes are labelled, I could not find a way to label the
    # node columns (not sure of the correct term).
    # Unable to control the formatting of the node quantity.
    # Not sure how to control the order of the 'to' nodes.
    # As PantaRhei is built on grid, you could probably add these features
    # yourself.

    # 'grid' based edits noted below:
    library(grid)

    # Functions to inspect the grid tree:
    # grid.force()
    # grid.ls()
    # grid.ls(grobs = FALSE, viewports = TRUE)

    # Make edits following inspection and a bit of trial and error...
    # The "GRID.polygon.NN" names are auto-generated, so they will probably
    # differ in your session; check them with grid.ls() first.
    grid.edit("GRID.polygon.38", gp = gpar(fill = "red"))
    grid.edit("GRID.polygon.33", gp = gpar(fill = "yellow"))
    grid.edit("GRID.polygon.29", gp = gpar(fill = "green"))
    grid.edit("GRID.polygon.17", gp = gpar(fill = "red"))
    grid.edit("GRID.polygon.21", gp = gpar(fill = "yellow"))
    grid.edit("GRID.polygon.25", gp = gpar(fill = "green"))

    # Get back to the root viewport
    popViewport()

    # Add labels to the node columns
    grid.text(label = c("time_0", "time_1"), x = c(0.25, 0.75), y = rep(0.15, 2))

    # You could also edit the node labels;
    # I've just edited one as an example
    grid.edit("GRID.text.39", label = "Red")

    # There may be more efficient ways to achieve the effect you desire.
    # I'm still getting to grips with grid...!
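The grid.edit() calls above depend on finding the right grob name by inspection. As a minimal, self-contained sketch of that inspect-then-edit workflow (independent of PantaRhei), the example below draws a single polygon, lists the grobs, and changes its fill. The name "flow_red_to_yellow" is a made-up label chosen for this illustration; grobs drawn by PantaRhei carry auto-generated names instead.

```r
library(grid)

grid.newpage()

# Draw one polygon with an explicit (hypothetical) name, standing in for a flow
grid.polygon(
  x = c(0.2, 0.8, 0.8, 0.2),
  y = c(0.3, 0.4, 0.7, 0.6),
  name = "flow_red_to_yellow",
  gp = gpar(fill = "red", alpha = 0.6)
)

# List the grobs currently on the display list to find the name to edit
grid.ls()

# Recolor the polygon in place, as the answer does for the PantaRhei grobs
grid.edit("flow_red_to_yellow", gp = gpar(fill = "yellow", alpha = 0.6))

# Retrieve the edited grob to confirm the new fill
edited <- grid.get("flow_red_to_yellow")
edited$gp$fill
```

With a real PantaRhei plot the same three steps apply: draw, look up the names with grid.ls(), then pass the name you found to grid.edit().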

Created on 2023-06-03 with reprex v2.0.2
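If switching plotting packages is acceptable, a different route that avoids manual grob edits is ggalluvial, which is not used anywhere in the question or answer. The sketch below is an assumption-laden alternative rather than a ggsankey or PantaRhei solution: with only two axes each alluvium is a single flow, so mapping `fill` to `time_1` inside `geom_alluvium()` colors every flow by its destination node. The flow counts are the ones from the question, entered directly as a summary table.

```r
library(ggplot2)
library(ggalluvial)

# Flow sizes from the question, one row per (time_0, time_1) pair
flows <- data.frame(
  time_0 = rep(c("red", "yellow", "green"), each = 3),
  time_1 = rep(c("red", "yellow", "green"), times = 3),
  n      = c(1110, 3771, 8428, 321, 1940, 6438, 4, 26, 130)
)
lv <- c("green", "yellow", "red")
flows$time_0 <- factor(flows$time_0, levels = lv)
flows$time_1 <- factor(flows$time_1, levels = lv)

p <- ggplot(flows, aes(axis1 = time_0, axis2 = time_1, y = n)) +
  # fill mapped to the destination column, so red-yellow is drawn yellow
  geom_alluvium(aes(fill = time_1), alpha = 0.6) +
  geom_stratum(fill = "gray40", color = "gray30") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)),
            color = "white", size = 3) +
  scale_x_discrete(limits = c("time_0", "time_1"), expand = c(0.1, 0.1)) +
  scale_fill_manual(values = c(green = "green", yellow = "yellow", red = "red")) +
  theme_minimal(base_size = 18) +
  theme(legend.position = "none")

p
```

The look differs from the ggsankey output (straight-edged strata, different spacing), so this is a trade of styling for a simpler fill mapping.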

Posted by huangapple on 2023-06-02 01:26:04
Source: https://go.coder-hub.com/76384319.html