Set sankey flow.fill to last node

Question

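In my example sankey diagram, flow.fill and flow.color are set by the previous node (e.g., all flows leaving "yellow" at time_0 have a yellow fill). I would like to color the flows by the final node instead. For instance, all flows going into "yellow" at time_1 (yellow-yellow, red-yellow) should have a yellow fill, rather than what you see below (red-yellow is red).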

library(tidyverse)
library(ggsankey)

set.seed(2)

# standard sankey

df <- tibble(
  id = seq_len(22168),
  time_0 = c(rep("red", 13309), rep("yellow", 8699), rep("green", 160)),
  time_1 = c(rep("red", 1110), rep("yellow", 3771), rep("green", 8428),
             rep("red", 321), rep("yellow", 1940), rep("green", 6438),
             rep("red", 4), rep("yellow", 26), rep("green", 130))
) %>%
  {. ->> df2} %>%  # also save the character-valued copy as df2
  mutate(across(starts_with("time"),
                ~ factor(.x, levels = c("green", "yellow", "red"))))

df_sankey <- df %>%
  ggsankey::make_long(time_0, time_1)

df_sankey_t <- df_sankey %>%
  dplyr::group_by(x, node) %>%
  tally()

df_sankey <- df_sankey %>%
  left_join(df_sankey_t, by = c("x", "node"))

ggplot(df_sankey,
             aes(x = x, next_x = next_x,
                 node = node, next_node = next_node,
                 fill = factor(node),
                 label = paste0(node, " n=", n))) +
  geom_sankey(flow.alpha = 0.6, node.color = "gray30") +
  geom_sankey_label(size = 3, color = "white", fill = "gray40") +
  scale_fill_manual(values = c("green", "red", "yellow")) +
  theme_sankey(base_size = 18) +
  theme(legend.position = "none",
        plot.title.position = "plot",
        plot.title = element_text(face="bold", size=20),
        plot.subtitle = element_text(size=15)) +
  labs(title = "Example sankey diagram",
       subtitle = "Would like to color flow.fill and flow.color to be based on last node",
       x = NULL)

[Image: current sankey output, with flows colored by the previous node][1]


[1]: https://i.stack.imgur.com/fwz2K.png

Answer 1

Score: 1

You can get part of the way using PantaRhei, and all the way with some extra edits from grid:


library(PantaRhei)
library(dplyr)
library(tibble)


df1 <- tibble(
  id = seq_len(22168),
  time_0 = c(rep("red", 13309), rep("yellow", 8699), rep("green", 160)),
  time_1 = c(rep("red", 1110), rep("yellow", 3771), rep("green", 8428),
             rep("red", 321), rep("yellow", 1940), rep("green", 6438),
             rep("red", 4), rep("yellow", 26), rep("green", 130))
) |>
  mutate(across(starts_with("time"),
                ~ factor(.x, levels = c("green", "yellow", "red"))))

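# summarise data for flows
# the column names are specific to PantaRhei for processing the data
# PantaRhei documentation uses the 'substance' variable to name the 'substance' or name
# of the flow; in this case we'll use it to determine the fill.
# There may be more efficient ways to define unique 'from' and 'to' variables depending on your data.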

flows <- 
  df1 |>
  group_by(time_0, time_1) |>
  summarise(quantity = n(), .groups = "drop") |>
  mutate(substance = time_1,
         from = case_when(time_0 == "yellow" ~ "B",
                          time_0 == "red" ~ "A",
                          time_0 == "green" ~ "C"),
         to = case_when(time_1 == "yellow" ~ "D",
                        time_1 == "red" ~ "E",
                        time_1 == "green" ~ "F")) 

# build up a nodes data frame
# to set labels and position of nodes

nodes <- 
  data.frame(ID = c(unique(flows$from), unique(flows$to)),
             label = c(unique(flows$from), unique(flows$to)),
             x = c(rep(1, 3), rep(2, 3)),
             y = c("1", "1.25", "1.5", "C", "B", "A"),
             label_pos = rep(c("left", "right"), each = 3))


colors <- tribble(
  ~substance, ~color,
  "yellow",    "yellow",
  "red",    "red",
  "green",    "green"
)


sankey(nodes, flows, colors, legend = FALSE)
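
The comment above notes that there may be more efficient ways to define unique 'from' and 'to' variables. One possible sketch is to build the IDs by pasting a column prefix onto the node value, so no case_when() mapping is needed. The toy data and the "t0_" / "t1_" prefixes are assumptions made for illustration and are not part of the original answer; the nodes data frame would have to use the same IDs.

```r
library(dplyr)
library(tibble)

# Small stand-in for df1 (hypothetical rows, not the original counts).
toy <- tibble(
  time_0 = c("red", "red", "yellow", "yellow", "green"),
  time_1 = c("yellow", "yellow", "yellow", "green", "red")
)

flows_alt <- toy |>
  count(time_0, time_1, name = "quantity") |>
  mutate(
    substance = time_1,              # fill follows the destination node
    from = paste0("t0_", time_0),    # hypothetical ID scheme for source nodes
    to   = paste0("t1_", time_1)     # hypothetical ID scheme for target nodes
  )

flows_alt
```

The key point is unchanged from the answer: because substance is taken from time_1, every flow is colored by the node it ends in.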

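# PantaRhei limitations
# Although you can change the colour of the nodes, there does not seem to be a way to colour nodes individually.
# While individual nodes are labelled, I could not find a way to label the node columns (not sure of the correct term).
# Unable to control the formatting of the node quantity.
# Not sure how to control the order of the 'to' nodes.
# As PantaRhei is built on grid, you could probably add these features to your plot.

# 'grid' based edits noted below:

library(grid)
# functions to inspect the grid tree.
# grid.force()
# grid.ls()
# grid.ls(grobs = FALSE, viewports = TRUE)

# make edits following inspection and a bit of trial and error...
grid.edit("GRID.polygon.38", gp = gpar(fill="red"))
grid.edit("GRID.polygon.33", gp = gpar(fill="yellow"))
grid.edit("GRID.polygon.29", gp = gpar(fill="green"))

grid.edit("GRID.polygon.17", gp = gpar(fill="red"))
grid.edit("GRID.polygon.21", gp = gpar(fill="yellow"))
grid.edit("GRID.polygon.25", gp = gpar(fill="green"))

# get back to the root viewport
popViewport()

# add labels to node columns

grid.text(label = c("time_0", "time_1"), x = c(0.25, 0.75), y = rep(0.15, 2))

# you could also edit the node labels
# I've just edited one as an example

grid.edit("GRID.text.39", label = "Red")

# There may be more efficient ways to achieve the effect you desire.
# I'm still getting to grips with grid...!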

Created on 2023-06-03 with reprex v2.0.2
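
The grob names used above ("GRID.polygon.38" and so on) are auto-generated and can change between sessions, so typing them in by hand is fragile. As a rough sketch, the names can be collected with grid.ls(print = FALSE) and filtered by pattern before editing. The three rectangles below are stand-ins drawn only for illustration; with a real PantaRhei plot you would still need to inspect which polygon is which (and possibly call grid.force() first) before assigning colors.

```r
library(grid)

# Draw on a null PDF device so the sketch runs without opening a window.
pdf(NULL)
grid.newpage()

# Three stand-in polygons; in the real plot these would be the shapes
# drawn by sankey().
for (i in 1:3) {
  grid.polygon(x = c(0.1, 0.9, 0.9, 0.1),
               y = c(0.1, 0.1, 0.3, 0.3) + (i - 1) * 0.3,
               gp = gpar(fill = "grey80"))
}

# List every grob on the display list and keep the polygon names.
listing <- grid.ls(print = FALSE)
poly_names <- grep("polygon", listing$name, value = TRUE)

# Recolor each polygon by name instead of hard-coding the names.
new_fills <- c("green", "yellow", "red")
for (i in seq_along(poly_names)) {
  grid.edit(poly_names[i], gp = gpar(fill = new_fills[i]))
}

# Read the fills back to confirm the edits were applied.
edited_fills <- vapply(poly_names,
                       function(nm) grid.get(nm)$gp$fill,
                       character(1))
dev.off()
```

The mapping from polygon order to color is an assumption here; in the answer's plot it was worked out by inspection and trial and error.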

Posted by huangapple on 2023-06-02 01:26:04
Please keep this link when reposting: https://go.coder-hub.com/76384319.html