fct_reorder by date on recurring items

Question

I'm trying to make a Gantt-style chart for the complex antibiotic regimens we sometimes have to give patients.

Sometimes the antibiotic dose is changed and then put back to the original dose, or the antibiotic is changed completely and then changed back again at a later date.

I've tried to capture the problem in the reprex below.

library(tidyverse)
library(lubridate)

df <- tribble(
  ~id,  ~start, ~end, ~antibiotic,
  1,    "02/02/23", "22/02/23", "A 1g",
  2,    "22/02/23", "10/03/23", "A 2g",
  3,    "10/03/23", "15/03/23", "A 1g",
  4,    "28/02/23", "11/03/23", "B 1g",
  5,    "11/03/23", "03/04/23", "B 2g",
  6,    "03/04/23", "10/04/23", "B 1g")

# trying to reorder the antibiotic factor level based on start date then pivot longer into tidy data form
df <- df %>%
  mutate(across(c("start", "end"), dmy),
         antibiotic = fct_reorder(antibiotic, start, .desc = TRUE)) %>%
  pivot_longer(
    cols = c("start", "end"),
    names_to = "start_end",
    values_to = "date"
  )

# create gantt style plot
df %>%
  ggplot(aes(x = date)) +
  geom_line(aes(y = antibiotic, colour = antibiotic, group = id), linewidth = 5)

[Image: Gantt-style plot produced by the code above]

I was expecting antibiotic B 1g to be displayed above B 2g, as it has a start date before B 2g, just like antibiotic A.

When I check the factor levels, antibiotic A appears to be in the correct order, but the two antibiotic B levels have swapped around:

# levels should be "A 1g" "A 2g" "B 1g" "B 2g"
rev(levels(df$antibiotic))
#> [1] "A 1g" "A 2g" "B 2g" "B 1g"

I can't get my head round why fct_reorder is doing this correctly for A but not for B.

I would be grateful if anyone could explain this.

Created on 2023-04-03 with reprex v2.0.2

Answer 1

Score: 0


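It's because you have multiple start dates for each level of the factor. `fct_reorder()` has to collapse them to a single value per level, and by default it uses the median (`.fun = median`) rather than the earliest date. That puts B 1g (starts 28/02/23 and 03/04/23, median 17/03/23) after B 2g (11/03/23), while for A the median of A 1g still happens to fall before A 2g. One option is to use `slice_min()` and construct the factor levels based on the first date shown: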

# df here is the original tribble from the question (dates still stored as strings)
antibiotic_levels <- df |>
  mutate(across(c("start", "end"), dmy)) |>
  group_by(antibiotic) |>
  slice_min(start) |>   # keep the earliest course of each antibiotic/dose
  arrange(start) |>     # order those first courses chronologically
  mutate(antibiotic = factor(antibiotic, levels = antibiotic)) |>
  pull(antibiotic)

df |>
  mutate(across(c("start", "end"), dmy),
         # reversed so the earliest antibiotic is drawn at the top of the y axis
         antibiotic = factor(antibiotic, levels = rev(antibiotic_levels))) |>
  pivot_longer(
    cols = c("start", "end"),
    names_to = "start_end",
    values_to = "date"
  ) |>
  ggplot(aes(x = date)) +
  geom_line(aes(y = antibiotic, colour = antibiotic, group = id), linewidth = 5)

which gives:

[Image: the resulting Gantt-style plot]
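To see why the original call put B 2g ahead of B 1g, it can help to compare the per-level summaries side by side. The following is a minimal sketch rather than a definitive fix. It assumes forcats's documented default of `.fun = median` for `fct_reorder()`, and the names `regimen`, `start_summary`, `levels_median` and `levels_min` are illustrative only, not from the original post.

```r
library(dplyr)
library(forcats)
library(lubridate)
library(tibble)

# Same data as the question, with the date strings parsed as day/month/year
regimen <- tribble(
  ~id, ~start,     ~end,       ~antibiotic,
  1,   "02/02/23", "22/02/23", "A 1g",
  2,   "22/02/23", "10/03/23", "A 2g",
  3,   "10/03/23", "15/03/23", "A 1g",
  4,   "28/02/23", "11/03/23", "B 1g",
  5,   "11/03/23", "03/04/23", "B 2g",
  6,   "03/04/23", "10/04/23", "B 1g"
) %>%
  mutate(across(c(start, end), dmy))

# One row per level: the value fct_reorder() sorts on by default (median)
# next to the value the chart needs (earliest start)
start_summary <- regimen %>%
  group_by(antibiotic) %>%
  summarise(median_start = median(start), first_start = min(start))
print(start_summary)

# Default summary function (median): B 2g sorts before B 1g
levels_median <- levels(fct_reorder(regimen$antibiotic, regimen$start))
levels_median
#> [1] "A 1g" "A 2g" "B 2g" "B 1g"

# Earliest start per level (.fun = min): the order the question expected
levels_min <- levels(fct_reorder(regimen$antibiotic, regimen$start, .fun = min))
levels_min
#> [1] "A 1g" "A 2g" "B 1g" "B 2g"
```

In the question's pipeline this would amount to writing `fct_reorder(antibiotic, start, .fun = min, .desc = TRUE)` inside the existing `mutate()`, which should give the same ordering as the `slice_min()` approach without building the levels by hand.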

huangapple
  • Posted on 2023-04-04 04:58:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75923709.html