Illustrating relative timeline in R
Question
I have a data frame with two columns: los, which denotes the length of stay for a particular event, and daysbetweenvisits, the days between visits. I would like to create a visualization that shows a single line made of alternating segments whose widths represent los and daysbetweenvisits. I'm thinking of a thick band for los and a thin band for daysbetweenvisits, with each segment taking up a length proportional to its value, essentially showing a timeline.
For example, the order of the segments would be 2-21-24-12-14-52..., with every los segment drawn as a thick band and every daysbetweenvisits segment drawn as a line, preferably with the number labeled on top.
df <- data.frame(los = c(2, 24, 14, 0, 4, 9, 7, 3, 1),
daysbetweenvisits = c(21, 12, 52, 218, 73, 36, 0, 18, 0))
Answer 1
Score: 3
The first step is getting your data into the correct format. For this, you want to calculate the start and end of each time period, then label it according to whether it falls in a 'Stay' (length of stay) period or a 'Between' (between visits) period.
You can do this by transposing your data frame and concatenating it into a single numeric vector, which interleaves the two columns row by row. Its cumulative sum gives the time point at the end of each segment. Lagging this vector with a 0 in the first position gives the start times. You can put these in a data frame with a third column that just alternates between 'Stay' and 'Between'. The final line widens zero-length segments by 1, presumably so that they remain visible:
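# Reshape into one row per segment; note that this overwrites the original df
df <- data.frame(right = cumsum(c(t(df))),
                 left  = head(c(0, cumsum(c(t(df)))), -1),
                 event = rep(c('Stay', 'Between'), length.out = nrow(df) * 2))
# Give zero-length segments a width of 1
df$right <- ifelse(df$right == df$left, df$right + 1, df$right)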
So now the data is in the correct format for plotting:
df
#>    right left   event
#> 1      2    0    Stay
#> 2     23    2 Between
#> 3     47   23    Stay
#> 4     59   47 Between
#> 5     73   59    Stay
#> 6    125   73 Between
#> 7    126  125    Stay
#> 8    343  125 Between
#> 9    347  343    Stay
#> 10   420  347 Between
#> 11   429  420    Stay
#> 12   465  429 Between
#> 13   472  465    Stay
#> 14   473  472 Between
#> 15   475  472    Stay
#> 16   493  475 Between
#> 17   494  493    Stay
#> 18   495  494 Between
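The reshaping step above can be traced with a small sketch of the intermediate values. It assumes the original two-column data frame from the question, stored here under the hypothetical name `visits` so it is not confused with the reshaped `df`. The `segs` data frame and its `days` column are also illustrative names, not part of the answer; `days` keeps each segment's true duration, which `right - left` no longer gives once zero-length segments are widened.

```r
# Original data from the question, under a hypothetical name
visits <- data.frame(los = c(2, 24, 14, 0, 4, 9, 7, 3, 1),
                     daysbetweenvisits = c(21, 12, 52, 218, 73, 36, 0, 18, 0))

# t() turns each row into a column; c() then reads the matrix column by
# column, giving los1, between1, los2, between2, ...
durations <- c(t(visits))

ends   <- cumsum(durations)        # end point of every segment
starts <- c(0, head(ends, -1))     # each start is the previous end, from 0

segs <- data.frame(left  = starts,
                   right = ends,
                   days  = durations,   # true length, kept for labelling
                   event = rep(c('Stay', 'Between'), times = nrow(visits)))

head(segs, 4)
```

If the labels should show 0 for zero-length segments, one option would be to label with `days` rather than `right - left`.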
And we can draw the plot with ggplot2:
library(ggplot2)
ggplot(df, aes(left, '1', linewidth = event, group = event, color = event)) +
  geom_segment(aes(xend = right, yend = '1')) +
  geom_text(aes(left + (right - left)/2,
                label = ifelse(event == 'Between', right - left, '')),
            nudge_y = 0.1, color = 'black') +
  scale_linewidth_manual(values = c(1, 6), guide = 'none') +
  scale_color_manual(values = c('gray', 'red3'), guide = 'none') +
  theme_void()
Created on 2023-07-27 with reprex v2.0.2
Answer 2
Score: 1
Here's a base R approach:
# find LOS endpoints
df$from <- c(0, cumsum(head(rowSums(df), -1)))
df$to <- df$from + df$los
par(mar=rep(0, 4))
frame()
plot.window(xlim=c(0, tail(df$to, 1)), ylim=c(0, 0))
segments(df$from[1], 0, tail(df$to, 1), lend=3)
segments(df$from, 0, df$to, lwd=16, col='red', lend=3)
text((head(df$to, -1) + df$from[-1]) / 2, 2*strheight('a'),
head(df$daysbetweenvisits, -1))
I wasn't sure how to interpret the zero values so I just kept them zero.
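For reference, the positions passed to `segments()` and `text()` above can be inspected without drawing anything. This is a minimal sketch assuming the question's original data frame; `stays` and `label_x` are illustrative names, not part of the answer.

```r
# Original data from the question, under a hypothetical name
stays <- data.frame(los = c(2, 24, 14, 0, 4, 9, 7, 3, 1),
                    daysbetweenvisits = c(21, 12, 52, 218, 73, 36, 0, 18, 0))

# Each stay starts once all earlier stays and gaps have elapsed
stays$from <- c(0, cumsum(head(rowSums(stays), -1)))
stays$to   <- stays$from + stays$los

# Each label sits midway between the end of one stay and the start of the next
label_x <- (head(stays$to, -1) + stays$from[-1]) / 2

stays[1:3, c('from', 'to')]
label_x[1:3]
```

The first gap runs from 2 to 23, so its label lands at x = 12.5. A zero-length stay has `from` equal to `to`, which is why it draws no visible thick band.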