Create groups based on contiguous rows for displaying in a ggplot line chart
Question
I have an extended time series, and wish to highlight some trends of interest within the chart. Something along these lines:
The problem is, the highlights could be high (red) or low (green), so I need to map a colour aesthetic to the chart as well. When I do so, ggplot assumes I want all the highlighted points to be a contiguous group (which makes sense!) and puts a line across the gap as well:
The obvious solution is to map the 'group' aesthetic to keep each mini-trend on its own group, but I don't know ahead of time where each trend will start and end, so I am struggling to find a way to do that on the fly.
Is there either a way to identify these groups on the fly (maybe using a lag() function or something?) and create a column to map for groups, or else a way to tell ggplot "Hey, if I have an NA between these two points, break the line and we'll talk again later"?
Here's the code I used to produce the second chart above (the first is the same code but with colour set manually rather than mapped):
library(dplyr)
library(broom)
library(ggplot2)

df <- sunspot.year %>%
  tidy() %>%
  rename(year = index, sunspots = value)

hl_df <- df %>%
  filter(year >= 1750 & year <= 1800 |
           year >= 1875 & year <= 1900) %>%
  mutate(highlight = sunspots) %>%
  mutate(highlight_colour = 'high')

df %>%
  left_join(hl_df) %>%
  mutate(year = paste0(year, '-01-01')) %>%
  mutate(year = as.Date(year)) %>%
  ggplot(aes(x = year, y = sunspots)) +
  geom_line() +
  geom_line(aes(y = highlight, col = highlight_colour)) +
  scale_colour_manual(values = c('high' = 'darkred', 'low' = 'darkolivegreen')) +
  theme_bw() +
  theme(legend.position = 'none')
Answer 1
Score: 1
Add group = 1 to the top-level aesthetic (or just in geom_line(), which I think should work too):
df %>%
  left_join(hl_df) %>%
  mutate(year = paste0(year, '-01-01')) %>%
  mutate(year = as.Date(year)) %>%
  ggplot(aes(x = year, y = sunspots, group = 1)) +
  geom_line() +
  geom_line(aes(y = highlight, col = highlight_colour)) +
  scale_colour_manual(values = c('high' = 'darkred', 'low' = 'darkolivegreen')) +
  theme_bw() +
  theme(legend.position = 'none')
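The idea here is that group is a separate (easy to overlook) aesthetic that you can set. When you map a discrete variable to color, group is inferred from that mapping, so all the 'high' rows end up in one group and get joined by a single line, gap included. If you want a different "grouping" for the line being drawn, you have to reset the group aesthetic yourself. Setting group = 1 is the way to tell ggplot() to treat everything as a single group. With everything in one group, the NA values in highlight sit in the middle of the line, and geom_line() breaks the line at an NA instead of bridging the gap.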
Separate from the group = 1 issue, you're plotting things multiple times with the duplicated geom_line() call. Once group = 1 is in your bag of tricks, that over-plotting isn't necessary, and dropping it removes some of the anti-aliasing artifacts introduced when you plot things on top of themselves. Note that in this version the unhighlighted stretches have an NA colour, so they are drawn in the scale's na.value (grey50 by default) rather than black:
df %>%
  left_join(hl_df) %>%
  mutate(year = paste0(year, '-01-01')) %>%
  mutate(year = as.Date(year)) %>%
  ggplot(aes(x = year, y = sunspots, group = 1, color = highlight_colour)) +
  geom_line() +
  scale_colour_manual(values = c('high' = 'red', 'low' = 'green')) +
  theme_bw() +
  theme(legend.position = 'none')
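If you would rather go the other route raised in the question and build an explicit grouping column on the fly, here is a minimal sketch using lag(). It is not part of the approach above, just one way it could be done. The names in_trend, run_id, runs_df and p are made up for illustration, and the data frame is rebuilt from sunspot.year directly (without broom) so the snippet stands on its own:

```r
library(dplyr)
library(ggplot2)

# sunspot.year ships with R; rebuild the question's df without broom
df <- data.frame(
  year     = as.numeric(time(sunspot.year)),
  sunspots = as.numeric(sunspot.year)
)

runs_df <- df %>%
  mutate(
    # TRUE for rows inside a period of interest (ranges taken from the question)
    in_trend = (year >= 1750 & year <= 1800) | (year >= 1875 & year <= 1900),
    # A new run starts wherever the flag differs from the previous row;
    # the running count of those changes gives one id per contiguous run
    # (in_trend, run_id are hypothetical column names)
    run_id = cumsum(in_trend != lag(in_trend, default = first(in_trend))) + 1,
    highlight = ifelse(in_trend, sunspots, NA_real_),
    highlight_colour = ifelse(in_trend, "high", NA_character_)
  )

p <- runs_df %>%
  mutate(year = as.Date(paste0(year, "-01-01"))) %>%
  ggplot(aes(x = year, y = sunspots)) +
  geom_line() +
  # group = run_id keeps each highlighted stretch as its own line
  geom_line(aes(y = highlight, colour = highlight_colour, group = run_id)) +
  scale_colour_manual(values = c("high" = "darkred", "low" = "darkolivegreen")) +
  theme_bw() +
  theme(legend.position = "none")
```

Call print(p) to draw it. The runs that are not highlighted contain only NA in highlight, so ggplot drops those rows (with a warning) and only the highlighted stretches are drawn in colour. The same run_id column would also work if some stretches were flagged 'low', as long as in_trend marks every row you want highlighted.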