Recreating a Lexis grid using ggplot
Question
I am following up on an answer I got to this question on creating a Lexis grid. That answer let me overlay my data on a Lexis grid, but because my data is so dense, the grid was completely obscured by the fill. I was given a somewhat hacky way to bring the grid to the front:
library(ggplot2)
p <- mylexis +
  geom_tile(data = df, mapping = aes(x = as.Date(paste0(year, "-01-01")), y = age, fill = event))
# Reorder the layers so the tile layer (added third) is drawn first, beneath the grid
p$layers <- p$layers[c(3, 1, 2)]
p
This worked initially, but it fell apart as a solution once I added more detail to the plot, which created more layers.
So I am now trying to completely circumvent the lexis_grid command and the LexisPlotR package. Instead, I just want to add a sequence of vertical, horizontal, and diagonal lines.
What I want is along the lines of the following image, from this article:
This is what I am trying:
library('dplyr')
library('ggplot2')
library('viridis')
df <- data.frame(
  year <- sample(c(1900:2021), 1000, TRUE),
  age <- sample(c(0:80), 1000, TRUE),
  event <- sample(c(0:5), 1000, TRUE)
)
colnames(df) <- c("year", "age", "event")
ggplot(df, aes(x=year, y=age, fill=event)) +
  geom_tile() +
  scale_fill_viridis() +
  geom_hline(yintercept=seq(0, 80, by=10)) +
  geom_vline(xintercept=seq(1900,2030, by=10)) +
  geom_abline(intercept=seq(0, 80, by=10), slope=1) +
  labs(fill = "Count",
       title = "Events")
Which gets me the following:
The problem is, I don't know why there are dots below the 0 line and to the left of the 1900 line. And I have no idea what I'm doing wrong with the geom_abline, but I can't get the diagonals to work for anything.
Answer 1
Score: 1
The reason the tiles fill to the left of 1900 and below 0 is that the tiles are centered on the x/y coordinates, and they spill out in all directions based on width/height. I don't know of any way of showing dots with finite size that don't visually appear to spill outside the domain of values.
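To make the spill concrete, here is a small sketch of the arithmetic. It assumes the tiles are one unit wide and one unit tall, and tile_range is a hypothetical helper, not part of ggplot2. The plotting part shows only one possible workaround, under the further assumption that each row stands for the whole cell that starts at its year and age, so the tile centres can be shifted by half a unit.

```r
# Hypothetical helper: the span covered by a tile centred on `center`.
tile_range <- function(center, size = 1) {
  center + c(-0.5, 0.5) * size
}

tile_range(1900)  # 1899.5 1900.5: half of the tile lies left of 1900
tile_range(0)     #   -0.5    0.5: half of the tile lies below 0

# Possible workaround (an assumption about what the data means): treat each
# row as the cell [year, year + 1) x [age, age + 1) and centre the tile on it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  set.seed(42)
  df <- data.frame(
    year  = sample(1900:2021, 1000, TRUE),
    age   = sample(0:80, 1000, TRUE),
    event = sample(0:5, 1000, TRUE)
  )
  p_shifted <- ggplot(df, aes(x = year + 0.5, y = age + 0.5, fill = event)) +
    geom_tile(width = 1, height = 1)
  # print(p_shifted) draws it; no tile extends left of 1900 or below 0
}
```

Whether the half-unit shift is appropriate depends on how year and age were recorded, so treat it as an option rather than a fix.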
The reason your ablines do not show is that your intercept is assuming a 0,0 origin on the plot, but your x axis starts at 1900. The "real" (y-)intercept is far below 0; 1900 below, to be precise. If we accommodate that (and widen the range a bit), we can see the diagonal lines.
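The arithmetic behind this can be checked in a few lines. This is only a sketch; diag_intercept is a hypothetical helper that returns the y-intercept of a line with a given slope through a chosen point.

```r
# Hypothetical helper: y-intercept b of the line y = slope * x + b
# that passes through the point (x0, y0).
diag_intercept <- function(x0, y0, slope = 1) {
  y0 - slope * x0
}

# A 45-degree line through the lower-left corner of the plot (year 1900, age 0)
diag_intercept(1900, 0)    # -1900

# Where the original intercepts (0, 10, ..., 80) put the lines at x = 1900:
original_y_at_1900 <- 1 * 1900 + seq(0, 80, by = 10)
range(original_y_at_1900)  # 1900 1980, far above the age axis (0 to 80)
```

So the original lines were drawn, just nowhere near the visible age range.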
Reproducible data (using set.seed) and correcting for the unadvised use of <- inside of data.frame:
library('dplyr')
library('ggplot2')
library('viridis')
set.seed(42)  # make the random sample reproducible
# Use = (not <-) so the columns are named year, age, and event
df <- data.frame(
  year = sample(c(1900:2021), 1000, TRUE),
  age = sample(c(0:80), 1000, TRUE),
  event = sample(c(0:5), 1000, TRUE)
)
head(df)
#   year age event
# 1 1948  66     1
# 2 2000  13     4
# 3 1964  72     0
# 4 1924  55     1
# 5 1973  43     2
# 6 1999  54     1
ggplot(df, aes(x=year, y=age, fill=event)) +
  geom_tile() +
  scale_fill_viridis() +
  geom_hline(yintercept=seq(0, 80, by = 10)) +
  geom_vline(xintercept=seq(1900, 2030, by = 10)) +
  # Shift the intercepts by -2020 so the slope-1 lines cross the plotted years
  geom_abline(intercept=seq(0, 200, by = 10) - 2020, slope = 1) +
  labs(fill = "Count", title = "Events")
The use of seq(0, 200, by = 10) is because we have 80/10=8 lines to draw originating from the left border, and (2020-1900)/10=12 lines to draw originating from the bottom border. You can change to seq(-10, 200, by = 10) - 2020 to fill in that last diagonal. It's okay to over-draw some ablines; they will be optimized out of the plot. (For instance, seq(-50, 300, by = 10) - 2020 works without otherwise affecting the x/y limits.)
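As a rough check of that reasoning, the sketch below works out which intercepts give a slope-1 line that touches the rectangle spanned by the grid lines. The limits (1900 to 2030 and 0 to 80) are taken from the geom_vline and geom_hline calls; crosses_panel is a hypothetical helper, and the real panel may be slightly larger because of ggplot2's default axis expansion.

```r
# Hypothetical helper: does the line y = x + b touch the rectangle
# [x_min, x_max] x [y_min, y_max]?  For slope 1 that holds exactly when
# y_min - x_max <= b <= y_max - x_min.
crosses_panel <- function(b, x_min, x_max, y_min, y_max) {
  b >= y_min - x_max & b <= y_max - x_min
}

used  <- seq(0, 200, by = 10) - 2020     # intercepts from the answer
wider <- seq(-50, 300, by = 10) - 2020   # deliberately over-drawn

all(crosses_panel(used, 1900, 2030, 0, 80))            # TRUE
sum(crosses_panel(wider, 1900, 2030, 0, 80))           # 22
range(wider[crosses_panel(wider, 1900, 2030, 0, 80)])  # -2030 -1820
```

Of the 36 intercepts in wider, 14 never reach the rectangle, which is consistent with over-drawing being harmless.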