Density plot starts earlier than the available data
Question
I have generated a density plot that shows how prescriptions of two drugs are distributed over time. The dataset has three columns: patient ID, date of prescription, and drug name. The earliest prescription for drug A in the database is dated 2003-11-04, while the earliest for drug B is dated 2012-01-31. However, the curve for drug B appears to start before January 2012, which is odd.
I created the plot with the code below, using the geom_density function. I'm not sure why this is happening. Does anyone have any ideas?
library(dplyr)    # provides %>%
library(ggplot2)

drugs %>%
  ggplot(aes(x = date, fill = drug_name, colour = drug_name)) +
  geom_density(alpha = 0.5) +
  theme_minimal() +
  scale_x_date(date_breaks = "1 year",
               date_labels = "%Y") +
  labs(x = "Years", y = "Density")
Answer 1
Score: 0
By default, the densities are calculated on the full range of the data for both groups. Pass trim = TRUE to geom_density() to calculate each density on the range of its own group.
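As a minimal sketch of that change, the snippet below builds a made-up stand-in for the drugs data frame (the real data is not shown in the question, so the dates and row counts here are assumptions) and adds trim = TRUE to the original plotting call:

```r
library(ggplot2)

# Hypothetical stand-in for the asker's `drugs` data frame; the dates are
# randomly generated and only mimic the two start dates from the question.
set.seed(1)
drugs <- data.frame(
  date = c(
    as.Date("2003-11-04") + sample(0:6000, 300, replace = TRUE),  # drug A
    as.Date("2012-01-31") + sample(0:3000, 300, replace = TRUE)   # drug B
  ),
  drug_name = rep(c("A", "B"), each = 300)
)

# Same plot as in the question, with trim = TRUE so that each curve is
# only evaluated between the first and last date of its own group.
p <- ggplot(drugs, aes(x = date, fill = drug_name, colour = drug_name)) +
  geom_density(alpha = 0.5, trim = TRUE) +
  theme_minimal() +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y") +
  labs(x = "Years", y = "Density")

print(p)
```

With this setting the curve for drug B should begin at its earliest prescription date instead of extending to the left of it.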