Coloring by groups using ggscatter
Question
I'm trying to color my dots on this plot:
q <- ggscatter(
  ARAL_R,
  x = "Intl1", y = "Sul1",
  add = "reg.line", conf.int = TRUE,
  cor.coef = TRUE, cor.method = "spearman",
  xlab = "Intl1 (copies/g)", ylab = "Sul1 (copies/g)"
) +
  stat_cor(label.y = 1, label.x.npc = "center") +
  stat_regline_equation(label.y = 0.5, label.x.npc = "center")
But whenever I add the argument color = "Core" or fill = "Core" (where Core is a categorical variable with 5 categories), this happens:
q <- ggscatter(
  ARAL_R,
  x = "Intl1", y = "Sul1", color = "Core",
  add = "reg.line", conf.int = TRUE,
  cor.coef = TRUE, cor.method = "spearman",
  xlab = "Intl1 (copies/g)", ylab = "Sul1 (copies/g)"
) +
  stat_cor(label.y = 1, label.x.npc = "center") +
  stat_regline_equation(label.y = 0.5, label.x.npc = "center")

ggpar(q, xscale = "log10", yscale = "log10")
I would like to make it like this, but with the Spearman correlation line.
What am I doing wrong?
Answer 1
Score: 0
The issue is that when you map a variable to color, your data gets split into multiple groups, so you get a regression line, correlation coefficient and equation for each category of Core. To fix that you have to map the group aesthetic explicitly, i.e. use aes(group = 1) to treat all observations as one group, which for simplicity or by convention we name 1. Unfortunately ggscatter does not offer this option. That's one of the downsides of using "out-of-the-box" options instead of vanilla ggplot2. So I'm afraid the way to go is to add your regression line manually using geom_smooth instead of relying on ggscatter.

Using a minimal reproducible example based on iris:
library(ggpubr)
#> Loading required package: ggplot2

ARAL_R <- iris[3:5]
names(ARAL_R) <- c("Intl1", "Sul1", "Core")

q <- ggscatter(
  ARAL_R,
  x = "Intl1", y = "Sul1", color = "Core",
  xlab = "Intl1 (copies/g)", ylab = "Sul1 (copies/g)"
) +
  geom_smooth(aes(group = 1), method = "lm", color = "black") +
  stat_cor(aes(group = 1), label.y = 1, label.x.npc = "center") +
  stat_regline_equation(aes(group = 1), label.y = 0.5, label.x.npc = "center")

ggpar(q, xscale = "log10", yscale = "log10")
#> `geom_smooth()` using formula = 'y ~ x'
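
To see the grouping effect described above directly, here is a minimal sketch using plain ggplot2 on the same iris stand-in data. It assumes color is mapped in the global aes (an assumption made for illustration, not a claim about how ggscatter sets up its layers internally) and uses layer_data() to count how many groups the smooth layer is fitted on, with and without aes(group = 1). The object names are made up for this example.

```r
library(ggplot2)

# Same iris-based stand-in data as in the answer
ARAL_R <- iris[3:5]
names(ARAL_R) <- c("Intl1", "Sul1", "Core")

# Assumption for illustration: color is mapped globally, so every layer inherits it
base <- ggplot(ARAL_R, aes(Intl1, Sul1, color = Core)) +
  geom_point()

# The smooth layer inherits color = Core and is fitted once per Core level
per_group <- base +
  geom_smooth(method = "lm", formula = y ~ x)

# aes(group = 1) overrides the implicit grouping: one fit over all observations
pooled <- base +
  geom_smooth(aes(group = 1), method = "lm", formula = y ~ x, color = "black")

# Layer 2 is the smooth layer; count the distinct groups it was computed on
n_per_group <- length(unique(layer_data(per_group, 2)$group))
n_pooled <- length(unique(layer_data(pooled, 2)$group))
```

With the three levels of Core in this data, n_per_group comes out as 3 and n_pooled as 1, which is the difference between one line per category and a single overall line.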
And as a reference, here is the vanilla ggplot2 way, where instead of using the group aes I made color a local aes of geom_point only, and where I use ggpubr only to add the regression line equation and correlation coefficient:
library(ggplot2)
library(ggpubr)

ARAL_R <- iris[3:5]
names(ARAL_R) <- c("Intl1", "Sul1", "Core")

ggplot(
  ARAL_R,
  aes(Intl1, Sul1)
) +
  geom_smooth(method = "lm", color = "black") +
  geom_point(aes(color = Core)) +
  stat_cor(label.y = 1, label.x.npc = "center") +
  stat_regline_equation(label.y = 0.5, label.x.npc = "center") +
  labs(x = "Intl1 (copies/g)", y = "Sul1 (copies/g)") +
  scale_x_log10() +
  scale_y_log10()
#> `geom_smooth()` using formula = 'y ~ x'
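
One caveat on the Spearman part of the question: as far as I know stat_cor defaults to Pearson, so to get Spearman's rho in the label you would pass method = "spearman" to it. The line drawn by geom_smooth(method = "lm") stays an ordinary least-squares fit either way; Spearman only changes the coefficient shown. A sketch on the same iris stand-in data, with a base R cross-check (rho_raw and rho_log are names made up here):

```r
library(ggpubr)

# Same iris-based stand-in data as in the answer
ARAL_R <- iris[3:5]
names(ARAL_R) <- c("Intl1", "Sul1", "Core")

q <- ggscatter(
  ARAL_R,
  x = "Intl1", y = "Sul1", color = "Core",
  xlab = "Intl1 (copies/g)", ylab = "Sul1 (copies/g)"
) +
  # Single least-squares line over all observations
  geom_smooth(aes(group = 1), method = "lm", formula = y ~ x, color = "black") +
  # method = "spearman" switches the label from Pearson to Spearman
  stat_cor(aes(group = 1), method = "spearman", label.x.npc = "center")

q <- ggpar(q, xscale = "log10", yscale = "log10")

# Cross-check in base R: Spearman is rank-based, so log10 axes do not change it
rho_raw <- cor(ARAL_R$Intl1, ARAL_R$Sul1, method = "spearman")
rho_log <- cor(log10(ARAL_R$Intl1), log10(ARAL_R$Sul1), method = "spearman")
```

Because ranks survive a monotone transform, rho_raw and rho_log agree. The fitted lm line, on the other hand, does depend on the log10 scales, since ggplot2 applies scale transformations before computing stats.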