英文:
A ggplot solution for plotting a Tukey test result from the agricolae package in correct (numerical) order on the x axis?
问题
I am plotting some plant and soil data along a transect covering the sampling years 2000-2020, testing simple one-way ANOVA with the excellent agricolae package, and doing simple explorative base plot graphs.
However, I find it difficult to manipulate with the output from the agricolae package and the base plot function, and I would now like to make a nice (preferably ggplot) graph with data in proper order in presentation quality.
My data (named "biochemA") looks like this:
Year Transect Depth NH4_(ug/L)
2000 1 1 391,777
2009 1 1 471,616
2011 1 1 362,964
2013 1 1 348,208
2015 1 1 234,341
2017 1 1 768,844
2019 1 1 599,063
2020 1 1 912,435
2000 2 1 452,272
2009 2 1 391,134
2011 2 1 285,286
2013 2 1 755,01
2015 2 1 376,022
2017 2 1 1205,095
2019 2 1 2940,163
2020 2 1 298,487
2000 3 1 1409,322
2009 3 1 847,658
2011 3 1 332,635
2013 3 1 487,695
2015 3 1 337,721
2017 3 1 1702,21
2019 3 1 2684,409
2020 3 1 448,644
- and I have done a really simple one-way ANOVA + Tukey test, using the agricolae package:
modelNH4<-aov(biochemA$NH4_(ug/L) ~ Year, data = biochemA)
out_NH4 <- HSD.test(modelNH4,"Year", group=TRUE,console=TRUE)
plot(out_NH4,main="NH4", xlab="Year", ylab="μg/g soil", las=2)
This gives me a graph like this:
which is fine for exploratory purposes. However, I would like to plot it with the years on the x-axis in chronological order (2020-2000), whereas the base plot function automatically displays the graphs by decreasing means. Also I would like to show standard error and not standard deviations.
The output from the HSD test looks like this:
out_bioANH4
$statistics
MSerror Df Mean CV MSD
173.8115 38 11.83804 111.3678 26.72761
$parameters
test name.t ntr StudentizedRange alpha
Tukey Year 8 4.53321 0.05
$means
biochemA1$NH4_.ug.g_soil._ std r Min Max
2000 7.662795 6.182201 5 3.000779 18.067119
2009 6.314628 3.589682 5 2.905835 12.042757
2011 4.866242 1.180330 5 3.698008 6.192181
2013 6.401148 3.330437 5 3.628023 10.752502
2015 4.286512 1.424005 5 2.777662 6.101243
2017 14.385232 8.981692 5 6.405698 29.624701
2019 42.317093 18.145402 5 12.244628 57.310143
2020 8.470637 4.223832 5 4.573570 15.409000
Q25 Q50 Q75
2000 3.547272 5.327341 8.371463
2009 3.728181 6.008561 6.887809
2011 4.097738 4.257121 6.086162
2013 3.684275 4.766714 9.174225
2015 3.089635 4.135425 5.328595
2017 9.569202 12.620927 13.705629
2019 38.758582 50.761796 52.510317
2020 5.761486 7.738720 8.870408
$comparison
NULL
$groups
biochemA1$NH4_.ug.g_soil._ groups
2019 42.317093 a
2017 14.385232 b
2020 8.470637 b
2000 7.662795 b
2013 6.401148 b
2009 6.314628 b
2011 4.866242 b
2015 4.286512 b
attr(,"class")
1 "group"
英文:
I am plotting some plant and soil data along a transect covering the sampling years 2000-2020, testing simple one-way ANOVA with the excellent agricolae package, and doing simple explorative base plot graphs.
However, I find it difficult to manipulate with the output from the agricolae package and the base plot function, and I would now like to make a nice (preferably ggplot) graph with data in proper order in presentation quality.
My data (named "biochemA") looks like this:
Year Transect Depth NH4_(ug/L)
2000 1 1 391,777
2009 1 1 471,616
2011 1 1 362,964
2013 1 1 348,208
2015 1 1 234,341
2017 1 1 768,844
2019 1 1 599,063
2020 1 1 912,435
2000 2 1 452,272
2009 2 1 391,134
2011 2 1 285,286
2013 2 1 755,01
2015 2 1 376,022
2017 2 1 1205,095
2019 2 1 2940,163
2020 2 1 298,487
2000 3 1 1409,322
2009 3 1 847,658
2011 3 1 332,635
2013 3 1 487,695
2015 3 1 337,721
2017 3 1 1702,21
2019 3 1 2684,409
2020 3 1 448,644
- and I have done a really simple one-way ANOVA + Tukey test, using the agricolae package:
> modelNH4<-aov(biochemA$NH4_(ug/L) ~ Year, data = biochemA)
>
> out_NH4 <- HSD.test(modelNH4,"Year", group=TRUE,console=TRUE)
>
> plot(out_NH4,main="NH4", xlab="Year", ylab="μg/g soil", las=2)
This gives me a graph like this:
which is fine for exploratory purposes. However, I would like to plot it with the years on the x-axis in chronological order (2020-2000), whereas the base plot function automatically displays the graphs by decreasing means. Also I would like to show standard error and not standard deviations.
The output from the HSD tst looks like this:
> out_bioANH4
$statistics
MSerror Df Mean CV MSD
173.8115 38 11.83804 111.3678 26.72761
$parameters
test name.t ntr StudentizedRange alpha
Tukey Year 8 4.53321 0.05
$means
biochemA1$NH4_.ug.g_soil._ std r Min Max
2000 7.662795 6.182201 5 3.000779 18.067119
2009 6.314628 3.589682 5 2.905835 12.042757
2011 4.866242 1.180330 5 3.698008 6.192181
2013 6.401148 3.330437 5 3.628023 10.752502
2015 4.286512 1.424005 5 2.777662 6.101243
2017 14.385232 8.981692 5 6.405698 29.624701
2019 42.317093 18.145402 5 12.244628 57.310143
2020 8.470637 4.223832 5 4.573570 15.409000
Q25 Q50 Q75
2000 3.547272 5.327341 8.371463
2009 3.728181 6.008561 6.887809
2011 4.097738 4.257121 6.086162
2013 3.684275 4.766714 9.174225
2015 3.089635 4.135425 5.328595
2017 9.569202 12.620927 13.705629
2019 38.758582 50.761796 52.510317
2020 5.761486 7.738720 8.870408
$comparison
NULL
$groups
biochemA1$NH4_.ug.g_soil._ groups
2019 42.317093 a
2017 14.385232 b
2020 8.470637 b
2000 7.662795 b
2013 6.401148 b
2009 6.314628 b
2011 4.866242 b
2015 4.286512 b
attr(,"class")
[1] "group"
答案1
得分: 1
你可能需要做的是从out_bioANH4
中提取元素并在自定义绘图中使用,比如ggplot2
。
如果在提示符下键入out_bioANH4
,你会看到out_bioANH4$means
包含数据的描述统计信息矩阵。如果你使用ggplot2
,你可以使用geom_errorbar
来生成类似于agricolae
包的图形。
看看以下代码会得到什么结果:
library(ggplot2)
# 从摘要统计信息计算SE并将其添加到摘要矩阵中
out_bioANH4$means$se <- out_bioANH4$means$std/sqrt(out_bioANH4$means$r)
ggplot(data=out_bioANH4$means) +
# 计算ymin和ymax,即均值+/- SE的位置
geom_errorbar(aes(ymin=out_bioANH4$means$biochemA1$NH4_.ug.g_soil._ - out_bioANH4$means$se,
ymax=out_bioANH4$means$biochemA1$NH4_.ug.g_soil._ + out_bioANH4$means$se,
x=row.names(out_bioANH4$means), color=out_bioANH4$groups$groups), width=0) +
# 在图上放置均值点,并为每个点使用不同的颜色
geom_point(aes(y=out_bioANH4$means$biochemA1$NH4_.ug.g_soil._,
x=row.names(out_bioANH4$means),
color=out_bioANH4$groups$groups[order(row.names(out_NH4$groups))])) +
xlab("Year") +
ylab("NH4_(ug/L)") +
scale_colour_discrete(name="Group")
我无法使用你的原始数据在你的"biochemA"示例中复制你的结果。如果你的out_bioANH4
示例是正确的,这些ggplot2
的内容应该可以帮助你入门。
由于geom_errorbar
中的x
是按数字顺序排列的,所以x轴应该按递增顺序排列。如果你想反转它,你可以很容易地使用ggplot
来做到这一点。
英文:
What you'll probably have to do is take the elements from out_bioANH4
and use them in your custom plot, such as ggplot2
.
If you type out_bioANH4
at the prompt, you'll see that out_bioANH4$means
contains a matrix of the descriptive statistics of your data. If you use ggplot2
, you can get similar plots as the agricolae
package using geom_errorbar
.
See what this gives you:
library(ggplot2)
#this calculates SE from the summary statistics and adds it to the summary matrix
out_bioANH4$means$se <- out_bioANH4$means$std/sqrt(out_bioANH4$means$r)
ggplot(data=out_bioANH4$means) +
#this calculates ymin and ymax, which are the location of mean +/- SE
geom_errorbar(aes(ymin=out_bioANH4$means$biochemA1$NH4_.ug.g_soil._ - out_bioANH4$means$se,
ymax=out_bioANH4$means$biochemA1$NH4_.ug.g_soil._ + out_bioANH4$means$se,
x=row.names(out_bioANH4$means), color=out_bioANH4$groups$groups), width=0) +
#this puts points at the means of the graph and does a different color for each
geom_point(aes(y=out_bioANH4$means$biochemA1$NH4_.ug.g_soil._,
x=row.names(out_bioANH4$means),
color=out_bioANH4$groups$groups[order(row.names(out_NH4$groups))])) +
xlab("Year") +
ylab("NH4_(ug/L)") +
scale_colour_discrete(name="Group")
I couldn't reproduce your results with HSD using your raw data in your "biochemA" example. If your out_bioANH4
example is correct, this ggplot2
stuff should get you started.
Because the x
in the geom_errorbar
is in numerical order, the x axis should be ordered in increasing order. If you want to reverse it, you can do that easily with ggplot
.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论