May 14, 2023, 16:29:58

GGplot: Line in stat_summary is not being displayed

Question

<details>
<summary>English:</summary>
I am trying to draw a graph with a line connecting the dots of each group across the variables.
However, it seems like ggplot refuses to render the line, and I am not sure why.
Here is the code:

    ggplot(DF_Compound, aes(color=MainThemes, y=value, x=Variable)) +
      stat_summary(
        fun.y="mean",
        geom="point",
        size=2,
      ) +
      stat_summary(
        fun.y="mean",
        geom="line",
        size=1,
      ) +
      ylim(0,9) +
      ylab("") +
      scale_y_continuous(breaks = c(1,2,3,4,5,6,7)) +
      theme_minimal() +
      scale_fill_discrete(guide = guide_legend()) +
      theme(legend.position="bottom")


This is the output:
[![enter image description here][1]][1]
This is what I would like (with a line for every Theme of course):
[![enter image description here][2]][2]
(Apologies for the sloppy shop on the images)
Both MainThemes and Variable are factors (if that makes any difference). If I remove the points, the coordinate system is just empty. Am I overlooking something? I've seen people add lines this way and it seems to work.
Edit:
While trying to provide data, I figured out that the problem seems to be that "Variable" is a factor rather than a number.
This dataset works:

structure(list(MainThemes = c("F", "F", "F", "F", "F", "F", "F",
"F", "F", "F", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C",
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "G", "G", "G",
"G", "G", "G", "G", "G", "G", "G", "E", "E", "E", "E", "E", "E",
"E", "E", "E", "E", "C", "C", "C", "C", "C", "C", "C", "C", "C",
"C", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",
"A", "A", "A", "A", "A", "A", "A", "A", "F", "F", "F", "F", "F",
"F", "F", "F", "F", "F", "C", "C", "C", "C", "C", "C", "C", "C",
"C", "C"), Variable = c(9, 5, 8, 2, 3, 7, 10, 1, 6, 4, 9, 5,
8, 2, 3, 7, 10, 1, 6, 4, 9, 5, 8, 2, 3, 7, 10, 1, 6, 4, 9, 5,
8, 2, 3, 7, 10, 1, 6, 4, 9, 5, 8, 2, 3, 7, 10, 1, 6, 4, 9, 5,
8, 2, 3, 7, 10, 1, 6, 4, 9, 5, 8, 2, 3, 7, 10, 1, 6, 4, 9, 5,
8, 2, 3, 7, 10, 1, 6, 4, 9, 5, 8, 2, 3, 7, 10, 1, 6, 4, 9, 5,
8, 2, 3, 7, 10, 1, 6, 4), value = c(3.33333333333333, 5.33333333333333,
5, 3.33333333333333, 5.33333333333333, 5.33333333333333, 5.33333333333333,
5.66666666666667, 4.66666666666667, 5.66666666666667, 1, 3.66666666666667,
3.33333333333333, 4, 5.33333333333333, 4.66666666666667, 1.33333333333333,
7, 6.66666666666667, 2, 4.33333333333333, 6.33333333333333, 1.66666666666667,
2.66666666666667, 2, 3.66666666666667, 2, 6.33333333333333, 3.33333333333333,
6, 2, 2, 2.33333333333333, 2, 4.66666666666667, 2.33333333333333,
5.33333333333333, 3, 6, 6, 2, 2, 2.33333333333333, 2, 4.66666666666667,
2.33333333333333, 5.33333333333333, 3, 6, 6, 2, 4, 3, 1.33333333333333,
4, 4.33333333333333, 4, 4.66666666666667, 3, 4, 4, 4, 1, 4, 3,
5, 5.33333333333333, 6, 4.33333333333333, 5.66666666666667, 5.33333333333333,
6.66666666666667, 4, 5.33333333333333, 4.66666666666667, 6.33333333333333,
4.66666666666667, 6.33333333333333, 6, 5.66666666666667, 7, 5,
4.33333333333333, 1.33333333333333, 3.33333333333333, 7, 3.33333333333333,
7, 5, 5.33333333333333, 4.66666666666667, 4.66666666666667, 4,
2, 4.66666666666667, 1.66666666666667, 4.33333333333333, 6.33333333333333,
6.33333333333333, 5.33333333333333)), row.names = c(NA, -100L
), class = c("tbl_df", "tbl", "data.frame"))


This one doesn't:

Variables are vectors:
structure(list(MainThemes = c("F", "F", "F", "F", "F", "F", "F",
"F", "F", "F", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C",
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "G", "G", "G",
"G", "G", "G", "G", "G", "G", "G", "E", "E", "E", "E", "E", "E",
"E", "E", "E", "E", "C", "C", "C", "C", "C", "C", "C", "C", "C",
"C", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",
"A", "A", "A", "A", "A", "A", "A", "A", "F", "F", "F", "F", "F",
"F", "F", "F", "F", "F", "C", "C", "C", "C", "C", "C", "C", "C",
"C", "C"), Variable = c("VI", "VE", "VH", "VB", "VC", "VG", "VJ",
"VA", "VF", "VD", "VI", "VE", "VH", "VB", "VC", "VG", "VJ", "VA",
"VF", "VD", "VI", "VE", "VH", "VB", "VC", "VG", "VJ", "VA", "VF",
"VD", "VI", "VE", "VH", "VB", "VC", "VG", "VJ", "VA", "VF", "VD",
"VI", "VE", "VH", "VB", "VC", "VG", "VJ", "VA", "VF", "VD", "VI",
"VE", "VH", "VB", "VC", "VG", "VJ", "VA", "VF", "VD", "VI", "VE",
"VH", "VB", "VC", "VG", "VJ", "VA", "VF", "VD", "VI", "VE", "VH",
"VB", "VC", "VG", "VJ", "VA", "VF", "VD", "VI", "VE", "VH", "VB",
"VC", "VG", "VJ", "VA", "VF", "VD", "VI", "VE", "VH", "VB", "VC",
"VG", "VJ", "VA", "VF", "VD"), value = c(3.33333333333333, 5.33333333333333,
5, 3.33333333333333, 5.33333333333333, 5.33333333333333, 5.33333333333333,
5.66666666666667, 4.66666666666667, 5.66666666666667, 1, 3.66666666666667,
3.33333333333333, 4, 5.33333333333333, 4.66666666666667, 1.33333333333333,
7, 6.66666666666667, 2, 4.33333333333333, 6.33333333333333, 1.66666666666667,
2.66666666666667, 2, 3.66666666666667, 2, 6.33333333333333, 3.33333333333333,
6, 2, 2, 2.33333333333333, 2, 4.66666666666667, 2.33333333333333,
5.33333333333333, 3, 6, 6, 2, 2, 2.33333333333333, 2, 4.66666666666667,
2.33333333333333, 5.33333333333333, 3, 6, 6, 2, 4, 3, 1.33333333333333,
4, 4.33333333333333, 4, 4.66666666666667, 3, 4, 4, 4, 1, 4, 3,
5, 5.33333333333333, 6, 4.33333333333333, 5.66666666666667, 5.33333333333333,
6.66666666666667, 4, 5.33333333333333, 4.66666666666667, 6.33333333333333,
4.66666666666667, 6.33333333333333, 6, 5.66666666666667, 7, 5,
4.33333333333333, 1.33333333333333, 3.33333333333333, 7, 3.33333333333333,
7, 5, 5.33333333333333, 4.66666666666667, 4.66666666666667, 4,
2, 4.66666666666667, 1.66666666666667, 4.33333333333333, 6.33333333333333,
6.33333333333333, 5.33333333333333)), row.names = c(NA, -100L
), class = c("tbl_df", "tbl", "data.frame"))


Turning my variables into numbers is a workaround, but I would really like to avoid it if possible.
Thanks!
  [1]: https://i.stack.imgur.com/bvL5z.png
  [2]: https://i.stack.imgur.com/z2D1L.png
</details>
# Answer 1
**Score**: 1
You need to add `MainThemes` as a grouping variable if you have a discrete x axis:
```r
library(ggplot2)

DF_Compound |>
  ggplot(aes(color = MainThemes, y = value, x = Variable, group = MainThemes)) +
  # fun replaces the deprecated fun.y argument (ggplot2 3.3.0 and later)
  stat_summary(fun = "mean", geom = "point", size = 2) +
  stat_summary(fun = "mean", geom = "line", linewidth = 1) +
  # NULL as the first argument drops the y axis title
  scale_y_continuous(NULL, breaks = 0:9, limits = c(0, 9)) +
  theme_minimal() +
  theme(legend.position = "bottom")
```

A few other points to note are:

- Since you are using `scale_y_continuous` already, then instead of using `ylab("")`, you can pass `NULL` as the first argument to `scale_y_continuous` to remove the axis label.
- `scale_fill_discrete` doesn't do anything here, since you aren't using the fill scale. However, changing it to `scale_color_discrete(guide = guide_legend())` won't do anything either, because this is the default setting.
- The `size` argument in `geom_line` is deprecated; if your version of ggplot2 is up to date (3.4.0 or later), you should use `linewidth` instead.
- I'm not sure that a line plot is appropriate here since you have a discrete x axis (though this is a bit domain-specific). An alternative, if you want to display the relationship between two discrete variables and one continuous variable, would be a heatmap. This gives a cleaner plot, though at the expense of making it a little harder to discern the exact value:

```r
library(tidyverse)

DF_Compound %>%
  group_by(MainThemes, Variable) %>%
  # one mean per theme/variable cell; drop the grouping afterwards
  summarise(value = mean(value), .groups = "drop") %>%
  ggplot(aes(y = MainThemes, x = Variable, fill = value)) +
  geom_tile() +
  theme_minimal(base_size = 20) +
  scale_fill_viridis_c() +
  coord_equal() +
  theme(legend.position = "bottom")
```

If you have the space to do it, you could even get a better idea of the range of your data by using faceted boxplots or violin plots:

```r
library(tidyverse)

DF_Compound %>%
  ggplot(aes(y = value, x = Variable, fill = MainThemes)) +
  geom_point(size = 1, shape = 21) +
  geom_violin(linewidth = 0.3) +
  # one panel row per theme
  facet_grid(MainThemes ~ .) +
  scale_fill_brewer(palette = "Pastel1") +
  theme_bw(base_size = 16) +
  theme(legend.position = "none")
```
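
As a rough illustration of why the explicit `group` matters: by default ggplot2 sets the group to the interaction of all discrete variables in the layer, so with a discrete x axis every theme/variable combination becomes its own one-point group and there is nothing for a line to connect. The sketch below uses a small made-up data frame (`toy`, a hypothetical stand-in for `DF_Compound`) and counts the groups that `stat_summary` ends up with in each case.

```r
library(ggplot2)

# Hypothetical toy data standing in for DF_Compound:
# 2 themes x 3 discrete variables x 2 observations each
toy <- data.frame(
  MainThemes = rep(c("A", "B"), each = 6),
  Variable   = rep(rep(c("VA", "VB", "VC"), each = 2), times = 2),
  value      = c(1, 3, 2, 4, 3, 5, 5, 7, 4, 6, 3, 5)
)

# No explicit group: ggplot2 groups by MainThemes x Variable
p_default <- ggplot(toy, aes(x = Variable, y = value, color = MainThemes)) +
  stat_summary(fun = "mean", geom = "line")

# Explicit group: one group (and therefore one line) per theme
p_grouped <- ggplot(toy, aes(x = Variable, y = value, color = MainThemes,
                             group = MainThemes)) +
  stat_summary(fun = "mean", geom = "line")

# layer_data() returns the data computed for the layer, including its group column
n_default <- length(unique(layer_data(p_default)$group))
n_grouped <- length(unique(layer_data(p_grouped)$group))

n_default  # 6: each group holds a single summarised point
n_grouped  # 2: each group holds three points that can be joined
```

If a single line across all themes were wanted instead, `group = 1` would be the usual way to put every observation into one group.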
