ggforce包中geom_mark_ellipse函数生成的区域是什么?

huangapple go评论54阅读模式
英文:

What is the region produced by ggforce package (geom_mark_ellipse)

问题

这是我的可重现数据和代码:

dd <- structure(list(chr1_1005501 = c(0.597222222222222, 0.75, 0.775, 
                                    0.732456140350877, 0.860696517412935, 0.777777777777778, 0.654545454545455, 
                                    0.811224489795918, 0.742424242424242, 0.806146572104019, 0.828402366863905, 
                                    0.718181818181818, 0.85663082437276, 0.853868194842407, 0.849624060150376, 
                                    0.759615384615385, 0.798611111111111, 1), chr1_1018001 = c(0.615384615384615, 
                                                                                               1, 0.95959595959596, 0.976415094339623, 0.979166666666667, 0.920634920634921, 
                                                                                               0.981481481481482, 1, 0.94578313253012, 0.967213114754098, 0.986486486486487, 
                                                                                               0.952380952380952, 0.965116279069767, 0.986238532110092, 0.975206611570248, 
                                                                                               0.941860465116279, 0.973333333333333, 0.866666666666667), chr1_1021001 = c(0.286713286713287, 
                                                                                                                                                                          0.5625, 0.606741573033708, 0.509433962264151, 0.233766233766234, 
                                                                                                                                                                          0.402684563758389, 0.696, 0.454545454545455, 0.495934959349593, 
                                                                                                                                                                          0.381642512077295, 0.444444444444444, 0.580645161290323, 0.443708609271523, 
                                                                                                                                                                          0.482269503546099, 0.595890410958904, 0.422535211267606, 0.397435897435897, 
                                                                                                                                                                          0.7), chr1_102234501 = c(0.782312925170068, 0.857142857142857, 
                                                                                                                                                                                                   0.884057971014493, 0.979020979020979, 0.916666666666667, 0.930555555555556, 
                                                                                                                                                                                                   0.976190476190476, 0.939393939393939, 0.935064935064935, 0.9, 
                                                                                                                                                                                                   0.939024390243902, 0.966666666666667, 0.953125, 0.920634920634921, 
                                                                                                                                                                                                   0.961165048543689, 0.848484848484849, 1, 1), chr1_1032001 = c(0.846153846153846, 
                                                                                                                                                                                                                                                                 0.666666666666667, 0.72, 0.701492537313433, 0.423076923076923, 
                                                                                                                                                                                                                                                                 0.526315789473684, 0.658536585365854, 0.729166666666667, 0.659090909090909, 
                                                                                                                                                                                                                                                                 0.55, 0.46875, 0.586206896551724, 0.574074074074074, 0.621212121212121, 
                                                                                                                                                                                                                                                                 0.597014925373134, 0.617647058823529, 0.777777777777778, 0.7),chr1_10399501 = c(0.210045662100457, 0.294117647058824, 0.0393518518518519, 
                                                                                                                                                                                                                                                                                                                                                 0.0558739255014327, 0.123737373737374, 0.0852994555353902, 
                                                                                                                                                                                                                                                                                                                                                 0.0453216374269006, 0.106451612903226, 0.119584055459272, 
                                                                                                                                                                                                                                                                                                                                                 0.0448275862068966, 0.0467741935483871, 0.0831643002028398, 
                                                                                                                                                                                                                                                                                                                                                 0.0418604651162791, 0.0635451505016722, 0.161290322580645), chr1_10400501 = c(0.634020618556701, 0.794117647058823, 
                                        0.895833333333333, 0.833333333333333, 0.928571428571429, 
                                        0.96969696969697, 0.892857142857143, 0.897435897435897, 0.858585858585859, 
                                        0.869565217391304, 0.9, 0.879194630872483, 0.875, 0.916666666666667, 
                                        0.911111111111111, 0.875, 0.8, 0.848484848484849), chr1_104933001 = c(0.892215568862275, 
                                                                                                              1, 0.347826086956522, 0.365853658536585, 0.603174603174603, 
                                                                                                              0.717948717948718, 0.632911392405063, 0, 0.472727272727273, 
                                                                                                              0.4, 0.478260869565217, 0.646153846153846, 0.540229885057471, 
                                                                                                              0.723076923076923, 0.574712643678161, 0.760869565217391, 
                                                                                                              0.4375, 0.285714285714286), chr1_10530501 = c(0.528301886792453, 
                                                                                                                                                            0.849056603773585, 0.803030303030303, 0.821782178217822, 
                                                                                                                                                            0.888888888888889, 0.827586206896552, 0.773584905660377, 
                                                                                                                                                            0.8125, 0.925373134328358, 0.849462365591398, 0.921052631578947, 
                                                                                                                                                            0.86046511627907, 0.866666666666667, 0.75, 0.911764705882353, 
                                                                                                                                                            0.857142857142857, 0.833333333333333, 0.914634146341463), 
                   chr1_105850

<details>
<summary>英文:</summary>

Here is my reproduceable data and code:

    dd&lt;-structure(list(chr1_1005501 = c(0.597222222222222, 0.75, 0.775, 
                                    0.732456140350877, 0.860696517412935, 0.777777777777778, 0.654545454545455, 
                                    0.811224489795918, 0.742424242424242, 0.806146572104019, 0.828402366863905, 
                                    0.718181818181818, 0.85663082437276, 0.853868194842407, 0.849624060150376, 
                                    0.759615384615385, 0.798611111111111, 1), chr1_1018001 = c(0.615384615384615, 
                                                                                               1, 0.95959595959596, 0.976415094339623, 0.979166666666667, 0.920634920634921, 
                                                                                               0.981481481481482, 1, 0.94578313253012, 0.967213114754098, 0.986486486486487, 
                                                                                               0.952380952380952, 0.965116279069767, 0.986238532110092, 0.975206611570248, 
                                                                                               0.941860465116279, 0.973333333333333, 0.866666666666667), chr1_1021001 = c(0.286713286713287, 
                                                                                                                                                                          0.5625, 0.606741573033708, 0.509433962264151, 0.233766233766234, 
                                                                                                                                                                          0.402684563758389, 0.696, 0.454545454545455, 0.495934959349593, 
                                                                                                                                                                          0.381642512077295, 0.444444444444444, 0.580645161290323, 0.443708609271523, 
                                                                                                                                                                          0.482269503546099, 0.595890410958904, 0.422535211267606, 0.397435897435897, 
                                                                                                                                                                          0.7), chr1_102234501 = c(0.782312925170068, 0.857142857142857, 
                                                                                                                                                                                                   0.884057971014493, 0.979020979020979, 0.916666666666667, 0.930555555555556, 
                                                                                                                                                                                                   0.976190476190476, 0.939393939393939, 0.935064935064935, 0.9, 
                                                                                                                                                                                                   0.939024390243902, 0.966666666666667, 0.953125, 0.920634920634921, 
                                                                                                                                                                                                   0.961165048543689, 0.848484848484849, 1, 1), chr1_1032001 = c(0.846153846153846, 
                                                                                                                                                                                                                                                                 0.666666666666667, 0.72, 0.701492537313433, 0.423076923076923, 
                                                                                                                                                                                                                                                                 0.526315789473684, 0.658536585365854, 0.729166666666667, 0.659090909090909, 
                                                                                                                                                                                                                                                                 0.55, 0.46875, 0.586206896551724, 0.574074074074074, 0.621212121212121, 
                                                                                                                                                                                                                                                                 0.597014925373134, 0.617647058823529, 0.777777777777778, 0.7),chr1_10399501 = c(0.210045662100457, 0.294117647058824, 0.0393518518518519, 
                                                                                                                                                                                                                                                                                                                                                 0.0558739255014327, 0.123737373737374, 0.0852994555353902, 
                                                                                                                                                                                                                                                                                                                                                 0.0453216374269006, 0.106451612903226, 0.119584055459272, 
                                                                                                                                                                                                                                                                                                                                                 0.0448275862068966, 0.0467741935483871, 0.0831643002028398, 
                                                                                                                                                                                                                                                                                                                                                 0.0418604651162791, 0.0635451505016722, 0.161290322580645), chr1_10400501 = c(0.634020618556701, 0.794117647058823, 
                                        0.895833333333333, 0.833333333333333, 0.928571428571429, 
                                        0.96969696969697, 0.892857142857143, 0.897435897435897, 0.858585858585859, 
                                        0.869565217391304, 0.9, 0.879194630872483, 0.875, 0.916666666666667, 
                                        0.911111111111111, 0.875, 0.8, 0.848484848484849), chr1_104933001 = c(0.892215568862275, 
                                                                                                              1, 0.347826086956522, 0.365853658536585, 0.603174603174603, 
                                                                                                              0.717948717948718, 0.632911392405063, 0, 0.472727272727273, 
                                                                                                              0.4, 0.478260869565217, 0.646153846153846, 0.540229885057471, 
                                                                                                              0.723076923076923, 0.574712643678161, 0.760869565217391, 
                                                                                                              0.4375, 0.285714285714286), chr1_10530501 = c(0.528301886792453, 
                                                                                                                                                            0.849056603773585, 0.803030303030303, 0.821782178217822, 
                                                                                                                                                            0.888888888888889, 0.827586206896552, 0.773584905660377, 
                                                                                                                                                            0.8125, 0.925373134328358, 0.849462365591398, 0.921052631578947, 
                                                                                                                                                            0.86046511627907, 0.866666666666667, 0.75, 0.911764705882353, 
                                                                                                                                                            0.857142857142857, 0.833333333333333, 0.914634146341463), 
                   chr1_1058501 = c(0, 0.0434782608695652, 0.520833333333333, 
                                    0.444444444444444, 0.370967741935484, 0.309859154929577, 
                                    0.220338983050847, 0.477272727272727, 0.344827586206897, 
                                    0.524590163934426, 0.294117647058824, 0.168067226890756, 
                                    0.338983050847458, 0.446153846153846, 0.39344262295082, 0.444444444444444, 
                                    0.319148936170213, 0.136363636363636), Sample = structure(c(1L, 
                                                                                                1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 
                                                                                                6L), levels = c(&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;, &quot;E&quot;, &quot;F&quot;), class = &quot;factor&quot;)), row.names = c(&quot;A1&quot;, 
                                                                                                                                                                                 &quot;A2&quot;, &quot;B1&quot;, &quot;B2&quot;, &quot;C1&quot;, 
                                                                                                                                                                                 &quot;C2&quot;, &quot;D1&quot;, &quot;D2&quot;, &quot;E1&quot;, 
                                                                                                                                                                                 &quot;E2&quot;, &quot;E3&quot;, &quot;F1&quot;, &quot;F2&quot;, 
                                                                                                                                                                                 &quot;F3&quot;, &quot;F4&quot;, &quot;F5&quot;, &quot;F6&quot;, 
                                                                                                                                                                                 &quot;F7&quot;), class = &quot;data.frame&quot;)
           
    dd
    pca_result &lt;- prcomp(dd[,-ncol(dd)],
                             scale=T,  # 
                             center = T
                             #num.cores = 30
        )
    Sample=factor(c(rep(&quot;A&quot;,2),
                        rep(&quot;B&quot;,2),
                        rep(&quot;C&quot;,2),
                        rep(&quot;D&quot;,2),
                        rep(&quot;E&quot;,3),
                        rep(&quot;F&quot;,7)),levels = unique(dd$Sample))
        
        
    ggbiplot(pca_result, 
                         var.axes=F,            # 
                         obs.scale = 1,         # 
                         var.scale = 1,
                         groups = Sample,  # 
                         ellipse = F,           # 
                         #ellipse.prob = 0.67,
                         circle = F)+ 
          geom_point(aes(color = Sample), size = 1) +
          #geom_encircle(aes(fill=Sample), alpha = 0.3, show.legend = F,colour=NA) +
          ggforce::geom_mark_ellipse(aes(fill = Sample,
                                         color = Sample))

##############

I made a pca plot and then I want to add confidence ellipse by **ggplot2 or ggforce or factoextra or  ggbiplot package.**

But there is a small problem that there are 6 groups but some of them have only two samples,

So I tried ggplot2 or factoextra and set related parameters:

such as: in ggbiplot

    ellipse.type = &quot;confidence&quot;, ellipse.level = 0.95

Or in factoextra:
I used fviz_pca_biplot() or fviz_pca_ind():

    fviz_pca_biplot(pca_result, 
                    col.ind = Sample,
                    addEllipses = T,
                    ellipse.type = &quot;confidence&quot;,
                    ellipse.level = 0.95,
                    col.var = &quot;black&quot;,
                    repel = TRUE )
        


But it reminds me with:   `Too few points to calculate an ellipse`

So, I tried the last choice: ggforce.

      +ggforce::geom_mark_ellipse(aes(fill = Sample,
                                     color = Sample))
It works well and I got beautiful figure.

**However, my question is I am not sure if the regions in my plot mean confidence intervel or other meaning that I don&#39;t know.**
Many packages like factoextra have said that circles produced by it means confidence interval.
But I didn&#39;t see similar description like it in ggforce package.

[![enter image description here][1]][1]
So could somebody give me some advice or explanation. 
Thanks in advance.


  [1]: https://i.stack.imgur.com/KJj6K.png

</details>


# 答案1
**得分**: 1

`geom_mark_ellipse`中的省略号只是绘制了能够包围每个组中的点的可能最小椭圆。如帮助文件所述:
&gt; 用Khachiyan算法估算出包围的椭圆

Khachiyan算法是一种确定能够包围一组点的最小体积椭圆的方法,在二维空间(如您的双标图中),这意味着能够覆盖所有点的最小椭圆。

`geom_mark_ellipse`实际上是用于注释的,因为默认情况下,它会将椭圆扩展5mm,并允许在每个椭圆内添加标签。这就是为什么不会立即显而易见地看到该函数是通过找到最小包围椭圆来工作的原因。如果您仔细查看您的图表,您会注意到由于这个固定距离扩展,一些组不是真正的椭圆。

您可以通过设置`expand = 0`来获得实际的最小椭圆,这时可以明显看出椭圆只是包围所有点的最小椭圆:

``` r
ggbiplot(pca_result, 
         var.axes = FALSE,
         obs.scale = 1,   
         var.scale = 1,
         groups = Sample, 
         ellipse = FALSE,         
         circle = FALSE)+ 
  geom_point(aes(color = Sample), size = 1) +
  ggforce::geom_mark_ellipse(aes(fill = Sample,
                                 color = Sample), expand = unit(0, 'mm'),
                             tol = 0.001)

在这里,我们可以看到椭圆与每个组中最外层的点重合。当只有两个点时,结果可能只是一个零宽度的椭圆(即一条直线),但似乎内置了最大长宽比以保持明显的椭圆形状以美观为目的。

英文:

The ellipses in geom_mark_ellipse simply draws the smallest possible ellipse that can enclose the points in each group. As the help file states:
> The enclosing ellipses are estimated using the Khachiyan algorithm

The Khachiyan algorithm is a method for determining the smallest volume ellipsoid that can enclose a group of points, and in a 2D space such as your biplot, this means the smallest ellipse that can cover all the points.

geom_mark_ellipse is really intended for annotations, since by default it will expand the ellipse by 5mm and allows for labels within each ellipse. This is why it is not immediately apparent that the function works by finding the minimum enclosing ellipse. If you look carefully at your plot, you will notice that some of the groups are not true ellipses due to this fixed - distance expansion.

You can get the actual smallest ellipse by setting expand = 0, where it becomes apparent that the ellipse is simply the smallest ellipse that encloses all the points:

ggbiplot(pca_result, 
         var.axes = FALSE,
         obs.scale = 1,   
         var.scale = 1,
         groups = Sample, 
         ellipse = FALSE,         
         circle = FALSE)+ 
  geom_point(aes(color = Sample), size = 1) +
  ggforce::geom_mark_ellipse(aes(fill = Sample,
                                 color = Sample), expand = unit(0, &#39;mm&#39;),
                             tol = 0.001)

ggforce包中geom_mark_ellipse函数生成的区域是什么?

Here we can see that the ellipses are coincident with the outermost points in each group. Where there are only two points, the result should probably just be a zero-width ellipse (i.e. a straight line), but there seems to be a maximum aspect ratio built in to keep the shape obviously elliptical for aesthetic purposes.

huangapple
  • 本文由 发表于 2023年6月15日 18:13:22
  • 转载请务必保留本文链接:https://go.coder-hub.com/76481466.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定