Why am I getting the error "incorrect number of names" when using barplot in RStudio?

Question

I'm trying to make a barplot of ancestry proportions in RStudio, and I got the error "incorrect number of names".

This is the code I ran:

```{r}
# Load the data
datos <- read.table("admix.txt", header = TRUE)

# Define the population names
Poblacion <- c("America", "Europa", "Eurasia", "EurAm", "Europasia")

# Create the admixture plot
barplot(t(as.matrix(datos[, -(1)])), col = rainbow(ncol(datos) - 1),
        xlab = "Individuo", ylab = "Proporción", names.arg = Poblacion)
```

My dataset:

Poblacion origen1 origen2
America 0.006666 0.993334
America 0.779961 0.220039
America 0.427611 0.572389
America 0.813640 0.186360
America 0.652604 0.347396
America 0.499865 0.500135
America 0.290712 0.709288
America 0.447847 0.552153
America 0.840954 0.159046
America 0.523092 0.476908
America 0.000010 0.999990
America 0.143286 0.856714
America 0.472235 0.527765
America 0.771131 0.228869
America 0.511068 0.488932
America 0.025474 0.974526
America 0.000010 0.999990
America 0.005296 0.994704
America 0.685525 0.314475
America 0.418856 0.581144
America 0.653668 0.346332
America 0.225173 0.774827
America 0.383285 0.616715
America 0.058886 0.941114
America 0.009342 0.990658
America 0.015007 0.984993
America 0.002664 0.997336
America 0.000010 0.999990
America 0.145986 0.854014
America 0.000010 0.999990
America 0.015244 0.984756
America 0.000010 0.999990
America 0.000010 0.999990
America 0.167392 0.832608
America 0.640400 0.359600
EurAm 0.000648 0.999352
EurAm 0.255487 0.744513
EurAm 0.000010 0.999990
EurAm 0.450210 0.549790
EurAm 0.000010 0.999990
EurAm 0.546981 0.453019
EurAm 0.484598 0.515402
EurAm 0.086021 0.913979
EurAm 0.285348 0.714652
EurAm 0.031093 0.968907
EurAm 0.069430 0.930570
EurAm 0.037918 0.962082
EurAm 0.022321 0.977679
EurAm 0.320998 0.679002
EurAm 0.106400 0.893600
EurAm 0.048877 0.951123
EurAm 0.182298 0.817702
EurAm 0.031725 0.968275
EurAm 0.312833 0.687167
EurAm 0.457584 0.542416
EurAm 0.054852 0.945148
EurAm 0.553960 0.446040
EurAm 0.002580 0.997420
EurAm 0.025126 0.974874
EurAm 0.999990 0.000010
EurAm 0.000010 0.999990
EurAm 0.147882 0.852118
EurAm 0.000010 0.999990
EurAm 0.221932 0.778068
EurAm 0.181649 0.818351
EurAm 0.595149 0.404851
EurAm 0.681347 0.318653
EurAm 0.000010 0.999990
EurAm 0.702988 0.297012
EurAm 0.000010 0.999990
EurAm 0.002774 0.997226
Eurasia 0.005494 0.994506
Eurasia 0.000010 0.999990
Eurasia 0.019013 0.980987
Eurasia 0.019751 0.980249
Eurasia 0.023125 0.976875
Eurasia 0.335525 0.664475
Eurasia 0.019229 0.980771
Eurasia 0.028028 0.971972
Eurasia 0.000010 0.999990
Eurasia 0.667998 0.332002
Eurasia 0.000010 0.999990
Europa 0.021506 0.978494
Europa 0.085614 0.914386
Europa 0.002423 0.997577
Europa 0.136019 0.863981
Europa 0.000010 0.999990
Europa 0.001705 0.998295
Europa 0.008959 0.991041
Europa 0.005611 0.994389
Europa 0.000010 0.999990
Europa 0.000010 0.999990
Europa 0.011926 0.988074
Europa 0.685324 0.314676
Europa 0.026084 0.973916
Europa 0.000010 0.999990
Europa 0.016599 0.983401
Europa 0.007035 0.992965
Europa 0.132058 0.867942
Europa 0.005673 0.994327
Europa 0.000010 0.999990
Europa 0.007433 0.992567
Europa 0.022336 0.977664
Europa 0.000010 0.999990
Europa 0.076555 0.923445
Europa 0.205925 0.794075
Europa 0.023510 0.976490
Europa 0.003213 0.996787
Europa 0.000010 0.999990
Europa 0.000010 0.999990
Europa 0.020198 0.979802
Europa 0.000010 0.999990
Europa 0.174797 0.825203
Europa 0.130237 0.869763
Europa 0.128710 0.871290
Europa 0.015761 0.984239
Europa 0.016476 0.983524
Europa 0.016811 0.983189
Europa 0.000863 0.999137
Europa 0.162520 0.837480
Europa 0.000010 0.999990
Europa 0.004684 0.995316
Europa 0.019208 0.980792
Europa 0.492487 0.507513
Europa 0.000010 0.999990
Europa 0.000010 0.999990
Europa 0.015666 0.984334
Europa 0.000010 0.999990
Europa 0.018586 0.981414
Europa 0.228070 0.771930
Europa 0.054701 0.945299
Europa 0.015723 0.984277
Europa 0.000010 0.999990
Europa 0.147377 0.852623
Europa 0.000010 0.999990
Europa 0.015433 0.984567
Europa 0.194324 0.805676
Europa 0.142146 0.857854
Europa 0.181220 0.818780
Europa 0.003677 0.996323
Europa 0.355231 0.644769
Europa 0.402608 0.597392
Europa 0.067520 0.932480
Europa 0.171952 0.828048
Europa 0.014737 0.985263
Europa 0.000010 0.999990
Europa 0.003896 0.996104
Europa 0.000010 0.999990
Europa 0.795202 0.204798
Europa 0.006578 0.993422
Europa 0.021397 0.978603
Europa 0.145587 0.854413
Europa 0.062430 0.937570
Europa 0.000010 0.999990
Europa 0.012280 0.987720
Europa 0.999990 0.000010
Europa 0.020080 0.979920
Europa 0.134631 0.865369
Europasia 0.023057 0.976943
Europasia 0.000010 0.999990
Europasia 0.016871 0.983129
Europasia 0.058525 0.941475

origen1 and origen2 are the ancestry proportions for each individual, obtained with ADMIXTURE.

I tried the code shown above. I expected this:
[Image: the expected plot]

Answer 1

Score: 0

Your Poblacion vector has length 5, but barplot draws one bar per individual, so names.arg must supply one name per bar; barplot does not repeat the five population names across individuals for you. May I suggest using ggplot2 instead? This may not be the most efficient or elegant solution, but it gets you a few steps closer to what you want.
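To make the length mismatch concrete, here is a minimal sketch that reproduces it. The six-row `datos` data frame and the `short_names` variable are made up for illustration and only stand in for admix.txt; the real file has many more individuals.

```r
# Toy stand-in for admix.txt (hypothetical values: 6 individuals, 2 populations)
datos <- data.frame(
  Poblacion = c("America", "America", "America", "Europa", "Europa", "Europa"),
  origen1   = c(0.10, 0.80, 0.40, 0.02, 0.00, 0.70),
  origen2   = c(0.90, 0.20, 0.60, 0.98, 1.00, 0.30)
)

# After transposing: one row per origin, one column (one bar) per individual
m <- t(as.matrix(datos[, -1]))
n_bars <- ncol(m)

# One name per population is shorter than the number of bars
short_names <- unique(datos$Poblacion)
length(short_names) == n_bars  # FALSE, so barplot() rejects it as names.arg

# Capture the error message instead of stopping the script
msg <- tryCatch(
  { barplot(m, names.arg = short_names); NA_character_ },
  error = function(e) conditionMessage(e)
)
print(msg)

# One name per bar is accepted: reuse the per-individual population column
barplot(m, col = rainbow(nrow(m)), names.arg = datos$Poblacion, las = 2)
```

Passing the full `datos$Poblacion` column works, but it prints a population name under every bar, which is rarely what you want with many individuals.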

I read in your data and called it df.

    # Install these packages first if you don't already have them
    library(tidyverse)  # loads ggplot2 and forcats
    library(reshape2)   # for melt()

    df <- read.table("admix.txt", header = TRUE)
    df$Poblacion <- factor(df$Poblacion)
    df$ID <- as.numeric(row.names(df))

    # Reshape the data set to long format for ggplot
    melt.df <- melt(df, id.vars = c("ID", "Poblacion"), value.name = "percentage")

    # > head(melt.df)
    #   ID Poblacion variable percentage
    # 1  1   America  origen1   0.006666
    # 2  2   America  origen1   0.779961
    # 3  3   America  origen1   0.427611
    # 4  4   America  origen1   0.813640
    # 5  5   America  origen1   0.652604
    # 6  6   America  origen1   0.499865

    # For labeling the x axis: the median ID per region
    ID.medians <- aggregate(ID ~ Poblacion, data = melt.df, FUN = median)

    ggplot(data = melt.df) +
      # Stacked bars that sum to 1 per individual; fct_rev() reverses the
      # fill levels so that origen1 is stacked at the bottom of each bar
      geom_bar(aes(x = ID, y = percentage, fill = forcats::fct_rev(variable),
                   color = Poblacion), stat = "identity") +
      # Remove the legend title
      theme(legend.title = element_blank()) +
      # Put the region names on the x axis
      scale_x_continuous(breaks = ID.medians$ID,
                         labels = as.character(ID.medians$Poblacion),
                         minor_breaks = NULL) +
      # Labels for the x and y axes
      labs(x = "", y = "Percentage origen1")

This is what I get:
[Image: the resulting stacked bar plot, with the populations labeled on the x axis]

You will likely have to do quite a bit more work with colors, factor levels, and formatting. Hopefully this will get you started though.
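If you would rather stay with base graphics, the same idea (one label per population, centered under its block of bars) can be sketched with barplot itself. This is only a sketch under stated assumptions: the six-row `datos` data frame is made up to stand in for admix.txt, `mids` and `centres` are names chosen here, and the rows are assumed to be already grouped by population, as they are in the posted data.

```r
# Toy stand-in for admix.txt (hypothetical values), rows grouped by population
datos <- data.frame(
  Poblacion = c("America", "America", "America", "Europa", "Europa", "Europa"),
  origen1   = c(0.10, 0.80, 0.40, 0.02, 0.00, 0.70),
  origen2   = c(0.90, 0.20, 0.60, 0.98, 1.00, 0.30)
)

m <- t(as.matrix(datos[, c("origen1", "origen2")]))

# barplot() returns the x midpoint of every bar; axisnames = FALSE
# suppresses the per-bar names so the labels can be added by hand
mids <- barplot(m, col = rainbow(nrow(m)), border = NA, space = 0,
                axisnames = FALSE,
                xlab = "Individuo", ylab = "Proporción")

# Average the bar midpoints within each population to centre its label
centres <- tapply(mids, datos$Poblacion, mean)
axis(1, at = centres, labels = names(centres), tick = FALSE)
```

Because the labels are placed with axis() rather than names.arg, the number of labels no longer has to match the number of bars.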

huangapple
  • Posted on July 14, 2023, 04:38:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76683097.html