How to sample simultaneously from two columns in R?

Question

I am trying to draw samples of param1 and param2 from df_sample into df. I tried the setDT approach suggested elsewhere to pass in both columns, but the output only picks up the first one, param1. Any suggestions on how to fix this?

NOTE: The param1 and param2 values should be sampled independently. For example, in site 1, quad 1, it is either the 5.57 and 499.04 pair or the 3.56 and 499.37 pair, but ideally those values could be swapped as well.

    # DATA FRAMES
    set.seed(501)
    month <- rep(c("J","J","J","F","M"), each = 5)
    site <- rep(c("1","2","3","1","2"), each = 5)
    quad <- rep(c("1","2","3","4","5"), times = 5)
    df <- data.frame(month, site, quad)

    site <- rep(c("1","2","3"), each = 20)
    quad <- c("1","2","3","4","5","1","2","3","4","5",
              "1","2","3","4","5","1","2","3","4","5",
              "1","2","3","4","5","1","2","3","4","5",
              "2","2","3","4","5","1","2","3","4","5",
              "1","2","3","4","5","1","2","3","4","5",
              "1","1","3","4","5","1","1","3","4","5")
    param1 <- rnorm(60, 5, 1)
    param2 <- rnorm(60, 500, 1)
    df_sample <- data.frame(site, quad, param1, param2)

    library(dplyr)
    library(data.table)

    # Attempt: gather both columns per group, join onto df, then sample one value
    df <- setDT(df_sample)[, list(param = list(param1, param2)), by = list(site, quad)][
      setDT(df),
      on = c("site", "quad")][, param := sapply(param, sample, 1)][]

Answer 1

Score: 2

You can do something like the following:

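    library(dplyr)  # 1.1.0 or above needed for `.by`
    set.seed(1)

    # Number the rows within each site/quad group and record the group size
    df_sample <-
      df_sample %>%
      mutate(id = row_number(),
             max = max(id), .by = c(site, quad))

    # For each row of df, draw one row id from its group, then pull in that row's values
    df %>%
      left_join(distinct(df_sample, site, quad, max), by = c("site", "quad")) %>%
      mutate(id = sapply(max, sample, size = 1)) %>%
      left_join(df_sample, by = c("site", "quad", "max", "id"))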
       month site quad max id   param1   param2
    1      J    1    1   4  1 5.577281 499.0493
    2      J    1    2   4  4 4.813692 499.5953
    3      J    1    3   4  3 5.237906 500.0975
    4      J    1    4   4  1 5.225781 500.9566
    5      J    1    5   4  2 6.223762 501.5960
    6      J    2    1   3  1 6.151618 500.9271
    7      J    2    2   5  3 5.622492 499.7996
    8      J    2    3   4  2 4.479705 501.0895
    9      J    2    4   4  2 3.680826 500.7056
    10     J    2    5   4  3 6.491763 498.2079
    11     J    3    1   6  3 6.708715 498.4764
    12     J    3    2   2  1 5.360314 499.2427
    13     J    3    3   4  1 6.091806 499.8522
    14     J    3    4   4  1 5.515952 500.7180
    15     J    3    5   4  2 7.633465 499.8845
    16     F    1    1   4  2 3.566631 499.3791
    17     F    1    2   4  2 3.351793 500.1388
    18     F    1    3   4  2 5.035623 500.4443
    19     F    1    4   4  3 5.465357 499.9017
    20     F    1    5   4  1 4.155195 501.7882
    21     M    2    1   3  3 5.519431 499.2223
    22     M    2    2   5  5 4.725804 500.7145
    23     M    2    3   4  1 5.151815 499.6472
    24     M    2    4   4  1 5.455574 500.3809
    25     M    2    5   4  1 4.721432 499.4773
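
Note that drawing a single row id per group keeps each param1 together with the param2 from the same row, while the question asks for the two to be drawn independently. One possible way to decouple them is sketched below. This is only an illustration, not part of the answer above: the names `pools`, `p1`, `p2`, `pick_one` and `result` are made up for the example, and it assumes every site/quad combination in df also occurs in df_sample.

```r
library(dplyr)

set.seed(501)

# Rebuild the question's data (same values, written more compactly)
df <- data.frame(
  month = rep(c("J", "J", "J", "F", "M"), each = 5),
  site  = rep(c("1", "2", "3", "1", "2"), each = 5),
  quad  = rep(c("1", "2", "3", "4", "5"), times = 5)
)
df_sample <- data.frame(
  site   = rep(c("1", "2", "3"), each = 20),
  quad   = as.character(c(rep(1:5, 6), 2, 2:5, rep(1:5, 3), rep(c(1, 1, 3:5), 2))),
  param1 = rnorm(60, 5, 1),
  param2 = rnorm(60, 500, 1)
)

# Collect each group's candidate values into two separate list-columns
pools <- df_sample %>%
  group_by(site, quad) %>%
  summarise(p1 = list(param1), p2 = list(param2), .groups = "drop")

# Pick one element by index; indexing avoids sample()'s special handling
# of a single number, which would otherwise be treated as 1:x
pick_one <- function(x) x[sample.int(length(x), 1)]

# Draw param1 and param2 separately, so the pair need not come from the same row
result <- df %>%
  left_join(pools, by = c("site", "quad")) %>%
  mutate(
    param1 = vapply(p1, pick_one, numeric(1)),
    param2 = vapply(p2, pick_one, numeric(1))
  ) %>%
  select(-p1, -p2)

head(result)
```

Because each column is drawn with its own call, a param1 value from one row of a group can end up paired with a param2 value from a different row of the same group.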

Posted by huangapple on 2023-03-03 20:23:10
Link to this post: https://go.coder-hub.com/75627031.html