Create a binary table where the number of rows comes from the values of a variable

Question

I have a dataset with two count variables (var1 and var2) that represent the number of individuals alive at two time points.
I need to create a table with a new binary variable, where the number of rows corresponds to the value of var1 and, within this variable, the number of "1" entries is given by var2 (e.g. if var1 is 10 and var2 is 3, then I need 10 rows, of which 3 have "1" and the rest have "0").
Is there any function that would help?
Thanks
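
For concreteness, the single case described in the question (10 in var1, 3 in var2) can be sketched in R as below. This is only an illustration of the target shape, assuming var2 never exceeds var1; the object name binary is made up for the example, and rep is base R.

```r
# Illustrative values taken from the example in the question
var1 <- 10  # individuals alive at the first time point
var2 <- 3   # individuals still alive at the second time point

# var2 ones followed by (var1 - var2) zeros
binary <- rep(c(1, 0), times = c(var2, var1 - var2))

# One row per individual counted in var1
data.frame(binary = binary)
```

If var2 were larger than var1, the second count would be negative and rep would stop with an error, so such rows would need to be handled beforehand.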

Answer 1

Score: 1

I may have misunderstood what you wanted, but here is what I did:

    # Requires R >= 4.1 for the \(x) lambda and the |> pipe
    exemple <- data.frame(id = 1:3, var1 = c(10, 5, 1), var2 = c(3, 3, 0))
    # For each row, build a data.frame with var1 rows: var2 ones, then zeros
    apply(exemple, 1, \(x) data.frame(
      id = rep(x[["id"]], times = x[["var1"]]),
      var_binaire = rep(c(1, 0), times = c(x[["var2"]], x[["var1"]] - x[["var2"]]))
    )) |>
      do.call(what = "rbind")  # stack the per-row data.frames

It seems to do the trick: it creates a data.frame by repeating 1s and 0s with the function rep. apply is used with MARGIN = 1 to operate row by row, so here it returns a list in which each element is a data.frame with var1 rows, of which the first var2 have var_binaire = 1. Then do.call("rbind") combines them all into a single data.frame. Here is the result:

    # exemple
      id var1 var2
    1  1   10    3
    2  2    5    3
    3  3    1    0

    # output
       id var_binaire
    1   1           1
    2   1           1
    3   1           1
    4   1           0
    5   1           0
    6   1           0
    7   1           0
    8   1           0
    9   1           0
    10  1           0
    11  2           1
    12  2           1
    13  2           1
    14  2           0
    15  2           0
    16  3           0

I thought it would be possible without apply, but I couldn't figure out how. Maybe a better solution will come up. I hope this helped.
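
One possible way to drop apply, offered as a sketch and not as the answerer's method, is to iterate over the columns in parallel with base R's Map. A side benefit is that the data.frame is never coerced to a matrix, which apply does first and which would turn every value into a character string if the data also had a non-numeric column. The names pieces, result, n_total and n_ones are made up for this illustration.

```r
exemple <- data.frame(id = 1:3, var1 = c(10, 5, 1), var2 = c(3, 3, 0))

# Build one data.frame per input row, walking the three columns in parallel
pieces <- Map(
  function(id, n_total, n_ones) {
    data.frame(
      id = rep(id, times = n_total),
      var_binaire = rep(c(1, 0), times = c(n_ones, n_total - n_ones))
    )
  },
  exemple$id, exemple$var1, exemple$var2
)

# Stack the pieces into a single data.frame
result <- do.call(rbind, pieces)

# Sanity checks: one row per individual in var1, one "1" per individual in var2
nrow(result) == sum(exemple$var1)
sum(result$var_binaire) == sum(exemple$var2)
```

Both checks should come out TRUE for the example data, since the row count must equal the total of var1 and the number of ones must equal the total of var2.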

Answer 2

Score: 1

An approach quite similar to @Guillaume Mulier's, but vectorized:

    x <- data.frame(id = 1:3, var1 = c(10, 5, 1), var2 = c(3, 3, 0))
    # Interleave the counts of 1s and 0s per row, then repeat 1:0 that many times
    with(x, data.frame(id = rep(id, var1),
                       mort = rep(rep(1:0, nrow(x)),
                                  c(matrix(c(var2, var1 - var2), 2, byrow = TRUE)))))
    #    id mort
    # 1   1    1
    # 2   1    1
    # 3   1    1
    # 4   1    0
    # 5   1    0
    # 6   1    0
    # 7   1    0
    # 8   1    0
    # 9   1    0
    # 10  1    0
    # 11  2    1
    # 12  2    1
    # 13  2    1
    # 14  2    0
    # 15  2    0
    # 16  3    0
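
The key step in this answer is the vector of run lengths passed to rep. The sketch below unpacks it and compares it with an alternative formulation based on base R's sequence; the alternative is an assumption of this note and not part of the original answer, and the names run_lengths, mort and mort_seq are illustrative.

```r
x <- data.frame(id = 1:3, var1 = c(10, 5, 1), var2 = c(3, 3, 0))

# Run lengths interleaved per row: ones, zeros, ones, zeros, ...
# rbind() stacks the two count vectors as rows; c() then reads the
# matrix column by column, which gives the interleaved order.
run_lengths <- with(x, c(rbind(var2, var1 - var2)))
run_lengths
# [1] 3 7 3 2 0 1

# Same values as the matrix(..., byrow = TRUE) construction above
mort <- rep(rep(1:0, nrow(x)), run_lengths)

# Alternative: number the rows within each id and flag the first var2 of them
mort_seq <- with(x, as.integer(sequence(var1) <= rep(var2, var1)))

identical(mort, mort_seq)
# [1] TRUE
```

The sequence version avoids building the interleaved vector at all, at the cost of being slightly less obvious to read; either one can be dropped into the data.frame call as the mort column.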

huangapple
  • This article was posted on June 22, 2023 at 19:21:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76531362.html