Using `each` vs `times` for a vector in R?

Question

In the following dataset,

    df_final <- data.frame(month = rep(c(1:12), each = 30 * 2),
                           site = rep(c("1","2","3","4","5","6"), times = 60 * 2),
                           quad = rep(c(1:5), times = 72 * 2),
                           seed = rep(c(1:2), times = 360))

every month (1-12) has 6 sites (1-6), each with 5 quadrants (1-5) in `quad`. So one dataset has 12 * 6 * 5 = 360 rows. Here, I have two such datasets (360 * 2 rows). I could also have 1000 such datasets. For every unique dataset, I want to assign a different seed.

[![enter image description here][1]][1]

The output should look like:

    month site quad seed
        1    1    1    1
        1    1    2    1
        1    1    3    1
        1    1    4    1
        1    1    5    1
        1    1    1    2
        1    1    2    2
        1    1    3    2
        1    1    4    2
        1    1    5    2

I tried using `seed = rep(c(1:2), each = 360)`, but that didn't work either. How can I get this output? I want a general way to do it, so that I can do the same for 1000 seeds.

  [1]: https://i.stack.imgur.com/RkeUb.png
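As a quick illustration of the two arguments named in the title, here is a minimal sketch (not from the original post; the variable names are made up for the example) of how `times` and `each` differ on a short vector:

```r
# `times` repeats the whole vector end to end.
by_times <- rep(1:2, times = 3)   # 1 2 1 2 1 2

# `each` repeats every element in place before moving to the next one.
by_each <- rep(1:2, each = 3)     # 1 1 1 2 2 2

# When both are given, `each` is applied first, then `times`.
by_both <- rep(1:2, each = 2, times = 2)   # 1 1 2 2 1 1 2 2
```

So `each = 360` puts all rows for seed 1 before all rows for seed 2, while `times = 360` alternates the seed on every row.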

Answer 1

Score: 1

You can use `cur_group_id()`:

    library(dplyr)
    df_final %>% 
      group_by(month, site) %>% 
      mutate(seed = cur_group_id()) %>%   # integer id of the current (month, site) group
      ungroup() %>% 
      arrange(month, site)

Or use `consecutive_id()` after arranging the columns:

    library(dplyr) # requires dplyr 1.1.0+
    df_final %>% 
      arrange(month, site) %>% 
      mutate(seed = consecutive_id(month, site))   # new id whenever month or site changes

Answer 2

Score: 1

Using data.table:

    library(data.table)
    # run-length id over the rows taken in (month, site) order
    setDT(df_final)[order(month, site), seed := rleid(month, site)]

Or:

    # .GRP is the group counter for each (month, site) group
    setDT(df_final)[, seed := .GRP, .(month, site)]
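
For comparison, the row order shown in the question's expected output can also be sketched in base R by combining `each` and `times` in a single `rep()` call. This is only a sketch under assumptions: `n_seeds`, the other `n_*` names, and `df_sketch` are hypothetical names that do not appear in the original post, and it assumes the seed should cycle within each month and site exactly as in the expected output.

```r
# Hypothetical sizes; set n_seeds to 1000 for 1000 datasets.
n_month <- 12
n_site  <- 6
n_quad  <- 5
n_seeds <- 2

df_sketch <- data.frame(
  # slowest-changing column: one block of rows per month
  month = rep(seq_len(n_month), each = n_site * n_seeds * n_quad),
  # within a month, each site covers all seeds and quadrants
  site  = rep(as.character(seq_len(n_site)), each = n_seeds * n_quad, times = n_month),
  # fastest-changing column: quadrants cycle 1..n_quad on every row
  quad  = rep(seq_len(n_quad), times = n_month * n_site * n_seeds),
  # within a site, each seed covers one full set of quadrants
  seed  = rep(seq_len(n_seeds), each = n_quad, times = n_month * n_site)
)

head(df_sketch, 10)
```

The idea is that `each` is set to the number of rows spanned by the faster-changing columns, and `times` to the number of blocks formed by the slower-changing ones, so only `n_seeds` needs to change to scale up.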

huangapple
  • Posted on March 7, 2023 at 01:33:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75654032.html