March 7, 2023, 01:33:47

Using `each` vs `times` for a vector in R?

Question

In the following dataset,

    df_final <- data.frame(month = rep(c(1:12), each = 30 * 2),
                           site = rep(c("1","2","3","4","5","6"), times = 60 * 2),
                           quad = rep(c(1:5), times = 72 * 2),
                           seed = rep(c(1:2), times = 360))

every month (1-12) has 6 sites (1-6), each with 5 quadrants (1-5) in `quad`. So one dataset has 12 * 6 * 5 = 360 rows. Here I have two such datasets (360 * 2 = 720 rows), and I could also have 1000 of them. I want to assign a different seed to each unique dataset.

[![enter image description here][1]][1]

The output should look like this:

    month site quad seed
       1     1    1     1
       1     1    2     1
       1     1    3     1
       1     1    4     1
       1     1    5     1
       1     1    1     2
       1     1    2     2
       1     1    3     2
       1     1    4     2
       1     1    5     2

I tried `seed = rep(c(1:2), each = 360)`, but that did not work either. How can I get this output? I want a general approach, so that I can do the same for 1000 seeds.

  [1]: https://i.stack.imgur.com/RkeUb.png
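As an illustration of the `each`/`times` distinction in the title (not part of the original post): `times` repeats the whole vector, while `each` repeats every element in place, and nesting the two gives the block pattern in the expected output. The sketch below assumes the intended layout is the one shown in that output, with seeds nested inside each month/site block; the helper `make_design()` and its argument `n_seeds` are hypothetical names.

```r
# times repeats the whole vector; each repeats every element in place
rep(1:2, times = 3)  # 1 2 1 2 1 2
rep(1:2, each = 3)   # 1 1 1 2 2 2

# Hypothetical helper: n_seeds copies of the 12 x 6 x 5 design,
# with the copies nested inside each month/site block (an assumption
# based on the expected output in the question)
make_design <- function(n_seeds) {
  data.frame(
    month = rep(1:12, each = 6 * 5 * n_seeds),
    site  = rep(rep(as.character(1:6), each = 5 * n_seeds), times = 12),
    quad  = rep(1:5, times = 12 * 6 * n_seeds),
    seed  = rep(rep(seq_len(n_seeds), each = 5), times = 12 * 6)
  )
}

df_final <- make_design(2)
nrow(df_final)       # 720
head(df_final, 10)   # quad 1-5 with seed 1, then quad 1-5 with seed 2
```

Under the same assumption, `make_design(1000)` would give 360,000 rows with seeds 1 to 1000 in the same pattern.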


Answer 1

Score: 1

You can use `cur_group_id()`:

    library(dplyr)
    df_final %>%
      group_by(month, site) %>%
      mutate(seed = cur_group_id()) %>%
      ungroup() %>%
      arrange(month, site)

Or use `consecutive_id()` after sorting by those columns:

    library(dplyr) # 1.1.0+
    df_final %>%
      arrange(month, site) %>%
      mutate(seed = consecutive_id(month, site))
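To see what these two helpers return, here is a small sketch on a made-up data frame (`toy` and its values are hypothetical, not from the question). Both approaches number the distinct `month`/`site` combinations, so after sorting they should agree.

```r
library(dplyr)  # consecutive_id() needs dplyr >= 1.1.0

# Hypothetical toy data: 2 months x 2 sites, 2 rows per combination, unsorted
toy <- data.frame(
  month = c(1, 2, 1, 2, 1, 2, 1, 2),
  site  = c("a", "a", "b", "b", "a", "a", "b", "b")
)

# One id per (month, site) group, then sort the rows
by_group <- toy %>%
  group_by(month, site) %>%
  mutate(seed = cur_group_id()) %>%
  ungroup() %>%
  arrange(month, site)

# Sort first, then give each run of identical (month, site) rows a new id
by_run <- toy %>%
  arrange(month, site) %>%
  mutate(seed = consecutive_id(month, site))

by_group$seed                          # 1 1 2 2 3 3 4 4
all(by_group$seed == by_run$seed)      # TRUE
```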


Answer 2

Score: 1

Using data.table:

    library(data.table)
    setDT(df_final)[order(month, site), seed := rleid(month, site)]

Or:

    setDT(df_final)[, seed := .GRP, by = .(month, site)]
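A small sketch of how the two data.table forms number groups, on a made-up table (`toy`, `grp`, and `run` are hypothetical names, not from the question). `.GRP` with `by` numbers groups in order of first appearance, while `rleid()` applied to rows ordered by `month` and `site` numbers them in sorted order, so the ids can differ when the table is not already sorted.

```r
library(data.table)

# Hypothetical toy data: 2 months x 2 sites, 2 rows per combination, unsorted
toy <- data.table(
  month = c(1, 2, 1, 2, 1, 2, 1, 2),
  site  = c("a", "a", "b", "b", "a", "a", "b", "b")
)

# Group counter: ids follow the order in which groups first appear
toy[, grp := .GRP, by = .(month, site)]

# Run-length id computed on the rows taken in (month, site) order,
# written back to those rows without reordering the table
toy[order(month, site), run := rleid(month, site)]

toy$grp  # 1 2 3 4 1 2 3 4
toy$run  # 1 3 2 4 1 3 2 4
```

Either column gives one id per `month`/`site` combination; only the numbering order differs here.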

