April 13, 2023, 20:06:38


I am looking for a shorter function to group similar datasets from a list

Question

I have data frames scattered across a list, and I would like to group the data frames that belong to the same year:

mydata12<-data.frame(Age=c(12,13),Sex=c("F","H"), Weight=c(70,75),year=c(2012))
mydata13<-data.frame(Age=c(14,15),Sex=c("F","H"), Weight=c(70,75),year=c(2013))
mydata14<-data.frame(Age=c(16,17),Sex=c("F","H"), Weight=c(70,75),year=c(2014))
mydata2012<-data.frame(Age=c(18,19),Sex=c("F","H"), Weight=c(70,75),year=c(2012))
mydata2013<-data.frame(Age=c(20,13),Sex=c("H","H"), Weight=c(70,75),year=c(2013))
mydata2014<-data.frame(Age=c(22,13),Sex=c("F","F"), Weight=c(70,75),year=c(2014))
mydatalist<-list(
  `12`=mydata12,
  `13`=mydata13,
  `14`=mydata14,
  `2012`=mydata2012,
  `2013`=mydata2013,
  `2014`=mydata2014
)

I can do it with this code:

list(`2012`=rbind(mydatalist$`12`,mydata2012),
     `2013`=rbind(mydatalist$`13`,mydata2013),
     `2014`=rbind(mydatalist$`14`,mydata2014))

But I would like to make it shorter, without writing a line of code for each year, since the list names already follow the patterns 2012:2021 and 12:21.


Answer 1

Score: 3

We could use bind_rows to stack the data frames and then split the result by year:

library(dplyr)
bind_rows(mydatalist) %>%
  split(f = as.factor(.$year))

$`2012`
  Age Sex Weight year
1  12   F     70 2012
2  13   H     75 2012
7  18   F     70 2012
8  19   H     75 2012
$`2013`
   Age Sex Weight year
3   14   F     70 2013
4   15   H     75 2013
9   20   H     70 2013
10  13   H     75 2013
$`2014`
   Age Sex Weight year
5   16   F     70 2014
6   17   H     75 2014
11  22   F     70 2014
12  13   F     75 2014


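Answer 2

Score: 3

In base R, either take the last two digits of each list name, or add 20 as a prefix to the names that have only two characters (as done here), then split the list by the resulting names and rbind each group:

out <- lapply(split(mydatalist, sub("^(\\d{2})$", "20\\1",
  names(mydatalist))), \(x) `row.names<-`(do.call(rbind, x), NULL))

Output: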

> out
$`2012`
  Age Sex Weight year
1  12   F     70 2012
2  13   H     75 2012
3  18   F     70 2012
4  19   H     75 2012
$`2013`
  Age Sex Weight year
1  14   F     70 2013
2  15   H     75 2013
3  20   H     70 2013
4  13   H     75 2013
$`2014`
  Age Sex Weight year
1  16   F     70 2014
2  17   H     75 2014
3  22   F     70 2014
4  13   F     75 2014
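
To make the one-liner above easier to follow, the same steps can be written out one at a time. This is only a sketch, not the answerer's code: the names `full_year`, `groups`, and `out_steps` are introduced here for illustration, the data is a reduced copy of the question's, and it assumes every list name is a two-digit or four-digit year that matches the data it holds.

```r
# Reduced example data; list names are assumed to match the year of each data frame.
mydatalist <- list(
  `12`   = data.frame(Age = c(12, 13), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `13`   = data.frame(Age = c(14, 15), Sex = c("F", "H"), Weight = c(70, 75), year = 2013),
  `2012` = data.frame(Age = c(18, 19), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `2013` = data.frame(Age = c(20, 13), Sex = c("H", "H"), Weight = c(70, 75), year = 2013)
)

# Step 1: turn two-digit names into four-digit years; other names are left unchanged.
full_year <- sub("^(\\d{2})$", "20\\1", names(mydatalist))

# Step 2: split the list itself, so each element holds the data frames of one year.
groups <- split(mydatalist, full_year)

# Step 3: stack the data frames of each group and reset the row names.
out_steps <- lapply(groups, function(x) {
  combined <- do.call(rbind, x)
  row.names(combined) <- NULL
  combined
})
```

Note that this approach groups by the list names, not by the `year` column, so a misnamed list element would end up in the wrong group.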

Answer 3

Score: 3

I am not sure how "short" you want it, but below might be an option:

> split(`row.names<-`(do.call(rbind, mydatalist), NULL), ~year)
$`2012`
  Age Sex Weight year
1  12   F     70 2012
2  13   H     75 2012
7  18   F     70 2012
8  19   H     75 2012

$`2013`
   Age Sex Weight year
3   14   F     70 2013
4   15   H     75 2013
9   20   H     70 2013
10  13   H     75 2013

$`2014`
   Age Sex Weight year
5   16   F     70 2014
6   17   H     75 2014
11  22   F     70 2014
12  13   F     75 2014
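
The formula interface `~year` relies on a fairly recent version of R (4.1.0 or later, as far as I know). Below is a sketch of the same idea that passes the `year` column to `split` directly. The names `combined` and `by_year` are made up for this illustration, and the data is a reduced copy of the question's.

```r
# Reduced example data taken from the question.
mydatalist <- list(
  `12`   = data.frame(Age = c(12, 13), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `13`   = data.frame(Age = c(14, 15), Sex = c("F", "H"), Weight = c(70, 75), year = 2013),
  `2012` = data.frame(Age = c(18, 19), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `2013` = data.frame(Age = c(20, 13), Sex = c("H", "H"), Weight = c(70, 75), year = 2013)
)

# Stack all data frames; unname() drops the list names so they do not leak into the row names.
combined <- do.call(rbind, unname(mydatalist))

# Split on the year column itself instead of using a formula.
by_year <- split(combined, combined$year)
```

Because the grouping uses the `year` column rather than the list names, it still works when a list name does not match the data it holds.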


