I am looking for a shorter function to group similar datasets from a list
Question
I have several data frames scattered in a list, and I would like to group the data frames that belong to the same year:
mydata12 <- data.frame(Age = c(12, 13), Sex = c("F", "H"), Weight = c(70, 75), year = 2012)
mydata13 <- data.frame(Age = c(14, 15), Sex = c("F", "H"), Weight = c(70, 75), year = 2013)
mydata14 <- data.frame(Age = c(16, 17), Sex = c("F", "H"), Weight = c(70, 75), year = 2014)
mydata2012 <- data.frame(Age = c(18, 19), Sex = c("F", "H"), Weight = c(70, 75), year = 2012)
mydata2013 <- data.frame(Age = c(20, 13), Sex = c("H", "H"), Weight = c(70, 75), year = 2013)
mydata2014 <- data.frame(Age = c(22, 13), Sex = c("F", "F"), Weight = c(70, 75), year = 2014)
# Each year appears twice: once under a two-digit name, once under a four-digit name
mydatalist <- list(
  `12` = mydata12,
  `13` = mydata13,
  `14` = mydata14,
  `2012` = mydata2012,
  `2013` = mydata2013,
  `2014` = mydata2014
)
I can do it with this code:
list(`2012` = rbind(mydatalist$`12`, mydatalist$`2012`),
     `2013` = rbind(mydatalist$`13`, mydatalist$`2013`),
     `2014` = rbind(mydatalist$`14`, mydatalist$`2014`))
but I would like to make it shorter (without a line of code for each year), since the list names already follow the patterns 2012:2021 and 12:21.
Answer 1
Score: 3
We could use bind_rows with split:
library(dplyr)
# Stack all data frames into one, then split the result by year
bind_rows(mydatalist) %>%
  split(f = as.factor(.$year))
$`2012`
Age Sex Weight year
1 12 F 70 2012
2 13 H 75 2012
7 18 F 70 2012
8 19 H 75 2012
$`2013`
Age Sex Weight year
3 14 F 70 2013
4 15 H 75 2013
9 20 H 70 2013
10 13 H 75 2013
$`2014`
Age Sex Weight year
5 16 F 70 2014
6 17 H 75 2014
11 22 F 70 2014
12 13 F 75 2014
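For comparison, here is a minimal sketch of the same idea with dplyr's group_split in place of split. It assumes the example list from the question, rebuilt here with list names that match the year column. group_split does not name its result, so the names are added by hand from the sorted years; it is worth checking the dplyr documentation for the current lifecycle status of group_split before relying on it.

```r
library(dplyr)

# Example data rebuilt from the question (an assumption: names match the year column)
mydatalist <- list(
  `12`   = data.frame(Age = c(12, 13), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `13`   = data.frame(Age = c(14, 15), Sex = c("F", "H"), Weight = c(70, 75), year = 2013),
  `14`   = data.frame(Age = c(16, 17), Sex = c("F", "H"), Weight = c(70, 75), year = 2014),
  `2012` = data.frame(Age = c(18, 19), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `2013` = data.frame(Age = c(20, 13), Sex = c("H", "H"), Weight = c(70, 75), year = 2013),
  `2014` = data.frame(Age = c(22, 13), Sex = c("F", "F"), Weight = c(70, 75), year = 2014)
)

# Stack everything, split into one data frame per year (groups come back sorted by year)
combined <- bind_rows(mydatalist)
by_year <- combined %>%
  group_split(year) %>%
  setNames(as.character(sort(unique(combined$year))))

names(by_year)
```

Unlike the split version, this relies only on the year column, so the list names play no role in the grouping.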
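Answer 2
Score: 3
In base R, either take a substring of the list names (the last two digits) or add 20 as a prefix to the names that have only two characters, then split the list and rbind each group:
# "12" becomes "2012"; four-digit names are left unchanged
# \(x) is the lambda shorthand available in R >= 4.1
out <- lapply(
  split(mydatalist, sub("^(\\d{2})$", "20\\1", names(mydatalist))),
  \(x) `row.names<-`(do.call(rbind, x), NULL)
)
Output: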
> out
$`2012`
Age Sex Weight year
1 12 F 70 2012
2 13 H 75 2012
3 18 F 70 2012
4 19 H 75 2012
$`2013`
Age Sex Weight year
1 14 F 70 2013
2 15 H 75 2013
3 20 H 70 2013
4 13 H 75 2013
$`2014`
Age Sex Weight year
1 16 F 70 2014
2 17 H 75 2014
3 22 F 70 2014
4 13 F 75 2014
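The code above shows the prefix variant. The other option mentioned, keying on the last two digits of each name, might look like the sketch below. It assumes the example list from the question (rebuilt here with names that match the year column) and that every name refers to a year in the 2000s; the variable names nm and key are illustrative only.

```r
# Example data rebuilt from the question (an assumption: names match the year column)
mydatalist <- list(
  `12`   = data.frame(Age = c(12, 13), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `13`   = data.frame(Age = c(14, 15), Sex = c("F", "H"), Weight = c(70, 75), year = 2013),
  `14`   = data.frame(Age = c(16, 17), Sex = c("F", "H"), Weight = c(70, 75), year = 2014),
  `2012` = data.frame(Age = c(18, 19), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `2013` = data.frame(Age = c(20, 13), Sex = c("H", "H"), Weight = c(70, 75), year = 2013),
  `2014` = data.frame(Age = c(22, 13), Sex = c("F", "F"), Weight = c(70, 75), year = 2014)
)

nm <- names(mydatalist)
# Keep the last two characters of each name, then rebuild a four-digit key
# (assumes all years fall in 2000-2099)
key <- paste0("20", substring(nm, nchar(nm) - 1))

out <- lapply(split(mydatalist, key), function(x) {
  res <- do.call(rbind, x)
  rownames(res) <- NULL  # drop the "12.1", "2012.1", ... row names created by rbind
  res
})

names(out)
```

Both variants group purely by list name, so they only give the right answer when the names agree with the year column.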
# Answer 3
**Score**: 3
I am not sure how "short" you are happy with, but below might be an option:
> split(`row.names<-`(do.call(rbind, mydatalist), NULL), ~year)
$`2012`
Age Sex Weight year
1 12 F 70 2012
2 13 H 75 2012
7 18 F 70 2012
8 19 H 75 2012
$`2013`
Age Sex Weight year
3 14 F 70 2013
4 15 H 75 2013
9 20 H 70 2013
10 13 H 75 2013
$`2014`
Age Sex Weight year
5 16 F 70 2014
6 17 H 75 2014
11 22 F 70 2014
12 13 F 75 2014
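The question also hints at driving the grouping from the 2012:2021 and 12:21 naming patterns directly. A rough sketch of that idea follows, assuming the example list from the question (rebuilt here with names that match the year column) and that every year has both a two-digit and a four-digit entry; the years vector is shortened to the three years present in the example.

```r
# Example data rebuilt from the question (an assumption: names match the year column)
mydatalist <- list(
  `12`   = data.frame(Age = c(12, 13), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `13`   = data.frame(Age = c(14, 15), Sex = c("F", "H"), Weight = c(70, 75), year = 2013),
  `14`   = data.frame(Age = c(16, 17), Sex = c("F", "H"), Weight = c(70, 75), year = 2014),
  `2012` = data.frame(Age = c(18, 19), Sex = c("F", "H"), Weight = c(70, 75), year = 2012),
  `2013` = data.frame(Age = c(20, 13), Sex = c("H", "H"), Weight = c(70, 75), year = 2013),
  `2014` = data.frame(Age = c(22, 13), Sex = c("F", "F"), Weight = c(70, 75), year = 2014)
)

years <- 2012:2014  # extend to 2012:2021 once the list holds those years

# For each year, pick the two-digit and four-digit entries by name and stack them.
# y %% 100 turns 2012 into 12; this assumes two-digit names without a leading zero.
out <- lapply(setNames(years, years), function(y) {
  parts <- mydatalist[as.character(c(y %% 100, y))]
  do.call(rbind, unname(parts))
})

names(out)
```

This keeps one expression for all years, at the cost of hard-coding the naming scheme.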