R: Merge information from 2 columns together
Question
I have created an example data frame with 3 groups and 2 columns per group.
Group_1 shows the total number of participants, Group_1_Pos shows how many of those participants are positive, and so on:
df1 <- structure(list(Date = c("2016", "2017", "2018", "2019"),
                      Group_1 = c("100", "200", "300", "400"),
                      Group_1_Pos = c("10", "20", "30", "40"),
                      Group_2 = c("500", "600", "700", "800"),
                      Group_2_Pos = c("50", "60", "70", "80"),
                      Group_3 = c("900", "1000", "1100", "1200"),
                      Group_3_Pos = c("90", "100", "110", "120")),
                 class = "data.frame", row.names = c("1", "2", "3", "4"))
> df1
  Date Group_1 Group_1_Pos Group_2 Group_2_Pos Group_3 Group_3_Pos
1 2016     100          10     500          50     900          90
2 2017     200          20     600          60    1000         100
3 2018     300          30     700          70    1100         110
4 2019     400          40     800          80    1200         120
I would like to combine the total participant columns with the positive participant columns in a way that keeps both values separated by brackets. For example:
  Date  Group_1  Group_2    Group_3
1 2016 100 (10) 500 (50)   900 (90)
2 2017 200 (20) 600 (60) 1000 (100)
3 2018 300 (30) 700 (70) 1100 (110)
4 2019 400 (40) 800 (80) 1200 (120)
So in this example I add the positive participants in () brackets next to the total participants and only keep 3 columns for the 3 groups.
Any help would be appreciated.
Answer 1
Score: 1
Using dplyr, you could go for something like:
library(dplyr)
df1 %>%
  # overwrite each total column with "total (positive)"
  mutate(Group_1 = paste0(Group_1, " (", Group_1_Pos, ")"),
         Group_2 = paste0(Group_2, " (", Group_2_Pos, ")"),
         Group_3 = paste0(Group_3, " (", Group_3_Pos, ")")) %>%
  # drop the *_Pos columns now that their values are merged in
  select(-contains("Pos"))
#   Date  Group_1  Group_2    Group_3
# 1 2016 100 (10) 500 (50)   900 (90)
# 2 2017 200 (20) 600 (60) 1000 (100)
# 3 2018 300 (30) 700 (70) 1100 (110)
# 4 2019 400 (40) 800 (80) 1200 (120)
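The call above names each group by hand, so it grows with the number of groups. As a minimal sketch, assuming every total column is named `Group_<n>` and its partner is named `Group_<n>_Pos`, the same pasting can be driven by the column names in base R. The helper name `combine_pos` and its `pattern` and `suffix` arguments are hypothetical; they are not part of dplyr or of the answer.

```r
# Sample data from the question
df1 <- data.frame(
  Date        = c("2016", "2017", "2018", "2019"),
  Group_1     = c("100", "200", "300", "400"),
  Group_1_Pos = c("10", "20", "30", "40"),
  Group_2     = c("500", "600", "700", "800"),
  Group_2_Pos = c("50", "60", "70", "80"),
  Group_3     = c("900", "1000", "1100", "1200"),
  Group_3_Pos = c("90", "100", "110", "120"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pairs each total column with "<name><suffix>" by name.
# Assumes every column matching `pattern` has such a partner column.
combine_pos <- function(data, pattern = "^Group_\\d+$", suffix = "_Pos") {
  totals <- grep(pattern, names(data), value = TRUE)
  for (col in totals) {
    data[[col]] <- paste0(data[[col]], " (", data[[paste0(col, suffix)]], ")")
  }
  # Keep everything except the partner columns, which are now merged in
  data[setdiff(names(data), paste0(totals, suffix))]
}

res <- combine_pos(df1)
res
```

Because the pairing is done by name rather than by writing out each group, adding a `Group_4` and `Group_4_Pos` pair should need no change to the call.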
Answer 2
Score: 1
A purrr/dplyr/stringr approach:
library(dplyr)
# every second column, starting at Date: Date plus the three *_Pos columns
other_values <- df1[, seq(1, ncol(df1), 2)]
df1 %>%
  select(-contains("Pos")) %>%
  purrr::map2_df(., other_values,
                 function(x, y) paste0(x, " (", y, ")")) %>%
  # Date was pasted with itself ("2016 (2016)"); strip everything from the first space
  mutate(Date = stringr::str_remove_all(Date, "\\s.*"))
# A tibble: 4 x 4
  Date  Group_1  Group_2  Group_3
  <chr> <chr>    <chr>    <chr>
1 2016  100 (10) 500 (50) 900 (90)
2 2017  200 (20) 600 (60) 1000 (100)
3 2018  300 (30) 700 (70) 1100 (110)
4 2019  400 (40) 800 (80) 1200 (120)
Answer 3
Score: 1
Here is a base R way of doing what the question asks for.
Get the columns to be pasted with a regex and grep, then loop through the index vector and paste each column together with the one that follows it. Finally, cbind the first column and this result.
inx <- grep("\\d$", names(df1))
tmp <- sapply(inx, function(i) paste(df1[[i]], paste0("(", df1[[i + 1]], ")")))
res <- cbind(df1[1], tmp)
names(res)[-1] <- names(df1)[inx]
res
#   Date  Group_1  Group_2    Group_3
# 1 2016 100 (10) 500 (50)   900 (90)
# 2 2017 200 (20) 600 (60) 1000 (100)
# 3 2018 300 (30) 700 (70) 1100 (110)
# 4 2019 400 (40) 800 (80) 1200 (120)
Final clean up.
rm(inx, tmp)
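This approach pairs columns by position: column `i` is pasted with column `i + 1`, so it relies on each `_Pos` column sitting directly after its total. A minimal sketch of a guard that could run before the `sapply` call, assuming the `Group_<n>` and `Group_<n>_Pos` naming from the question (the variable name `partner_ok` is made up for illustration):

```r
# Sample data from the question
df1 <- data.frame(
  Date        = c("2016", "2017", "2018", "2019"),
  Group_1     = c("100", "200", "300", "400"),
  Group_1_Pos = c("10", "20", "30", "40"),
  Group_2     = c("500", "600", "700", "800"),
  Group_2_Pos = c("50", "60", "70", "80"),
  Group_3     = c("900", "1000", "1100", "1200"),
  Group_3_Pos = c("90", "100", "110", "120"),
  stringsAsFactors = FALSE
)

# Positions of the total columns (names ending in a digit)
inx <- grep("\\d$", names(df1))

# For each total column, check that the next column is its "<name>_Pos" partner
partner_ok <- names(df1)[inx + 1] == paste0(names(df1)[inx], "_Pos")

# Stops with an error if any pair is out of order or missing
stopifnot(all(partner_ok))
```

If the columns were reordered, the check would fail instead of silently pasting unrelated columns together.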
Answer 4
Score: 1
Given 3 groups, here is a base R solution that gives the desired values (note that the resulting columns are named Group1, Group2, Group3, without the underscore):
n <- 3
# for each group k, paste Group_k with Group_k_Pos; x holds the name "Group_k"
dfout <- cbind(df1[1],
               `colnames<-`(sapply(seq(n), function(k) paste0(df1[[x <- paste0("Group_", k)]], " (", df1[[paste0(x, "_Pos")]], ")")),
                            paste0("Group", seq(n))))
such that
> dfout
  Date   Group1   Group2     Group3
1 2016 100 (10) 500 (50)   900 (90)
2 2017 200 (20) 600 (60) 1000 (100)
3 2018 300 (30) 700 (70) 1100 (110)
4 2019 400 (40) 800 (80) 1200 (120)
Answer 5
Score: 0
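Here is a more general tidyverse solution. It reshapes the data to long form, builds the combined string, then reshapes back to wide form:
library(tidyverse)
df1 %>%
  # rename Group_<n>_Pos to Pos_<n> so every name splits as <value>_<set>
  rename_at(
    vars(contains("Pos")),
    ~ str_remove(., "_Pos") %>% str_remove("Group_") %>% str_c("Pos", ., sep = "_")
  ) %>%
  pivot_longer(Group_1:Pos_3,
               names_to = c(".value", "set"),
               names_sep = "_") %>%
  mutate(Pos = str_c("(", Pos, ")")) %>%
  # sep = " " keeps a space between the total and the bracket: "100 (10)"
  unite("result", Group:Pos, sep = " ") %>%
  # names_prefix restores the Group_<n> column names
  pivot_wider(names_from = set, values_from = result, names_prefix = "Group_")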