R: Merge information from 2 columns together

Question

I have created an example dataframe which has 3 different groups with 2 columns for each group.

Group_1 shows the total number of participants, Group_1_Pos shows how many of those participants are positive, and so on:

    df1 <- structure(list(Date = c("2016", "2017", "2018", "2019"),
                          Group_1 = c("100", "200", "300", "400"),
                          Group_1_Pos = c("10", "20", "30", "40"),
                          Group_2 = c("500", "600", "700", "800"),
                          Group_2_Pos = c("50", "60", "70", "80"),
                          Group_3 = c("900", "1000", "1100", "1200"),
                          Group_3_Pos = c("90", "100", "110", "120")),
                     class = "data.frame", row.names = c("1", "2", "3", "4"))

    > df1
      Date Group_1 Group_1_Pos Group_2 Group_2_Pos Group_3 Group_3_Pos
    1 2016     100          10     500          50     900          90
    2 2017     200          20     600          60    1000         100
    3 2018     300          30     700          70    1100         110
    4 2019     400          40     800          80    1200         120

I would like to combine each total-participant column with its positive-participant column so that both values are kept, with the positive count separated in brackets. As an example:

      Date  Group_1  Group_2    Group_3
    1 2016 100 (10) 500 (50)   900 (90)
    2 2017 200 (20) 600 (60) 1000 (100)
    3 2018 300 (30) 700 (70) 1100 (110)
    4 2019 400 (40) 800 (80) 1200 (120)

So in this example I add the positive participants in () brackets next to the total participants and only keep 3 columns for the 3 groups.

Any help would be appreciated.

Answer 1

Score: 1

Using dplyr you could go for something like:

    library(dplyr)

    df1 %>%
      mutate(Group_1 = paste0(Group_1, " (", Group_1_Pos, ")"),
             Group_2 = paste0(Group_2, " (", Group_2_Pos, ")"),
             Group_3 = paste0(Group_3, " (", Group_3_Pos, ")")) %>%
      select(-contains("Pos"))
    #   Date  Group_1  Group_2    Group_3
    # 1 2016 100 (10) 500 (50)   900 (90)
    # 2 2017 200 (20) 600 (60) 1000 (100)
    # 3 2018 300 (30) 700 (70) 1100 (110)
    # 4 2019 400 (40) 800 (80) 1200 (120)
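
The three paste0() lines above repeat one pattern per group. With many groups, one way to avoid the repetition might be across(), sketched below rather than offered as a tested drop-in. It assumes dplyr 1.0.0 or later and that every total column named Group_<n> has a partner named Group_<n>_Pos; the df1 definition is repeated from the question so the block runs on its own.

```r
library(dplyr)

# Example data repeated from the question so the block is self-contained.
df1 <- data.frame(
  Date        = c("2016", "2017", "2018", "2019"),
  Group_1     = c("100", "200", "300", "400"),
  Group_1_Pos = c("10", "20", "30", "40"),
  Group_2     = c("500", "600", "700", "800"),
  Group_2_Pos = c("50", "60", "70", "80"),
  Group_3     = c("900", "1000", "1100", "1200"),
  Group_3_Pos = c("90", "100", "110", "120"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  # For each total column, fetch its "<name>_Pos" partner by name and paste.
  mutate(across(
    matches("^Group_[0-9]+$"),
    ~ paste0(.x, " (", get(paste0(cur_column(), "_Pos")), ")")
  )) %>%
  # Drop the partner columns once they have been merged in.
  select(-ends_with("_Pos"))

res
```

Because the partner is looked up by name through cur_column(), adding a Group_4 / Group_4_Pos pair should need no change to the code.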

Answer 2

Score: 1

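A purrr-dplyr-stringr approach:

    library(dplyr)
    # Columns 1, 3, 5, 7: Date plus the three *_Pos columns
    other_values <- df1[, seq(1, ncol(df1), 2)]
    df1 %>%
      select(-contains("Pos")) %>%
      # Paste each remaining column with its partner; Date is pasted with itself
      purrr::map2_df(., other_values, function(x, y) paste0(x, " (", y, ")")) %>%
      # Strip the " (2016)" part that the pairing appended to Date
      mutate(Date = stringr::str_remove_all(Date, "\\s.*"))

    # A tibble: 4 x 4
      Date  Group_1  Group_2  Group_3
      <chr> <chr>    <chr>    <chr>
    1 2016  100 (10) 500 (50) 900 (90)
    2 2017  200 (20) 600 (60) 1000 (100)
    3 2018  300 (30) 700 (70) 1100 (110)
    4 2019  400 (40) 800 (80) 1200 (120)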

Answer 3

Score: 1

Here is a base R way of doing what the question asks for.
Get the columns to be pasted with a regex and grep, then loop through the indices vector and paste them together. Finally, cbind the first column and this result.

    inx <- grep("\\d$", names(df1))
    tmp <- sapply(inx, function(i) paste(df1[[i]], paste0("(", df1[[i + 1]], ")")))
    res <- cbind(df1[1], tmp)
    names(res)[-1] <- names(df1)[inx]
    res
    #   Date  Group_1  Group_2    Group_3
    #1 2016 100 (10) 500 (50)   900 (90)
    #2 2017 200 (20) 600 (60) 1000 (100)
    #3 2018 300 (30) 700 (70) 1100 (110)
    #4 2019 400 (40) 800 (80) 1200 (120)

Final clean up.

    rm(inx, tmp)
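
The positional lookup `i + 1` above works because each _Pos column sits directly after its total column. If that ordering cannot be guaranteed, the pairing could be done by name instead. A minimal base R sketch, assuming the suffix is always `_Pos`; `merge_pos` is a hypothetical helper name, not part of the original answer, and df1 is repeated from the question so the block is self-contained.

```r
# Example data repeated from the question.
df1 <- data.frame(
  Date        = c("2016", "2017", "2018", "2019"),
  Group_1     = c("100", "200", "300", "400"),
  Group_1_Pos = c("10", "20", "30", "40"),
  Group_2     = c("500", "600", "700", "800"),
  Group_2_Pos = c("50", "60", "70", "80"),
  Group_3     = c("900", "1000", "1100", "1200"),
  Group_3_Pos = c("90", "100", "110", "120"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pair each "<name><suffix>" column with "<name>" by name.
merge_pos <- function(df, suffix = "_Pos") {
  pos_cols   <- grep(paste0(suffix, "$"), names(df), value = TRUE)
  total_cols <- sub(paste0(suffix, "$"), "", pos_cols)
  # Start from every column that is not a suffix column.
  out <- df[setdiff(names(df), pos_cols)]
  for (i in seq_along(pos_cols)) {
    # Overwrite the total column with "total (positive)".
    out[[total_cols[i]]] <- paste0(df[[total_cols[i]]], " (", df[[pos_cols[i]]], ")")
  }
  out
}

res <- merge_pos(df1)
res
```

Since the match is by name, calling `merge_pos()` on a copy of df1 with shuffled columns should pair the same values, only in a different column order.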

Answer 4

Score: 1

Given 3 groups, here is a base R solution that can give you the desired output:

    n <- 3
    # Build one "total (positive)" column per group, then bind it to Date
    cols <- sapply(seq(n), function(k) {
      x <- paste0("Group_", k)
      paste0(df1[[x]], " (", df1[[paste0(x, "_Pos")]], ")")
    })
    dfout <- cbind(df1[1], `colnames<-`(cols, paste0("Group", seq(n))))

such that

    > dfout
      Date   Group1   Group2     Group3
    1 2016 100 (10) 500 (50)   900 (90)
    2 2017 200 (20) 600 (60) 1000 (100)
    3 2018 300 (30) 700 (70) 1100 (110)
    4 2019 400 (40) 800 (80) 1200 (120)

Answer 5

Score: 0

Here is a more general tidyverse solution:

    library(tidyverse)

    df1 %>%
      # Rename Group_<n>_Pos to Pos_<n> so every name splits as <value>_<set>
      rename_at(
        vars(contains("Pos")),
        ~ str_remove(., "_Pos") %>% str_remove("Group_") %>% str_c("Pos", ., sep = "_")
      ) %>%
      pivot_longer(Group_1:Pos_3,
                   names_to = c(".value", "set"),
                   names_sep = "_") %>%
      mutate(Pos = str_c("(", Pos, ")")) %>%
      # sep = " " gives "100 (10)", the format requested in the question
      unite("result", Group:Pos, sep = " ") %>%
      pivot_wider(names_from = set, values_from = result, names_prefix = "Group_")
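
To see what the reshaping step above produces before the values are glued together, the intermediate long table can be inspected on its own. This is a sketch assuming tidyr 1.0.0 or later. The renaming is done here with base `sub()` purely for brevity, which is a substitution for the `rename_at()` call in the answer, and df1 is repeated from the question so the block is self-contained.

```r
library(tidyr)

# Example data repeated from the question.
df1 <- data.frame(
  Date        = c("2016", "2017", "2018", "2019"),
  Group_1     = c("100", "200", "300", "400"),
  Group_1_Pos = c("10", "20", "30", "40"),
  Group_2     = c("500", "600", "700", "800"),
  Group_2_Pos = c("50", "60", "70", "80"),
  Group_3     = c("900", "1000", "1100", "1200"),
  Group_3_Pos = c("90", "100", "110", "120"),
  stringsAsFactors = FALSE
)

# Rename Group_<n>_Pos to Pos_<n> so every name has the form <value>_<set>.
renamed <- df1
names(renamed) <- sub("^Group_([0-9]+)_Pos$", "Pos_\\1", names(renamed))

# ".value" turns the first name part (Group or Pos) into output columns,
# while the second part (1, 2, 3) goes into the "set" column.
long <- pivot_longer(renamed, -Date,
                     names_to = c(".value", "set"),
                     names_sep = "_")
long
```

The result has one row per Date and set, with the total in Group and the positive count in Pos, which is the shape that the later `unite()` and `pivot_wider()` steps operate on.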

huangapple
  • Posted on January 3, 2020 at 19:58:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59578238.html