R: Merge information from 2 columns together

Question

I have created an example dataframe which has 3 different groups with 2 columns for each group.

Group_1 shows the total number of participants and Group_1_Pos shows how many of those participants are positive, and so on for the other groups:

df1 <- structure(list(Date = c("2016", "2017", "2018", "2019"), 
                       Group_1 = c("100", "200", "300", "400"), 
                       Group_1_Pos = c("10", "20", "30", "40"),
                       Group_2 = c("500", "600", "700", "800"),
                       Group_2_Pos = c("50", "60", "70", "80"), 
                       Group_3 = c("900", "1000", "1100", "1200"),
                       Group_3_Pos = c("90", "100", "110", "120")), 
                  class = "data.frame", row.names=c("1", "2", "3", "4"))
> df1
  Date Group_1 Group_1_Pos Group_2 Group_2_Pos Group_3 Group_3_Pos
1 2016     100          10     500          50     900          90
2 2017     200          20     600          60    1000         100
3 2018     300          30     700          70    1100         110
4 2019     400          40     800          80    1200         120

I would like to combine each total participant column with its positive participant column in a way that keeps both values visible, with the positive count in parentheses. As an example:

  Date      Group_1    Group_2     Group_3 
1 2016     100 (10)    500 (50)    900 (90)          
2 2017     200 (20)    600 (60)  1000 (100)        
3 2018     300 (30)    700 (70)  1100 (110)       
4 2019     400 (40)    800 (80)  1200 (120)        

So in this example I add the positive participants in () brackets next to the total participants and only keep 3 columns for the 3 groups.

Any help would be appreciated.

Answer 1

Score: 1

Using dplyr you could go for something like:

library(dplyr)

df1 %>%
  mutate(Group_1 = paste0(Group_1, " (", Group_1_Pos, ")"),
         Group_2 = paste0(Group_2, " (", Group_2_Pos, ")"),
         Group_3 = paste0(Group_3, " (", Group_3_Pos, ")")) %>%
  select(-contains("Pos"))

#   Date  Group_1  Group_2    Group_3
# 1 2016 100 (10) 500 (50)   900 (90)
# 2 2017 200 (20) 600 (60) 1000 (100)
# 3 2018 300 (30) 700 (70) 1100 (110)
# 4 2019 400 (40) 800 (80) 1200 (120)
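
The version above names each group explicitly. With many groups, a possible variant is to let dplyr find the pairs by name. This is only a sketch: it assumes dplyr 1.0.0 or later (for across() and cur_column()) and that every Group_k column has a matching Group_k_Pos column. The result name res is just for illustration.

```r
library(dplyr)

# Same example data as in the question (all columns are character)
df1 <- data.frame(
  Date        = c("2016", "2017", "2018", "2019"),
  Group_1     = c("100", "200", "300", "400"),
  Group_1_Pos = c("10", "20", "30", "40"),
  Group_2     = c("500", "600", "700", "800"),
  Group_2_Pos = c("50", "60", "70", "80"),
  Group_3     = c("900", "1000", "1100", "1200"),
  Group_3_Pos = c("90", "100", "110", "120"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  # For every column named exactly Group_<digits>, look up its *_Pos partner
  mutate(across(
    matches("^Group_\\d+$"),
    ~ paste0(.x, " (", .data[[paste0(cur_column(), "_Pos")]], ")")
  )) %>%
  # Drop the *_Pos columns once their values have been merged in
  select(-ends_with("_Pos"))

res
```

The regular expression is anchored at both ends so that the _Pos columns themselves are not modified before they are dropped.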

Answer 2

Score: 1

A purrr-dplyr-stringr approach:

library(dplyr)

# Columns 1, 3, 5, 7: Date plus the three *_Pos columns
other_values <- df1[, seq(1, ncol(df1), 2)]

df1 %>%
  select(-contains("Pos")) %>%
  # Pair each remaining column with its counterpart; Date is paired with itself
  purrr::map2_df(., other_values, function(x, y) paste0(x, " (", y, ")")) %>%
  # Date is now "2016 (2016)"; strip everything from the first space onwards
  mutate(Date = stringr::str_remove_all(Date, "\\s.*"))

# A tibble: 4 x 4
#   Date  Group_1  Group_2  Group_3
#   <chr> <chr>    <chr>    <chr>
# 1 2016  100 (10) 500 (50) 900 (90)
# 2 2017  200 (20) 600 (60) 1000 (100)
# 3 2018  300 (30) 700 (70) 1100 (110)
# 4 2019  400 (40) 800 (80) 1200 (120)



Answer 3

Score: 1

Here is a base R way of doing what the question asks for.
Get the columns to be pasted with a regex and grep, then loop through the indices vector and paste them together. Finally, cbind the first column and this result.

inx <- grep("\\d$", names(df1))
tmp <- sapply(inx, function(i) paste(df1[[i]], paste0("(", df1[[i + 1]], ")")))
res <- cbind(df1[1], tmp)
names(res)[-1] <- names(df1)[inx]

res
#  Date  Group_1  Group_2    Group_3
#1 2016 100 (10) 500 (50)   900 (90)
#2 2017 200 (20) 600 (60) 1000 (100)
#3 2018 300 (30) 700 (70) 1100 (110)
#4 2019 400 (40) 800 (80) 1200 (120)

Final clean up.

rm(inx, tmp)
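
The code above pairs each total column with whichever column sits immediately to its right (i + 1). If the column order cannot be guaranteed, one option is to pair the columns by name instead. The following is a minimal base R sketch; the helper name combine_pos and its suffix argument are hypothetical and not part of the original answer.

```r
# Same example data as in the question (all columns are character)
df1 <- data.frame(
  Date        = c("2016", "2017", "2018", "2019"),
  Group_1     = c("100", "200", "300", "400"),
  Group_1_Pos = c("10", "20", "30", "40"),
  Group_2     = c("500", "600", "700", "800"),
  Group_2_Pos = c("50", "60", "70", "80"),
  Group_3     = c("900", "1000", "1100", "1200"),
  Group_3_Pos = c("90", "100", "110", "120"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: merges every "<name><suffix>" column into "<name>"
combine_pos <- function(df, suffix = "_Pos") {
  pos_cols  <- grep(paste0(suffix, "$"), names(df), value = TRUE)
  base_cols <- sub(paste0(suffix, "$"), "", pos_cols)
  # Fail early if some *_Pos column has no matching total column
  stopifnot(all(base_cols %in% names(df)))

  # Keep everything except the *_Pos columns, in the original order
  out <- df[setdiff(names(df), pos_cols)]
  # Overwrite each total column with "total (pos)"
  out[base_cols] <- Map(
    function(total, pos) paste0(total, " (", pos, ")"),
    df[base_cols], df[pos_cols]
  )
  out
}

res <- combine_pos(df1)
res
```

Because the pairing is done by name, the helper gives the same result even if the _Pos columns are moved to the end of the data frame.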

Answer 4

Score: 1

Given 3 groups, here is a base R solution that can give you the desired output:

n <- 3
# For each group k, paste Group_k and Group_k_Pos together, then name the columns
dfout <- cbind(df1[1],
               `colnames<-`(sapply(seq(n), function(k) paste0(df1[[x <- paste0("Group_", k)]], " (", df1[[paste0(x, "_Pos")]], ")")),
                            paste0("Group", seq(n))))

such that

> dfout
  Date   Group1   Group2     Group3
1 2016 100 (10) 500 (50)   900 (90)
2 2017 200 (20) 600 (60) 1000 (100)
3 2018 300 (30) 700 (70) 1100 (110)
4 2019 400 (40) 800 (80) 1200 (120)

Answer 5

Score: 0

Here is a more general tidyverse solution:

library(tidyverse)
df1 %>%
  # Rename Group_k_Pos to Pos_k so every name has the form <kind>_<set>
  rename_at(
    vars(contains("Pos")),
    ~ str_remove(., "_Pos") %>% str_remove("Group_") %>% str_c("Pos", ., sep = "_")
  ) %>%
  pivot_longer(Group_1:Pos_3,
               names_to = c(".value", "set"),
               names_sep = "_") %>%
  mutate(Pos = Pos %>% str_c("(", ., ")")) %>%
  # A space separator gives "100 (10)" as in the requested output
  unite("result", Group:Pos, sep = " ") %>%
  # names_prefix restores the Group_ prefix on the widened columns
  pivot_wider(names_from = set, values_from = result, names_prefix = "Group_")
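
Since rename_at() is superseded in recent dplyr releases, the same reshape idea can also be sketched with rename_with() and the names_pattern argument of pivot_longer(). This is an illustrative variant, not the original answer: it assumes dplyr 1.0.0 or later and tidyr 1.0.0 or later, and the temporary _Total suffix and the names group, result and res are made up for the example.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question (all columns are character)
df1 <- data.frame(
  Date        = c("2016", "2017", "2018", "2019"),
  Group_1     = c("100", "200", "300", "400"),
  Group_1_Pos = c("10", "20", "30", "40"),
  Group_2     = c("500", "600", "700", "800"),
  Group_2_Pos = c("50", "60", "70", "80"),
  Group_3     = c("900", "1000", "1100", "1200"),
  Group_3_Pos = c("90", "100", "110", "120"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  # Give the total columns an explicit suffix so both kinds share one pattern
  rename_with(~ paste0(.x, "_Total"), matches("^Group_\\d+$")) %>%
  # One row per Date and group, with Total and Pos as separate columns
  pivot_longer(
    -Date,
    names_to = c("group", ".value"),
    names_pattern = "^(Group_\\d+)_(Total|Pos)$"
  ) %>%
  mutate(result = paste0(Total, " (", Pos, ")")) %>%
  select(Date, group, result) %>%
  # Back to one column per group
  pivot_wider(names_from = group, values_from = result)

res
```

Going through the long format is more work than a direct paste, but it scales to any number of groups without listing them.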

Posted by huangapple on January 3, 2020 at 19:58:13
Source: https://go.coder-hub.com/59578238.html