Sum of columns with similar column name

Question

I have multiple columns. Some of their names contain a certain string, say "ABC_DEF".

I need the row-wise sum of the columns whose names contain this string (I'm working with dplyr).

df <- data.frame("aaa" = 2:5, "bbb" = 1:4, "ABC_DEF" = 1:4, "DEF" = 2:5, "ABC_DEF_GHI" = 3:6, "aaa_ABC_DEF" = 2:5)

  aaa bbb ABC_DEF DEF ABC_DEF_GHI aaa_ABC_DEF
1   2   1       1   2           3           2
2   3   2       2   3           4           3
3   4   3       3   4           5           4
4   5   4       4   5           6           5

I tried something like this:

df %>%
  mutate(ABC_DEF = sum(select(c(contains("ABC_DEF")))))

With this I get the error: ! contains() must be used within a selecting function.

Desired output:

  aaa bbb ABC_DEF_G DEF ABC_DEF_GHI aaa_ABC_DEF ABC_DEF
1   2   1         1   2           3           2       6
2   3   2         2   3           4           3       9
3   4   3         3   4           5           4      12
4   5   4         4   5           6           5      15

Can anyone show me how to do this?

Answer 1

Score: 1

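You can use a combination of rowwise() and c_across() to do the job. The code below assumes the column names contain "ABC.DEF", which is what data.frame() produces from names with spaces such as "ABC DEF". contains() matches its argument literally, so for the underscore names in the question use contains("ABC_DEF") instead.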

library(dplyr)

df %>%
  rowwise() %>%
  mutate(ABC.DEF.1 = sum(c_across(contains("ABC.DEF")))) %>%
  ungroup()

# A tibble: 4 × 7
    aaa   bbb ABC.DEF   DEF ABC.DEF.GHI aaa.ABC.DEF ABC.DEF.1
  <int> <int>   <int> <int>       <int>       <int>     <int>
1     2     1       1     2           3           2         6
2     3     2       2     3           4           3         9
3     4     3       3     4           5           4        12
4     5     4       4     5           6           5        15
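
As a possible alternative to rowwise(), the same sum can be computed in one vectorized step with rowSums(), which also keeps every original column. This is only a sketch under a few assumptions: it uses the underscore column names from the question, the new column name ABC_DEF_SUM is made up here to avoid overwriting the existing ABC_DEF column, and it relies on across() being called without a function (in dplyr 1.1.0 and later, pick(contains("ABC_DEF")) is the recommended spelling of the same selection).

```r
library(dplyr)

# Data frame from the question (underscore column names)
df <- data.frame(aaa = 2:5, bbb = 1:4, ABC_DEF = 1:4, DEF = 2:5,
                 ABC_DEF_GHI = 3:6, aaa_ABC_DEF = 2:5)

# across() hands the matching columns to rowSums() as one data frame,
# so the sum is computed for all rows at once instead of row by row.
# ABC_DEF_SUM is a hypothetical name chosen for this sketch.
result <- df %>%
  mutate(ABC_DEF_SUM = rowSums(across(contains("ABC_DEF"))))

result$ABC_DEF_SUM
# 6 9 12 15
```

The columns ABC_DEF, ABC_DEF_GHI and aaa_ABC_DEF are summed; DEF is not, because its name does not contain the full string "ABC_DEF".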

Answer 2

Score: 1

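With dplyr and rowSums() (note that select() keeps only the matching columns, so the other columns are dropped from the result):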

library(dplyr)

# data.frame() converts the spaces in the names to dots,
# so "ABC DEF" becomes "ABC.DEF"
df <- data.frame("aaa" = 2:5, "bbb" = 1:4, "ABC DEF" = 1:4, "DEF" = 2:5, "ABC DEF GHI" = 3:6, "aaa ABC DEF" = 2:5)

df %>%
  select(contains('ABC.DEF')) %>%
  mutate(ABC.DEF.SUM = rowSums(across(everything())))

Output

  ABC.DEF ABC.DEF.GHI aaa.ABC.DEF ABC.DEF.SUM
1       1           3           2           6
2       2           4           3           9
3       3           5           4          12
4       4           6           5          15
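
The answers above match on "ABC.DEF" although the data frame was created with names containing spaces. The following base R sketch illustrates why that works, and what the pattern would need to be if the spaces were preserved. The object names df_dots, df_spaces and sums_spaces are made up for this illustration.

```r
# By default data.frame() passes column names through make.names(),
# which replaces characters that are invalid in syntactic names
# (such as spaces) with dots.
df_dots <- data.frame("ABC DEF" = 1:4, "aaa ABC DEF" = 2:5)
names(df_dots)
# "ABC.DEF" "aaa.ABC.DEF"

# With check.names = FALSE the original names are kept as they are.
df_spaces <- data.frame("ABC DEF" = 1:4, "aaa ABC DEF" = 2:5,
                        check.names = FALSE)
names(df_spaces)
# "ABC DEF" "aaa ABC DEF"

# In that case the pattern must contain the space. fixed = TRUE makes
# grepl() match literally, similar to how contains() treats its argument.
sums_spaces <- rowSums(df_spaces[grepl("ABC DEF", names(df_spaces), fixed = TRUE)])
unname(sums_spaces)
# 3 5 7 9
```

Whichever form the names take, the pattern passed to contains() or grepl() has to match the names as stored in the data frame, which can be checked with names(df).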

huangapple
  • Posted on February 6, 2023 at 15:19:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75358369.html