Add a column to a data frame by merging values from other columns in R

Question

I have a data frame with many columns, among which I have the following:

COLUMN_1   COLUMN_2   COLUMN_3
red        blue       green
none       blue       none
red        none       green
red        none       none
none       none       none

I'd like to add a fourth column called COLORS, a boolean that is 'True' when COLUMN_1 is 'red', COLUMN_2 is 'blue', or COLUMN_3 is 'green', and 'False' when COLUMN_1, COLUMN_2 and COLUMN_3 are all 'none'. Like this:

COLUMN_1   COLUMN_2   COLUMN_3   COLORS
red        blue       green      True
none       blue       none       True
red        none       green      True
red        none       none       True
none       none       none       False

A column without a boolean would also work:

COLUMN_1   COLUMN_2   COLUMN_3   COLORS
red        blue       green      Color
none       blue       none       Color
red        none       green      Color
red        none       none       Color
none       none       none       No color
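
The question does not show how the data frame is built. As a minimal sketch for reproducing the examples, the three columns from the tables above can be entered by hand; the name df is the one the answers use, and any further columns of the real data are left out here as an assumption.

```r
# Hypothetical reconstruction of the example data from the question.
# Only the three columns shown in the tables are included.
df <- data.frame(
  COLUMN_1 = c("red", "none", "red", "red", "none"),
  COLUMN_2 = c("blue", "blue", "none", "none", "none"),
  COLUMN_3 = c("green", "none", "green", "none", "none"),
  stringsAsFactors = FALSE
)

# Inspect the result: five rows, three character columns.
str(df)
```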

Answer 1

Score: 2

library(dplyr)

df %>%
  mutate(
    COLORS = case_when(
      if_all(contains("COLUMN"), ~ .x == "none") ~ "No color",
      TRUE ~ "Color"
    ))

# A tibble: 5 × 4
  COLUMN_1 COLUMN_2 COLUMN_3 COLORS
  <chr>    <chr>    <chr>    <chr>
1 red      blue     green    Color
2 none     blue     none     Color
3 red      none     green    Color
4 red      none     none     Color
5 none     none     none     No color

Answer 2

Score: 2

One option could be:

library(dplyr)

df %>%
  rowwise() %>%
  mutate(COLORS = any(across(everything()) == c("red", "blue", "green")))

  COLUMN_1 COLUMN_2 COLUMN_3 COLORS
  <chr>    <chr>    <chr>    <lgl>
1 red      blue     green    TRUE
2 none     blue     none     TRUE
3 red      none     green    TRUE
4 red      none     none     TRUE
5 none     none     none     FALSE

If a column holds a color other than the one expected for it, the result is FALSE:

  COLUMN_1 COLUMN_2 COLUMN_3 COLORS
  <chr>    <chr>    <chr>    <lgl>
1 red      blue     green    TRUE
2 none     blue     none     TRUE
3 red      none     green    TRUE
4 red      none     none     TRUE
5 blue     none     none     FALSE
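
Since the question mentions a data frame with many columns, it may be safer to name the three columns explicitly rather than select everything. The following is a base R sketch of the same per-column check, not the answerer's code; the lookup vector expected and the helper list hits are names introduced here for illustration.

```r
# Example data with an unexpected color ("blue") in COLUMN_1 of the last row.
df <- data.frame(
  COLUMN_1 = c("red", "none", "red", "red", "blue"),
  COLUMN_2 = c("blue", "blue", "none", "none", "none"),
  COLUMN_3 = c("green", "none", "green", "none", "none"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: the color each column is expected to hold.
expected <- c(COLUMN_1 = "red", COLUMN_2 = "blue", COLUMN_3 = "green")

# Compare each named column with its own expected color (one logical
# vector per column), then combine the vectors with a row-wise OR.
hits <- Map(function(column, color) column == color,
            df[names(expected)], expected)
df$COLORS <- Reduce(`|`, hits)

print(df)
```

Columns not listed in expected are ignored, so extra columns in the data frame do not affect the result.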

Answer 3

Score: 0

This is not the most efficient answer, but it is an alternative:

library(dplyr)

df2 <- df %>%
  mutate(across(c(COLUMN_1,COLUMN_2,COLUMN_3), ~ ifelse(.x=='none', 1, 0), .names = 'n_{.col}'),
         new=rowSums(across(c(n_COLUMN_1,n_COLUMN_2,n_COLUMN_3))),
         COLORS=ifelse(new==3,'no color', 'color')) %>%
  select(-contains('n_c'), -new)

df2

Created on 2023-06-08 with reprex v2.0.2

# A tibble: 5 × 4
  COLUMN_1 COLUMN_2 COLUMN_3 COLORS
  <chr>    <chr>    <chr>    <chr>
1 red      blue     green    color
2 none     blue     none     color
3 red      none     green    color
4 red      none     none     color
5 none     none     none     no color

Answer 4

Score: 0

We can try rowSums() together with col(), as below:

transform(
    df,
    COLORS = rowSums(df == c("red", "blue", "green")[col(df)]) > 0
)

which gives

  COLUMN_1 COLUMN_2 COLUMN_3 COLORS
1      red     blue    green   TRUE
2     none     blue     none   TRUE
3      red     none    green   TRUE
4      red     none     none   TRUE
5     none     none     none  FALSE

Answer 5

Score: 0

library(dplyr)

df |>
  mutate(COLORS = rowSums(pick(starts_with("COLUMN")) != "none") > 0)

Note: this returns TRUE for any value other than "none", so an unexpected color such as "yellow" also counts.
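
To make the difference concrete, here is a small base R sketch that puts the two rules side by side on a row containing "yellow". The sample rows and the names cols, expected, any_value and strict are assumptions made for this illustration.

```r
# Hypothetical data: the second row has an unexpected color in COLUMN_1.
df <- data.frame(
  COLUMN_1 = c("red", "yellow", "none"),
  COLUMN_2 = c("blue", "none", "none"),
  COLUMN_3 = c("green", "none", "none"),
  stringsAsFactors = FALSE
)
cols <- c("COLUMN_1", "COLUMN_2", "COLUMN_3")
expected <- c("red", "blue", "green")

# Rule 1: any value other than "none" counts as a color.
any_value <- unname(rowSums(df[cols] != "none") > 0)

# Rule 2: only the color expected for each column counts.
# col(m) gives each cell's column index, so expected[col(m)] lines up
# the expected color with every cell of the matrix.
m <- as.matrix(df[cols])
strict <- unname(rowSums(m == expected[col(m)]) > 0)

print(data.frame(df, any_value, strict))
```

The two rules agree on the first and last rows and differ only on the "yellow" row, where any_value is TRUE and strict is FALSE.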

huangapple
  • Posted on June 8, 2023, 22:12:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76432758.html