更改一系列列的值,如果列名符合另一列的值。

huangapple go评论65阅读模式
英文:

Mutate value of a range of columns if columns name meets another column value

问题

我有一个wide df,其中columns表示许多给定yearsmonths和每个month中颜色的变化:

df <- data.frame(id = as.integer(c(123,124,125,126)),
                 no_change = as.character(c("May.2010", NA, NA, "Sep.2010")),
                 `Jan.2010` = as.character(c("green", "black", "pink", "grey")),
                 `Feb.2010` = as.character(c("green", "black", "pink", "grey")),
                 `Mar.2010` = as.character(c("green", "red", "pink", "grey")),
                 `Apr.2010` = as.character(c("green", "red", "pink", "grey")),
                 `May.2010` = as.character(c("green", "red", "pink", "grey")),
                 `Jun.2010` = as.character(c("green", "red", "pink", "grey")),
                 `Jul.2010` = as.character(c("green", "white", "pink", "grey")),
                 `Ago.2010` = as.character(c("red", "white", "pink", "grey")),
                 `Sep.2010` = as.character(c("red", "white", "pink", "grey")),
                 `Oct.2010` = as.character(c("red", "white", "pink", "grey")),
                 `Nov.2010` = as.character(c("red", "white", "pink", "grey")),
                 `Dez.2010` = as.character(c("red", "white", "grey", "blue"))
                 )
df
   id no_change Jan.2010 Feb.2010 Mar.2010 Apr.2010 May.2010 Jun.2010 Jul.2010 Ago.2010 Sep.2010 Oct.2010 Nov.2010 Dez.2010
1 123  May.2010    green    green    green    green    green    green    green      red      red      red      red      red
2 124      <NA>    black    black      red      red      red      red    white    white    white    white    white    white
3 125      <NA>     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     grey
4 126  Sep.2010     grey     grey     grey     grey     grey     grey     grey     grey     grey     grey     grey     blue

我想要对每个column应用NA,如果包含的month等于或在column 'no_change' 中指定的month及以上。这是期望的output

   id no_change Jan.2010 Feb.2010 Mar.2010 Apr.2010 May.2010 Jun.2010 Jul.2010 Ago.2010 Sep.2010 Oct.2010 Nov.2010 Dez.2010
1 123  May.2010    green    green    green    green       NA       NA       NA       NA       NA       NA       NA       NA
2 124      <NA>    black    black      red      red      red      red    white    white    white    white    white    white
3 125      <NA>     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     grey
4 126  Sep.2010     grey     grey     grey     grey     grey     grey     grey     grey       NA       NA       NA       NA
英文:

I have a wide df with columns representing the months of many given years and the changes of colour in each month:

df &lt;- data.frame(id = as.integer(c(123,124,125,126)),
                 no_change = as.character(c(&quot;May.2010&quot;, NA, NA, &quot;Sep.2010&quot;)),
                 `Jan.2010` = as.character(c(&quot;green&quot;, &quot;black&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Feb.2010` = as.character(c(&quot;green&quot;, &quot;black&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Mar.2010` = as.character(c(&quot;green&quot;, &quot;red&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Apr.2010` = as.character(c(&quot;green&quot;, &quot;red&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `May.2010` = as.character(c(&quot;green&quot;, &quot;red&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Jun.2010` = as.character(c(&quot;green&quot;, &quot;red&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Jul.2010` = as.character(c(&quot;green&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Ago.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Sep.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Oct.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Nov.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Dez.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;grey&quot;, &quot;blue&quot;))
                 )
df     
   id no_change Jan.2010 Feb.2010 Mar.2010 Apr.2010 May.2010 Jun.2010 Jul.2010 Ago.2010 Sep.2010 Oct.2010 Nov.2010 Dez.2010
1 123  May.2010    green    green    green    green    green    green    green      red      red      red      red      red
2 124      &lt;NA&gt;    black    black      red      red      red      red    white    white    white    white    white    white
3 125      &lt;NA&gt;     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     grey
4 126  Sep.2010     grey     grey     grey     grey     grey     grey     grey     grey     grey     grey     grey     blue     

I want to apply NA to each column that contains a month equal to and above that specified in column 'no_change'. This is the desired output:

   id no_change Jan.2010 Feb.2010 Mar.2010 Apr.2010 May.2010 Jun.2010 Jul.2010 Ago.2010 Sep.2010 Oct.2010 Nov.2010 Dez.2010
1 123  May.2010    green    green    green    green       NA       NA       NA       NA       NA       NA       NA       NA
2 124      &lt;NA&gt;    black    black      red      red      red      red    white    white    white    white    white    white
3 125      &lt;NA&gt;     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     grey
4 126  Sep.2010     grey     grey     grey     grey     grey     grey     grey     grey       NA       NA       NA       NA

答案1

得分: 2

这是使用dplyr::across()和自定义函数的一种方法:

library(dplyr)

set_na <- function(x, dat) {
  
  col_nm <- cur_column()
  col_dat <- lubridate::dmy(paste0("01.", col_nm))
  
  if_else(col_dat <= dat | is.na(dat), x, NA)
  
}

df |&gt; 
  mutate(
    no_change_dat = lubridate::dmy(paste0("01.", no_change)),
    across(-c(id, no_change, no_change_dat), \(x) set_na(x, no_change_dat))
    )

数据来自OP:

df <- data.frame(id = as.integer(c(123,124,125,126)),
                 no_change = as.character(c("May.2010", NA, NA, "Sep.2010")),
                 `Jan.2010` = as.character(c("green", "black", "pink", "grey")),
                 `Feb.2010` = as.character(c("green", "black", "pink", "grey")),
                 `Mar.2010` = as.character(c("green", "red", "pink", "grey")),
                 `Apr.2010` = as.character(c("green", "red", "pink", "grey")),
                 `May.2010` = as.character(c("green", "red", "pink", "grey")),
                 `Jun.2010` = as.character(c("green", "red", "pink", "grey")),
                 `Jul.2010` = as.character(c("green", "white", "pink", "grey")),
                 `Ago.2010` = as.character(c("red", "white", "pink", "grey")),
                 `Sep.2010` = as.character(c("red", "white", "pink", "grey")),
                 `Oct.2010` = as.character(c("red", "white", "pink", "grey")),
                 `Nov.2010` = as.character(c("red", "white", "pink", "grey")),
                 `Dez.2010` = as.character(c("red", "white", "grey", "blue"))
)

在2023-06-19使用reprex v2.0.2创建

英文:

Here is one way to do it using dplyr::across() and a custom function:

library(dplyr)

set_na &lt;- function(x, dat) {
  
  col_nm &lt;- cur_column()
  col_dat &lt;- lubridate::dmy(paste0(&quot;01.&quot;, col_nm))
  
  if_else(col_dat &lt;= dat | is.na(dat), x, NA)
  
}


df |&gt; 
  mutate(
    no_change_dat = lubridate::dmy(paste0(&quot;01.&quot;, no_change)),
    across(-c(id, no_change, no_change_dat), \(x) set_na(x, no_change_dat))
    )

#&gt; Warning: There were 2 warnings in `mutate()`.
#&gt; The first warning was:
#&gt; ℹ In argument: `no_change_dat = lubridate::dmy(paste0(&quot;01.&quot;, no_change))`.
#&gt; Caused by warning:
#&gt; !  2 failed to parse.
#&gt; ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
#&gt;    id no_change Jan.2010 Feb.2010 Mar.2010 Apr.2010 May.2010 Jun.2010 Jul.2010
#&gt; 1 123  May.2010    green    green    green    green    green     &lt;NA&gt;     &lt;NA&gt;
#&gt; 2 124      &lt;NA&gt;    black    black      red      red      red      red    white
#&gt; 3 125      &lt;NA&gt;     pink     pink     pink     pink     pink     pink     pink
#&gt; 4 126  Sep.2010     grey     grey     grey     grey     grey     grey     grey
#&gt;   Ago.2010 Sep.2010 Oct.2010 Nov.2010 Dez.2010 no_change_dat
#&gt; 1     &lt;NA&gt;     &lt;NA&gt;     &lt;NA&gt;     &lt;NA&gt;     &lt;NA&gt;    2010-05-01
#&gt; 2    white    white    white    white    white          &lt;NA&gt;
#&gt; 3     pink     pink     pink     pink     grey          &lt;NA&gt;
#&gt; 4     &lt;NA&gt;     grey     &lt;NA&gt;     &lt;NA&gt;     &lt;NA&gt;    2010-09-01

Data from OP

df &lt;- data.frame(id = as.integer(c(123,124,125,126)),
                 no_change = as.character(c(&quot;May.2010&quot;, NA, NA, &quot;Sep.2010&quot;)),
                 `Jan.2010` = as.character(c(&quot;green&quot;, &quot;black&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Feb.2010` = as.character(c(&quot;green&quot;, &quot;black&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Mar.2010` = as.character(c(&quot;green&quot;, &quot;red&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Apr.2010` = as.character(c(&quot;green&quot;, &quot;red&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `May.2010` = as.character(c(&quot;green&quot;, &quot;red&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Jun.2010` = as.character(c(&quot;green&quot;, &quot;red&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Jul.2010` = as.character(c(&quot;green&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Ago.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Sep.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Oct.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Nov.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;pink&quot;, &quot;grey&quot;)),
                 `Dez.2010` = as.character(c(&quot;red&quot;, &quot;white&quot;, &quot;grey&quot;, &quot;blue&quot;))
)

<sup>Created on 2023-06-19 with reprex v2.0.2</sup>

答案2

得分: 2

你可以将格式“枢轴”为“长”格式,并找出哪些行应该变成“NA”。

library(tidyverse)

df %>%
  pivot_longer(ends_with("2010")) %>%
  group_by(id) %>%
  mutate(value = ifelse(cumsum(name == no_change & !is.na(no_change)), NA, value)) %>%
  pivot_wider() %>%
  ungroup()

# A tibble: 4 × 14
     id no_change Jan.2010 Feb.2010 Mar.2010 Apr.2010 May.2010 Jun.2010 Jul.2010 Ago.2010 Sep.2010 Oct.2010 Nov.2010 Dez.2010
  <int> <chr>     <chr>    <chr>    <chr>    <chr>    <chr>    <chr>    <chr>    <chr>    <chr>    <chr>    <chr>    <chr>   
1   123 May.2010  green    green    green    green    NA       NA       NA       NA       NA       NA       NA       NA      
2   124 NA        black    black    red      red      red      red      white    white    white    white    white    white   
3   125 NA        pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     grey    
4   126 Sep.2010  grey     grey     grey     grey     grey     grey     grey     grey     NA       NA       NA       NA      
英文:

You can pivot the format into a "long" format, and find out which rows should be turned into NA.

library(tidyverse)

df %&gt;% 
  pivot_longer(ends_with(&quot;2010&quot;)) %&gt;% 
  group_by(id) %&gt;% 
  mutate(value = ifelse(cumsum(name == no_change &amp; !is.na(no_change)), NA, value)) %&gt;% 
  pivot_wider() %&gt;% 
  ungroup()

# A tibble: 4 &#215; 14
     id no_change Jan.2010 Feb.2010 Mar.2010 Apr.2010 May.2010 Jun.2010 Jul.2010 Ago.2010 Sep.2010 Oct.2010 Nov.2010 Dez.2010
  &lt;int&gt; &lt;chr&gt;     &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;   
1   123 May.2010  green    green    green    green    NA       NA       NA       NA       NA       NA       NA       NA      
2   124 NA        black    black    red      red      red      red      white    white    white    white    white    white   
3   125 NA        pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     pink     grey    
4   126 Sep.2010  grey     grey     grey     grey     grey     grey     grey     grey     NA       NA       NA       NA      

</details>



huangapple
  • 本文由 发表于 2023年6月19日 22:41:19
  • 转载请务必保留本文链接:https://go.coder-hub.com/76507694.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定