New column from first observation of several columns dplyr

Question

I have the following data and want to create the "New" variable from A, B, and C:

structure(list(A = c("NA", "NA", "4", "NA"), B = c("NA", "3", 
"4", "5"), C = c("1", "NA", "NA", "5"), New = c(1, 3, 4, 5)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -4L))

I just want the first non-NA observation from any of the columns.
I have tried to use `across` in dplyr but have not been able to figure out the syntax.
Any ideas?

Answer 1

Score: 3

With pmax (note that pmax returns the row-wise maximum, which matches the first non-NA value here only because the non-NA values within each row are equal):

library(dplyr)

# Turn the "NA" strings into real NA values and the columns into integers
df <- type.convert(df, as.is = TRUE)
df %>% 
  mutate(New = do.call(pmax, c(across(A:C), na.rm = TRUE)))

# A tibble: 4 × 4
  A     B     C     New  
  <int> <int> <int> <int>
1 NA    NA    1     1    
2 NA    3     NA    3    
3 4     4     NA    4    
4 NA    5     5     5    
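
To see how a row-wise maximum and a first-non-NA selection can diverge, here is a minimal base R sketch. The vectors `A`, `B`, and `C` below are hypothetical values chosen for illustration, not the question's data, and the nested `ifelse` is just one simple way to scan the columns in order.

```r
# Hypothetical vectors where the non-NA values within a row differ
A <- c(NA, 2L, 7L)
B <- c(5L, 9L, NA)
C <- c(1L, NA, 3L)

# Largest non-NA value in each row
row_max <- pmax(A, B, C, na.rm = TRUE)

# First non-NA value in each row, scanning A, then B, then C
first_non_na <- ifelse(!is.na(A), A, ifelse(!is.na(B), B, C))

print(row_max)
print(first_non_na)
```

The two results agree only in rows where the first non-NA value also happens to be the largest one.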

Answer 2

Score: 2

We may need to first convert the "NA" strings to real NA values before extracting the first non-NA value:

library(dplyr)
library(purrr)
# invoke() is deprecated as of purrr 1.0.0; do.call(coalesce, pick(A:C)) is a base R alternative
df1 %>%
  type.convert(as.is = TRUE) %>%
  mutate(New = invoke(coalesce, pick(A:C)))

Output:

# A tibble: 4 × 4
      A     B     C   New
  <int> <int> <int> <int>
1    NA    NA     1     1
2    NA     3    NA     3
3     4     4    NA     4
4    NA     5     5     5

Or with fcoalesce from data.table:

library(data.table)
df1 %>%
  type.convert(as.is = TRUE) %>%
  mutate(New = fcoalesce(pick(A:C)))
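
For readers without dplyr or data.table installed, the coalescing idea can be sketched in base R. The helper name `coalesce_base` is hypothetical (it is not part of any package), and the data frame is rebuilt here from the question's A, B, and C columns so the snippet runs on its own.

```r
# The question's columns, with "NA" stored as strings
df <- data.frame(
  A = c("NA", "NA", "4", "NA"),
  B = c("NA", "3", "4", "5"),
  C = c("1", "NA", "NA", "5"),
  stringsAsFactors = FALSE
)

# Convert the "NA" strings to real NA values; the columns become integer
df <- type.convert(df, as.is = TRUE)

# Hypothetical helper: walk the columns left to right and fill the NA
# positions of the running result with the next column's values
coalesce_base <- function(cols) {
  Reduce(function(x, y) ifelse(is.na(x), y, x), cols)
}

df$New <- coalesce_base(df[c("A", "B", "C")])
print(df)
```

This mirrors what `coalesce` does conceptually: each position takes its value from the first column that is not NA there.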

</details>



# Answer 3
**Score**: 1

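Base R option using `apply`: for each row, drop the `NA`s with `is.na` and take the first (`[1]`) remaining value. This assumes `df` holds only the columns A, B, and C and that the "NA" strings have already been converted to real `NA` values:

``` r
df$New <- apply(df, 1, \(x) x[!is.na(x)][1])
df
#>      A    B    C New
#> 1 <NA> <NA>    1   1
#> 2 <NA>    3 <NA>   3
#> 3    4    4 <NA>   4
#> 4 <NA>    5    5   5
```

<sup>Created on 2023-03-03 with reprex v2.0.2</sup>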
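
A vectorized base R variant of the same idea is sketched below, as one possible alternative to looping over rows with `apply`. It assumes the columns already contain real `NA` values; the data frame here is rebuilt from the question's A, B, and C columns as integers so the snippet is self-contained.

``` r
# The question's columns with real NA values (integer)
df <- data.frame(
  A = c(NA, NA, 4L, NA),
  B = c(NA, 3L, 4L, 5L),
  C = c(1L, NA, NA, 5L)
)

m <- as.matrix(df[c("A", "B", "C")])

# For each row, the index of the first column that is not NA
first_col <- max.col(!is.na(m), ties.method = "first")

# Pick one value per row using (row, column) index pairs
df$New <- m[cbind(seq_len(nrow(m)), first_col)]
print(df)
```

If a row is entirely `NA`, `max.col` falls back to column 1, so the result for that row is `NA`.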


huangapple
  • Posted on 2023-03-04 02:51:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75630842.html