How to coalesce columns that have NA using dplyr?

Question

I am looking to combine rows that have NA across multiple columns using dplyr. I have not been able to find a way around this problem. I am still new to R, so apologies in advance.

I want to change this example data frame:

# Groups:   RID, FlankerCongruence, NoiseLevel 
     RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT  
   <int> <chr>                     <int>      <int>        <int> 
 1   101 Congruent                     1          0           NA    
 2   101 Congruent                     1          0           NA     
 3   101 Congruent                     1          0           NA    
 4   101 Congruent                    NA          0          678   
 5   101 Congruent                    NA          0          909   
 6   101 Congruent                    NA          0          928   

into something that looks like this:

 # Groups:   RID, FlankerCongruence, NoiseLevel 
    RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT  
   <int> <chr>                     <int>      <int>        <int> 
 1   101 Congruent                     1          0          678     
 2   101 Congruent                     1          0          909     
 3   101 Congruent                     1          0          928     

I first tried using this:

coalesce_by_column <- function(noMeancombinedNoiseTable.data) {
  return(dplyr::coalesce(!!! as.list(noMeancombinedNoiseTable.data)))
}

noMeancombinedNoiseTable.data <- noMeancombinedNoiseTable.data %>%
  group_by(RID, FlankerCongruence, NoiseLevel) %>%
  arrange(RID, FlankerCongruence, NoiseLevel) %>%
  summarise_all(coalesce_by_column)

But that collapsed each group into a single row, like this:

    RID FlankerCongruence NoiseLevel TrialStim.ACC TrialStim.RT 
   <int> <chr>                  <int>         <int>        <int> 
 1   101 Congruent                  0             1          678    

Any suggestions?

Answer 1

Score: 2

We can group by the key columns, reorder each remaining column within its group so that the NAs come first, and then keep only the rows that have at least one non-NA value:

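library(dplyr)

noMeancombinedNoiseTable.data %>%
  group_by(RID, FlankerCongruence, NoiseLevel) %>%
  # within each group, move the NAs of every non-grouping column to the top
  mutate(across(everything(), ~ .x[order(!is.na(.x))])) %>%
  # drop the rows in which all non-grouping columns are NA
  filter(if_any(everything(), ~ !is.na(.x))) %>%
  ungroup()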

Output:

# A tibble: 3 × 5
    RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT
  <int> <chr>                     <int>      <int>        <int>
1   101 Congruent                     1          0          678
2   101 Congruent                     1          0          909
3   101 Congruent                     1          0          928
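
To see why the `order(!is.na(.x))` step lines the values up, it can help to run it on a single group outside dplyr. The following is a minimal base R sketch, not the answer's own code: `realign_na` is a hypothetical helper name, and the two vectors simply mirror the non-grouping columns of the one group in the example data.

```r
# Hypothetical helper (not part of dplyr): move NAs to the front of a vector
# while keeping the non-NA values in their original relative order.
realign_na <- function(x) {
  # !is.na(x) is FALSE for NA and TRUE otherwise; order() sorts FALSE first
  # and leaves ties in their original order, so NAs lead and values follow.
  x[order(!is.na(x))]
}

acc <- c(1L, 1L, 1L, NA, NA, NA)        # TrialStim.ACC within one group
rt  <- c(NA, NA, NA, 678L, 909L, 928L)  # TrialStim.RT within the same group

aligned <- data.frame(acc = realign_na(acc), rt = realign_na(rt))

# Keep rows where at least one column is non-NA, mirroring the filter() step.
result <- aligned[rowSums(!is.na(aligned)) > 0, ]
print(result)
```

Note that this positional pairing only gives a meaningful result when every column has the same number of non-NA values within a group; if the counts differ, values from unrelated rows may end up side by side.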

Data

noMeancombinedNoiseTable.data <- structure(list(RID = c(101L, 101L, 
101L, 101L, 101L, 101L), FlankerCongruence = c("Congruent", 
"Congruent", "Congruent", "Congruent", "Congruent", "Congruent"
), TrialStim.ACC = c(1L, 1L, 1L, NA, NA, NA), NoiseLevel = c(0L, 
0L, 0L, 0L, 0L, 0L), TrialStim.RT = c(NA, NA, NA, 678L, 909L, 
928L)), class = "data.frame", row.names = c("1", "2", "3", "4", 
"5", "6"))
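
As for why the attempt in the question returned a single row per group: splicing `as.list()` of a column into `dplyr::coalesce()` passes every element as its own length-1 vector, so the call returns just the first non-missing element of that column. The base R sketch below only emulates that behaviour for illustration; `first_non_na` is a hypothetical helper, not a dplyr function.

```r
# Hypothetical helper: return the first non-NA element of a vector, which is
# what coalescing the elements one by one amounts to.
first_non_na <- function(x) {
  x[!is.na(x)][1]
}

acc <- c(1L, 1L, 1L, NA, NA, NA)        # TrialStim.ACC within one group
rt  <- c(NA, NA, NA, 678L, 909L, 928L)  # TrialStim.RT within the same group

acc_first <- first_non_na(acc)
rt_first  <- first_non_na(rt)
print(c(acc = acc_first, rt = rt_first))
```

One value per column per group is exactly the shape a summarising step produces, which is why the other two response times were dropped.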





huangapple
  • Published on April 4, 2023 at 13:50:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75925888.html