How to coalesce columns that have NA using dplyr?
Question
I am looking to combine rows that have NA across multiple columns using dplyr. I have not been able to find a way around this problem. I am still new to R, so apologies in advance.
I want to change this example data frame:
# Groups:   RID, FlankerCongruence, NoiseLevel 
     RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT  
   <int> <chr>                     <int>      <int>        <int> 
 1   101 Congruent                     1          0           NA    
 2   101 Congruent                     1          0           NA     
 3   101 Congruent                     1          0           NA    
 4   101 Congruent                    NA          0          678   
 5   101 Congruent                    NA          0          909   
 6   101 Congruent                    NA          0          928   
into something that looks like this:
 # Groups:   RID, FlankerCongruence, NoiseLevel 
    RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT  
   <int> <chr>                     <int>      <int>        <int> 
 1   101 Congruent                     1          0          678     
 2   101 Congruent                     1          0          909     
 3   101 Congruent                     1          0          928     
I first tried using this:
coalesce_by_column <- function(noMeancombinedNoiseTable.data) {
  return(dplyr::coalesce(!!! as.list(noMeancombinedNoiseTable.data)))
}
noMeancombinedNoiseTable.data <- noMeancombinedNoiseTable.data %>%
  group_by(RID, FlankerCongruence, NoiseLevel) %>% 
  arrange(RID, FlankerCongruence, NoiseLevel) %>%
  summarise_all(coalesce_by_column)
But that collapsed each group into a single row, like this:
    RID FlankerCongruence NoiseLevel TrialStim.ACC TrialStim.RT 
   <int> <chr>                  <int>         <int>        <int> 
 1   101 Congruent                  0             1          678    
Any suggestions?
Answer 1
Score: 2
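We could group by the key columns, reorder each remaining column within its group so that the NAs come first and the non-NA values line up in the same rows, and then keep only the rows with at least one non-NA value.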
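library(dplyr)

noMeancombinedNoiseTable.data %>%
  group_by(RID, FlankerCongruence, NoiseLevel) %>%
  # within each group, move the NAs of every non-grouping column to the top
  mutate(across(everything(), ~ .x[order(!is.na(.x))])) %>%
  # drop the rows in which every non-grouping column is NA
  filter(if_any(everything(), ~ !is.na(.x))) %>%
  ungroup()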
Output:
# A tibble: 3 × 5
    RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT
  <int> <chr>                     <int>      <int>        <int>
1   101 Congruent                     1          0          678
2   101 Congruent                     1          0          909
3   101 Congruent                     1          0          928
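To see what the mutate() step does, here is a minimal base-R sketch of the same reordering on two plain vectors copied from the example data. The helper name shift_na_first is made up for illustration and is not part of dplyr.

```r
# Hypothetical helper (not part of dplyr): the same expression used inside across() above.
shift_na_first <- function(v) v[order(!is.na(v))]

acc <- c(1L, 1L, 1L, NA, NA, NA)
rt  <- c(NA, NA, NA, 678L, 909L, 928L)

# !is.na(v) is FALSE for NA and TRUE otherwise; order() sorts FALSE before TRUE
# and leaves ties in their original order, so the NAs move to the front.
acc_shifted <- shift_na_first(acc)  # NA NA NA 1 1 1
rt_shifted  <- shift_na_first(rt)   # NA NA NA 678 909 928

# After the shift both columns are NA in rows 1-3 and filled in rows 4-6,
# so dropping the all-NA rows leaves three complete rows.
keep <- !is.na(acc_shifted) | !is.na(rt_shifted)
data.frame(TrialStim.ACC = acc_shifted[keep], TrialStim.RT = rt_shifted[keep])
```

Note that this relies on every column having the same number of non-NA values within a group. If the counts differ, some of the kept rows will still contain NAs.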
Data:
noMeancombinedNoiseTable.data <- structure(list(RID = c(101L, 101L, 
101L, 101L, 101L, 101L), FlankerCongruence = c("Congruent", 
"Congruent", "Congruent", "Congruent", "Congruent", "Congruent"
), TrialStim.ACC = c(1L, 1L, 1L, NA, NA, NA), NoiseLevel = c(0L, 
0L, 0L, 0L, 0L, 0L), TrialStim.RT = c(NA, NA, NA, 678L, 909L, 
928L)), class = "data.frame", row.names = c("1", "2", "3", "4", 
"5", "6"))
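As a side note on why the attempt in the question returned one row per group: coalesce(!!!as.list(x)) splices every element of the column in as a separate length-one argument, so the result is only the first non-NA value. A small sketch, assuming dplyr is installed; the vector rt is illustrative and copied from the example data.

```r
library(dplyr)

rt <- c(NA, NA, NA, 678L, 909L, 928L)

# Equivalent to coalesce(NA, NA, NA, 678L, 909L, 928L): six scalar arguments,
# so the result is a single value, the first one that is not NA.
first_non_na <- coalesce(!!!as.list(rt))

# Keeping every non-NA value instead preserves all three observations.
all_non_na <- rt[!is.na(rt)]
```

This is why a summarise-style call with that helper cannot return three rows per group: each column is reduced to one value before the rows are assembled.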