How to coalesce columns that have NA using dplyr?

Question

I am looking to combine rows that have NA across multiple columns using dplyr. I have not been able to find a way around this problem. I am still new to R, so apologies in advance.

I want to change this example data frame:

    # Groups: RID, FlankerCongruence, NoiseLevel
        RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT
      <int> <chr>                     <int>      <int>        <int>
    1   101 Congruent                     1          0           NA
    2   101 Congruent                     1          0           NA
    3   101 Congruent                     1          0           NA
    4   101 Congruent                    NA          0          678
    5   101 Congruent                    NA          0          909
    6   101 Congruent                    NA          0          928

into something that looks like this:

    # Groups: RID, FlankerCongruence, NoiseLevel
        RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT
      <int> <chr>                     <int>      <int>        <int>
    1   101 Congruent                     1          0          678
    2   101 Congruent                     1          0          909
    3   101 Congruent                     1          0          928

I first tried using this:

    coalesce_by_column <- function(noMeancombinedNoiseTable.data) {
      return(dplyr::coalesce(!!! as.list(noMeancombinedNoiseTable.data)))
    }
    noMeancombinedNoiseTable.data <- noMeancombinedNoiseTable.data %>%
      group_by(RID, FlankerCongruence, NoiseLevel) %>%
      arrange(RID, FlankerCongruence, NoiseLevel) %>%
      summarise_all(coalesce_by_column)

But that collapsed each group into a single row, like this:

        RID FlankerCongruence NoiseLevel TrialStim.ACC TrialStim.RT
      <int> <chr>                  <int>         <int>        <int>
    1   101 Congruent                  0             1          678

Any suggestions?
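
One likely reason for the single summarised row, shown here on the `TrialStim.RT` values alone: `!!!as.list(x)` splices every element of the column into `coalesce()` as its own length-1 argument, so the call returns only the first non-missing value of each group. This is a minimal sketch that assumes dplyr is installed; `rt` and `first_non_na` are illustrative names, not part of the original code.

```r
library(dplyr)

# The TrialStim.RT values of the single group in the example data.
rt <- c(NA, NA, NA, 678L, 909L, 928L)

# as.list() turns the vector into six length-1 elements; !!! passes each one
# to coalesce() as a separate argument, so one value comes back, not three.
first_non_na <- coalesce(!!!as.list(rt))

first_non_na  # 678
```

Because `summarise_all()` applies the function once per column and group, each group ends up with one row.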

Answer 1

Score: 2

We could group by the key columns, reorder every other column within its group so that the NAs line up in the same rows, and then keep only the rows that have at least one non-NA value:

    library(dplyr)

    noMeancombinedNoiseTable.data %>%
      group_by(RID, FlankerCongruence, NoiseLevel) %>%
      # within each group, reorder every column so its NAs come first
      mutate(across(everything(), ~ .x[order(!is.na(.x))])) %>%
      # drop rows in which every non-grouping column is NA
      filter(if_any(everything(), ~ !is.na(.x))) %>%
      ungroup()

Output:

    # A tibble: 3 × 5
        RID FlankerCongruence TrialStim.ACC NoiseLevel TrialStim.RT
      <int> <chr>                     <int>      <int>        <int>
    1   101 Congruent                     1          0          678
    2   101 Congruent                     1          0          909
    3   101 Congruent                     1          0          928
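
To see what the reordering step does, it can help to run it on the two value columns outside the pipeline. This is a small base R sketch; `acc`, `rt`, `acc_sorted` and `rt_sorted` are illustrative names that do not appear in the answer.

```r
# The two value columns of the example data, taken as plain vectors.
acc <- c(1L, 1L, 1L, NA, NA, NA)
rt  <- c(NA, NA, NA, 678L, 909L, 928L)

# !is.na() is FALSE for missing values. order() sorts FALSE before TRUE and
# keeps tied elements in their original order, so the NAs move to the front
# while the non-missing values keep their relative order.
acc_sorted <- acc[order(!is.na(acc))]
rt_sorted  <- rt[order(!is.na(rt))]

acc_sorted  # NA NA NA 1 1 1
rt_sorted   # NA NA NA 678 909 928
```

After this step the first three rows of the group are NA in both columns, which is why the `filter()` call removes them. If the columns held different numbers of non-missing values within a group, some of the remaining rows would still contain NAs.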

Data:

    noMeancombinedNoiseTable.data <- structure(list(RID = c(101L, 101L,
    101L, 101L, 101L, 101L), FlankerCongruence = c("Congruent",
    "Congruent", "Congruent", "Congruent", "Congruent", "Congruent"
    ), TrialStim.ACC = c(1L, 1L, 1L, NA, NA, NA), NoiseLevel = c(0L,
    0L, 0L, 0L, 0L, 0L), TrialStim.RT = c(NA, NA, NA, 678L, 909L,
    928L)), class = "data.frame", row.names = c("1", "2", "3", "4",
    "5", "6"))
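
As a possible alternative, not part of the answer above, the non-missing values of each column could be collected per group with `reframe()`. This sketch assumes dplyr 1.1.0 or later (where `reframe()` is available) and assumes every column has the same number of non-missing values within a group. `df` and `result` are illustrative names.

```r
library(dplyr)

# Same values as the example data, rebuilt so the block runs on its own.
df <- data.frame(
  RID = rep(101L, 6),
  FlankerCongruence = rep("Congruent", 6),
  TrialStim.ACC = c(1L, 1L, 1L, NA, NA, NA),
  NoiseLevel = rep(0L, 6),
  TrialStim.RT = c(NA, NA, NA, 678L, 909L, 928L)
)

result <- df %>%
  group_by(RID, FlankerCongruence, NoiseLevel) %>%
  # keep only the non-missing values of each non-grouping column;
  # reframe() allows more than one row per group and returns ungrouped data
  reframe(across(everything(), ~ .x[!is.na(.x)]))

result
```

Note that the grouping columns come first in `result`, so the column order differs from the input.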

huangapple
  • Posted on April 4, 2023, 13:50:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75925888.html