June 15, 2023, 20:43:20

Get index of rows with not equal values across several columns, excluding NA

Question

Using as an example this data frame:

a   b   c   d   e
1   x   x   A   x
2   y   y   A   NA
3   z   v   B   NA
4   x   w   T   w
5   s   NA  K   NA

How could I get TRUE for those rows where the values across columns b, c and e are not equal, excluding NAs? The idea is to get TRUE (or the index) for the following rows:

a   b   c   d   e
3   z   v   B   NA
4   x   w   T   w

So, my intention is to get those rows where b, c and e are not equal. But if some of the values in a row are NA while the others are equal, the row should not count as not equal, because NAs should be ignored.

I was trying something like:

library(dplyr)
d %>%
  filter(if_all(c(c,e), ~ b == .b))

But this way I get TRUE for equal values and, in addition, I get problems with NA.

Do you know how I can solve this?

Thanks!
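
The NA problem mentioned in the question can be reproduced in isolation. The sketch below is only an illustration: it copies columns b and e from the example data into two hypothetical vectors and shows that comparing against NA returns NA instead of TRUE or FALSE, which is why a plain `==` or `!=` test cannot decide those rows.

```r
# Columns b and e from the example data, as plain vectors (illustrative names)
b <- c("x", "y", "z", "x", "s")
e <- c("x", NA, NA, "w", NA)

# Any comparison involving NA is itself NA, not TRUE or FALSE
cmp <- b != e

# which() keeps only the positions where the comparison is definitely TRUE
definite <- which(cmp)

print(cmp)
print(definite)
```

`dplyr::filter()` drops rows whose condition evaluates to NA, so the NA values have to be removed or handled explicitly before comparing.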

Answer 1

Score: 1

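Here is an idea using `dplyr`:

library(dplyr)
df %>%
  rowwise() %>%
  # keep rows with more than one non-NA value and more than one distinct non-NA value
  filter(sum(!is.na(c_across(c('b', 'c', 'e')))) > 1,
         length(unique(na.omit(c_across(c('b', 'c', 'e'))))) > 1) %>%
  ungroup()

# A tibble: 2 × 5
      a b     c     d     e
  <int> <chr> <chr> <chr> <chr>
1     3 z     v     B     NA
2     4 x     w     T     w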
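
The same row-wise test can probably be written more compactly with `dplyr::n_distinct()`, which takes an `na.rm` argument. This is a sketch of an alternative, not the answerer's code; it assumes the example data is loaded into `df` with `read.table()`, and `res` is just an illustrative name.

```r
library(dplyr)

# Example data from the question
df <- read.table(text = "a b c d e
1 x x A x
2 y y A NA
3 z v B NA
4 x w T w
5 s NA K NA", header = TRUE)

res <- df %>%
  rowwise() %>%
  # more than one distinct non-NA value across b, c and e
  filter(n_distinct(c_across(c(b, c, e)), na.rm = TRUE) > 1) %>%
  ungroup()

print(res)
```

A row with a single non-NA value has exactly one distinct value, so the separate "more than one non-NA value" check is not needed in this form.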


Answer 2

Score: 1

apply(
  df[, c("b", "c", "e")],
  1,
  function(row) {
    row <- row[!is.na(row)]
    any(row != row[1])
  }
)

#> [1] FALSE FALSE  TRUE  TRUE FALSE


---
Where `df` is:

df <- read.table(text =
'a b c d e
1 x x A x
2 y y A NA
3 z v B NA
4 x w T w
5 s NA K NA',
header = TRUE)
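
Since the question also asks for the index, the logical vector from `apply()` can be passed to `which()` or used to subset the data frame. The sketch below follows the same base R idea; the names `differs`, `idx` and `hits` are illustrative and not from the answer.

```r
df <- read.table(text = "a b c d e
1 x x A x
2 y y A NA
3 z v B NA
4 x w T w
5 s NA K NA", header = TRUE)

# TRUE where the non-NA values of b, c and e are not all the same
differs <- apply(df[, c("b", "c", "e")], 1, function(row) {
  row <- row[!is.na(row)]
  length(unique(row)) > 1
})

idx <- which(differs)   # row positions of the mismatching rows
hits <- df[differs, ]   # the mismatching rows themselves

print(idx)
print(hits)
```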


Answer 3

Score: 0

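I believe the OP wants to output TRUE for rows in which all values are unique, excluding NAs. We can use `table` row-wise and output TRUE if `all` values of the table are `1` (no duplicates).
Remember to `pick` the desired columns to feed the function.
Filtering on this new index variable is straightforward.

df <- data.frame(
  id = c(1:4),
  a = c('a', 'a', 'a', 'z'),
  b = c('b', 'b', 'c', 'a'),
  c = c('c', 'b', NA, 'd'))

library(dplyr)

df |>
  mutate(index = apply(pick(a:c), 1, table) |>
           # sapply() returns a logical vector, so `index` can be used directly in filter()
           sapply(\(x) all(x == 1))
  )

  id a b    c index
1  1 a b    c  TRUE
2  2 a b    b FALSE
3  3 a c <NA>  TRUE
4  4 z a    d  TRUE

A modified, simpler version, with `purrr::pmap_lgl`:

library(purrr)

df |>
  mutate(index = pmap_lgl(pick(a:c), \(...) all(table(c(...)) == 1)))

  id a b    c index
1  1 a b    c  TRUE
2  2 a b    b FALSE
3  3 a c <NA>  TRUE
4  4 z a    d  TRUE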
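
This answer reads the question as "all non-NA values are unique", while the other answers test "the non-NA values are not all equal". The two readings disagree on some rows. A minimal base R sketch of the difference follows; the helper names `any_differ` and `all_unique` and the sample rows are made up for illustration.

```r
# TRUE when the non-NA values contain more than one distinct value
any_differ <- function(x) {
  x <- x[!is.na(x)]
  length(unique(x)) > 1
}

# TRUE when no non-NA value is repeated
all_unique <- function(x) {
  x <- x[!is.na(x)]
  anyDuplicated(x) == 0
}

rows <- list(c("x", "x", "x"), c("x", "w", "w"), c("z", "v", NA), c("s", NA, NA))

comparison <- data.frame(
  values     = sapply(rows, paste, collapse = " "),
  any_differ = sapply(rows, any_differ),
  all_unique = sapply(rows, all_unique)
)
print(comparison)
```

The two helpers disagree on rows such as `x w w` and `s NA NA`. The question expects TRUE for `x w w` and FALSE for `s NA NA`, which is what `any_differ` returns.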


Answer 4

Score: 0

With a helper function:

library(tidyverse)
data <- tibble(
  a = c(1, 2, 3, 4, 5),
  b = c("x", "y", "z", "x", "s"),
  c = c("x", "y", "v", "w", NA),
  d = c("A", "A", "B", "T", "K"),
  e = c("x", NA, NA, "w", NA)
)
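# TRUE when the non-NA values of a row contain more than one distinct value
unique_row <- function(input) {
  result <- input %>%
    na.omit() %>%
    unique()

  # `> 1` rather than `!= 1`, so a row made only of NAs (length 0) is not flagged
  return(length(result) > 1)
}
data %>%
  rowwise() %>%
  filter(unique_row(c(b, c, e)))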
