Row-wise Boolean comparison of data

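Question

I have grouped my data by the appropriate grouping variables, and I need to be sure that the "x" and "y" values equal each other for each unique combination of Group1 and Group2. In other words, what code could I use to cycle through this dataset and ensure that A1x == A1y, A2x == A2y, and so on?

Here is the example data:

    Group1  Group2  group3  values
    A       1       x       10
    A       1       y       10
    A       2       x       15
    A       2       y       15

To make answering easier, here is the data.frame from the example:

    d <- data.frame(Group1 = c("A", "A", "A", "A"),
                    Group2 = c("1", "1", "2", "2"),
                    group3 = c("x", "y", "x", "y"),
                    values = c(10, 10, 15, 15))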

Answer 1

Score: 4

With dplyr, you can do:

    library(dplyr)

    d %>%
      group_by(Group1, Group2) %>%
      # TRUE for every row of a group whose values all match the first one
      mutate(cond = all(values == first(values)))

which gives:

      Group1 Group2 group3 values cond
      <fct>  <fct>  <fct>   <dbl> <lgl>
    1 A      1      x          10 TRUE
    2 A      1      y          10 TRUE
    3 A      2      x          15 TRUE
    4 A      2      y          15 TRUE

Or:

    d %>%
      group_by(Group1, Group2) %>%
      # a group passes when it holds exactly one distinct value
      mutate(cond = n_distinct(values) == 1)
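A caveat worth adding here (not part of the original answer): `values` is a double column, and an exact `==` test can report a mismatch for numbers that differ only by floating-point rounding. The sketch below uses base R's aggregate() to contrast an exact per-group test with a tolerant one. The data frame `d2` and the tolerance `tol` are assumptions made up for illustration.

```r
# Hypothetical data: group A/1 differs only by floating-point rounding,
# group A/2 differs by a real amount.
d2 <- data.frame(
  Group1 = c("A", "A", "A", "A"),
  Group2 = c("1", "1", "2", "2"),
  group3 = c("x", "y", "x", "y"),
  values = c(0.3, 0.1 + 0.2, 15, 15.5)
)
tol <- 1e-8  # assumed tolerance; choose one that suits your data

# Exact test: every value in the group equals the first one
exact <- aggregate(values ~ Group1 + Group2, data = d2,
                   FUN = function(v) all(v == v[1]))
names(exact)[3] <- "all_equal"

# Tolerant test: the spread of the group is within tol
near <- aggregate(values ~ Group1 + Group2, data = d2,
                  FUN = function(v) diff(range(v)) <= tol)
names(near)[3] <- "all_equal"

print(exact)  # both groups fail the exact test
print(near)   # only A/2 fails the tolerant test
```

If rounding noise is a realistic concern, the same `diff(range(values)) <= tol` expression could be used inside mutate() in place of the exact comparison.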

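Answer 2

Score: 3

You can also do this with pivot_wider:

    library(magrittr)  # provides %>%

    # one row per Group1/Group2 pair, with x and y as columns
    tidyr::pivot_wider(d, names_from = 'group3', values_from = 'values') %>%
      dplyr::mutate(eq = x == y)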

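Answer 3

Score: 1

I think you went too far in turning your data into long format. The wide format may be easier to work with:

    library(dplyr)
    library(tidyr)

    d %>%
      pivot_wider(names_from = group3, values_from = values) %>%
      mutate(is_equal = x == y)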
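One thing to check with the wide approach is what happens when a group is missing its x or y row: the comparison then yields NA rather than FALSE. The base R sketch below uses reshape() instead of pivot_wider to show this without extra packages. The data frame `d3` is hypothetical (group A/3 has no y row), and `values.x` / `values.y` are the column names reshape() generates for this input.

```r
# Hypothetical data: A/1 matches, A/2 differs, A/3 has no "y" row
d3 <- data.frame(
  Group1 = c("A", "A", "A", "A", "A"),
  Group2 = c("1", "1", "2", "2", "3"),
  group3 = c("x", "y", "x", "y", "x"),
  values = c(10, 10, 15, 16, 20)
)

# Long to wide: one row per Group1/Group2 pair, one column per group3 level
wide <- reshape(d3, idvar = c("Group1", "Group2"),
                timevar = "group3", direction = "wide")

# Comparison is NA where either side is missing
wide$eq <- wide$values.x == wide$values.y
print(wide)
```

Whether an incomplete pair should count as a failure is a design choice; `wide$eq %in% TRUE` would turn the NA into FALSE.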

Answer 4

Score: 1

Here is a base R solution using ave():

    # length(unique(v)) == 1 gives one TRUE/FALSE per group; ave() recycles it over the group's rows
    d <- within(d, isequal <- as.logical(ave(values, Group1, Group2, FUN = function(v) length(unique(v)) == 1)))

such that

    > d
      Group1 Group2 group3 values isequal
    1      A      1      x     10    TRUE
    2      A      1      y     10    TRUE
    3      A      2      x     15    TRUE
    4      A      2      y     15    TRUE

Note that this code uses ave() to check, within each combination of the Group1 and Group2 columns, whether values holds a single unique value, and stores the result in the isequal column.
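One detail to watch with ave(): the function passed as FUN should return either a single value per group or a vector of the group's length. An elementwise comparison against `unique(v)` can look correct on matching data yet pass a mismatched group, because `unique(v)` then has the same length as `v` and each element is compared with itself. A sketch on hypothetical data `d_bad`, where group A/2 deliberately differs:

```r
# Hypothetical data: group A/2 has different x and y values
d_bad <- data.frame(
  Group1 = c("A", "A", "A", "A"),
  Group2 = c("1", "1", "2", "2"),
  group3 = c("x", "y", "x", "y"),
  values = c(10, 10, 15, 16)
)

# Elementwise comparison: for A/2, c(15, 16) == c(15, 16) is TRUE TRUE
elementwise <- as.logical(ave(d_bad$values, d_bad$Group1, d_bad$Group2,
                              FUN = function(v) v == unique(v)))

# Scalar test per group: exactly one distinct value
scalar <- as.logical(ave(d_bad$values, d_bad$Group1, d_bad$Group2,
                         FUN = function(v) length(unique(v)) == 1))

print(elementwise)  # TRUE TRUE TRUE TRUE: the mismatch goes unnoticed
print(scalar)       # TRUE TRUE FALSE FALSE
```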


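Answer 5

Score: 0

Another option, if the data is properly sorted by group and has exactly 2 rows per group:

    # compare odd rows (1, 3, ...) with even rows (2, 4, ...), then repeat each result for both rows of the pair
    d$check <- rep(d$values[seq(1L, nrow(d), 2L)] == d$values[seq(2L, nrow(d), 2L)], each = 2L)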
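Because this approach depends on row order and on every group having exactly two rows, it may be worth enforcing both assumptions before comparing. A minimal sketch, using a hypothetical shuffled data frame `d4` with a deliberate mismatch in group A/2:

```r
# Hypothetical data, deliberately out of order; A/2 has x = 16, y = 15
d4 <- data.frame(
  Group1 = c("A", "A", "A", "A"),
  Group2 = c("2", "1", "2", "1"),
  group3 = c("y", "x", "x", "y"),
  values = c(15, 10, 16, 10)
)

# Sort so that each group's x and y rows are adjacent
d4 <- d4[order(d4$Group1, d4$Group2, d4$group3), ]

# Stop early if any Group1/Group2 combination does not have exactly 2 rows
group_sizes <- table(interaction(d4$Group1, d4$Group2, drop = TRUE))
stopifnot(all(group_sizes == 2L))

# Compare odd rows with even rows and repeat the result for both rows of a pair
odd  <- seq(1L, nrow(d4), 2L)
even <- seq(2L, nrow(d4), 2L)
d4$check <- rep(d4$values[odd] == d4$values[even], each = 2L)
print(d4)  # check is TRUE TRUE FALSE FALSE after sorting
```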

Answer 6

Score: -1

A simple way would be to merge the sub-tables for group x and group y and compare the values.

    > d[d$group3=="y",]
    #   Group1 Group2 group3 values
    # 2      A      1      y     10
    # 4      A      2      y     15
    > merge(d[d$group3=="y",], d[d$group3=="x",], by=c("Group1","Group2"))
    #   Group1 Group2 group3.x values.x group3.y values.y
    # 1      A      1        y       10        x       10
    # 2      A      2        y       15        x       15
    with(merge(d[d$group3=="y",], d[d$group3=="x",],
               by=c("Group1","Group2")),
         values.x==values.y)
    ## [1] TRUE TRUE

Of course, there are fancier ways of doing it, but it is not bad to start simple.

huangapple
  • Posted on January 3, 2020, 22:04:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59579928.html