January 3, 2020, 22:04:47

Row-wise Boolean comparison of data

Question

I have grouped my data by the appropriate grouping, and I need to be sure that the "x" and "y" values are equal for each unique combination of Group1 and Group2. In other words, what code could I use to go through this dataset and check that A1x == A1y, A2x == A2y, and so on?

Here is the example data:

    "Group1"  "Group2"  "group3"  "values"
    "A"       "1"       x         10
    "A"       "1"       y         10
    "A"       "2"       x         15
    "A"       "2"       y         15

To make answering easier, here is the data.frame from the example:

    d <- data.frame(Group1 = c("A", "A", "A", "A"),
                    Group2 = c("1", "1", "2", "2"),
                    group3 = c("x", "y", "x", "y"),
                    values = c(10, 10, 15, 15))

Answer 1

Score: 4

With dplyr, you can do:

    library(dplyr)

    # Within each Group1/Group2 combination, compare every value with the first one
    d %>%
      group_by(Group1, Group2) %>%
      mutate(cond = all(values == first(values)))

      Group1 Group2 group3 values cond
      <fct>  <fct>  <fct>   <dbl> <lgl>
    1 A      1      x          10 TRUE
    2 A      1      y          10 TRUE
    3 A      2      x          15 TRUE
    4 A      2      y          15 TRUE

Or:

    # TRUE when the group contains exactly one distinct value
    d %>%
      group_by(Group1, Group2) %>%
      mutate(cond = n_distinct(values) == 1)
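Both versions return TRUE for every row of the sample data, so it may help to see what happens when a pair does not match. The following is a minimal sketch, not part of the original answer: `d2` is a hypothetical data frame in which the A/2 pair differs, and `summarise()` collapses each group to one row so that mismatching combinations can be filtered out. It assumes dplyr 1.0 or later for the `.groups` argument.

```r
library(dplyr)

# Hypothetical data: same layout as the question, but the A/2 pair differs
d2 <- data.frame(Group1 = c("A", "A", "A", "A"),
                 Group2 = c("1", "1", "2", "2"),
                 group3 = c("x", "y", "x", "y"),
                 values = c(10, 10, 15, 16))

# One row per Group1/Group2 combination; TRUE when all values in the group agree
group_check <- d2 %>%
  group_by(Group1, Group2) %>%
  summarise(all_equal = n_distinct(values) == 1, .groups = "drop")

# Keep only the combinations where x and y disagree
mismatches <- filter(group_check, !all_equal)
print(mismatches)
```

With `mutate()` the flag is repeated on every row of the group; with `summarise()` there is one flag per combination, which is usually easier to scan when there are many groups.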

Answer 2

Score: 3

You can also do this with pivot_wider:

    library(dplyr)  # provides %>%

    # Spread group3 into columns x and y, then compare them row by row
    tidyr::pivot_wider(d, names_from = 'group3', values_from = 'values') %>%
      dplyr::mutate(eq = x == y)

Answer 3

Score: 1

I think you went too far in turning your data into a long format. A wide layout may be easier to manipulate:

    library(dplyr)
    library(tidyr)

    # One row per Group1/Group2 combination, with x and y side by side
    d %>%
      pivot_wider(names_from = group3, values_from = values) %>%
      mutate(is_equal = x == y)
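One caveat with `x == y` is that it is an exact comparison. If the values are computed doubles rather than whole numbers as in the question, two results that should match can differ in the last bits. The sketch below is an illustration under that assumption: `d3` is hypothetical data, and `dplyr::near()` compares within a small tolerance.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: 0.1 + 0.2 and 0.3 are not exactly equal as doubles
d3 <- data.frame(Group1 = c("A", "A"),
                 Group2 = c("1", "1"),
                 group3 = c("x", "y"),
                 values = c(0.1 + 0.2, 0.3))

wide <- d3 %>%
  pivot_wider(names_from = group3, values_from = values) %>%
  mutate(is_equal = x == y,       # exact comparison
         is_near  = near(x, y))   # comparison within a small tolerance
print(wide)
```

Here `is_equal` is FALSE while `is_near` is TRUE. For integer-like values such as those in the question, the exact comparison is sufficient.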

Answer 4

Score: 1

Here is a base R solution using ave():

    # ave() applies FUN within each Group1/Group2 combination and recycles
    # the result over the rows of that group.
    # length(unique(v)) == 1 is TRUE only when the group holds a single distinct
    # value; comparing v == unique(v) element by element would miss mismatches.
    d <- within(d, isequal <- as.logical(ave(values, Group1, Group2,
                                             FUN = function(v) length(unique(v)) == 1)))

such that

    > d
      Group1 Group2 group3 values isequal
    1      A      1      x     10    TRUE
    2      A      1      y     10    TRUE
    3      A      2      x     15    TRUE
    4      A      2      y     15    TRUE

Note that ave() evaluates the function separately for each combination of Group1 and Group2, checks whether values holds a single distinct value within that combination, and stores the result in the isequal column.
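The sample data only contains matching pairs, so it does not show how the per-group function behaves on a mismatch. The sketch below uses a hypothetical data frame `d2` in which the A/2 pair differs, and compares two candidate functions inside ave(): an element-wise `v == unique(v)` and a count of distinct values.

```r
# Hypothetical data: the A/2 pair differs
d2 <- data.frame(Group1 = c("A", "A", "A", "A"),
                 Group2 = c("1", "1", "2", "2"),
                 group3 = c("x", "y", "x", "y"),
                 values = c(10, 10, 15, 16))

# Element-wise comparison: for the A/2 group, unique(v) is c(15, 16),
# so v == unique(v) compares the group with itself
elementwise <- as.logical(ave(d2$values, d2$Group1, d2$Group2,
                              FUN = function(v) v == unique(v)))

# Counting distinct values: TRUE only when the whole group shares one value
distinct <- as.logical(ave(d2$values, d2$Group1, d2$Group2,
                           FUN = function(v) length(unique(v)) == 1))

print(elementwise)  # TRUE TRUE TRUE TRUE
print(distinct)     # TRUE TRUE FALSE FALSE
```

The element-wise form reports the mismatching group as equal, which is why counting distinct values is the safer test here.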

Answer 5

Score: 0

Another option, if the rows are ordered by group and each group has exactly 2 rows:

    # Compare rows 1, 3, 5, ... with rows 2, 4, 6, ... and repeat each result
    # for both rows of the pair
    d$check <- rep(d$values[seq(1L, nrow(d), 2L)] == d$values[seq(2L, nrow(d), 2L)], each = 2L)
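This approach depends on row order and group size, so it can help to make both assumptions explicit before comparing. The following is a sketch under those assumptions, using a hypothetical data frame `d4` whose rows are out of order and whose A/2 pair differs.

```r
# Hypothetical data: rows out of order, and the A/2 pair differs
d4 <- data.frame(Group1 = c("A", "A", "A", "A"),
                 Group2 = c("2", "1", "2", "1"),
                 group3 = c("x", "x", "y", "y"),
                 values = c(15, 10, 16, 10))

# Sort so that the two rows of each group sit next to each other
d4 <- d4[order(d4$Group1, d4$Group2, d4$group3), ]

# Stop early if any Group1/Group2 combination does not have exactly 2 rows
sizes <- table(interaction(d4$Group1, d4$Group2, drop = TRUE))
stopifnot(all(sizes == 2L))

# Compare row 1 with row 2, row 3 with row 4, and so on
first_rows <- seq(1L, nrow(d4), by = 2L)
d4$check <- rep(d4$values[first_rows] == d4$values[first_rows + 1L], each = 2L)
print(d4)
```

After sorting, the A/1 rows get `check = TRUE` and the A/2 rows get `check = FALSE`.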

Answer 6

Score: -1

A simple way would be to merge the sub-tables for group x and group y and compare the values.

    > d[d$group3=="y",]
    #      Group1 Group2 group3 values
    #    2      A      1      y     10
    #    4      A      2      y     15
    > merge(d[d$group3=="y",], d[d$group3=="x",], by=c("Group1","Group2"))
    #  Group1 Group2 group3.x values.x group3.y values.y
    #  1      A      1        y       10        x       10
    #  2      A      2        y       15        x       15
    > with(merge(d[d$group3=="y",], d[d$group3=="x",],
           by=c("Group1","Group2")),
           values.x==values.y)
    ## [1] TRUE TRUE

Of course, there are fancier ways of doing it, but it is not bad to start simple.
