February 24, 2023, 08:59:49

Select rows from R table based on two columns in another table

Question

I have two tables, table1 and table2. table1 is much bigger than table2, but table2 is not fully contained in table1. Each table has two ID columns, ID1 and ID2. I want to obtain the rows in table1 and table2 in which both ID columns match. If a pair of IDs appears in one table but not the other, that row should not be returned.

I tried t1[which(t1$ID1 == t2$ID1 & t1$ID2 == t2$ID2), ]

R warned that the longer object length is not a multiple of the shorter object length. Any ideas?
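
For context on that warning: == compares two vectors position by position and recycles the shorter one, so it never asks whether a value appears anywhere in the other table. The sketch below illustrates this with made-up toy vectors (a and b are hypothetical, not the asker's data) and contrasts it with %in%, which tests membership.

```r
# Hypothetical toy vectors, only for illustration
a <- c("x", "y", "z", "w", "v")   # length 5
b <- c("y", "w")                  # length 2

# `==` is positional: b is recycled to y, w, y, w, y and compared slot by slot.
# R also warns here because 5 is not a multiple of 2 (suppressed for the demo).
elementwise <- suppressWarnings(a == b)

# `%in%` is a membership test: is each element of `a` found anywhere in `b`?
membership <- a %in% b

print(elementwise)
print(membership)
```

The positional comparison only flags "w", which happens to line up with the recycled b, while the membership test flags both "y" and "w".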

Answer 1

Score: 3

With dplyr::semi_join() (and borrowing @thesixmax's example data):

library(dplyr)
table1 %>%
  semi_join(table2, by = c("ID_1", "ID_2"))
#   ID_1 ID_2 val
# 1  0_2  1_2   2
# 2  0_4  1_4   4
table2 %>%
  semi_join(table1, by = c("ID_1", "ID_2"))
#   ID_1 ID_2 val
# 1  0_2  1_2   1
# 2  0_4  1_4   2
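
If dplyr is not available, a similar filter can be sketched in base R with merge(). This is an illustration rather than part of the answer above: it re-creates the example data used in this thread, and the names keys2 and matched1 are made up for the example. Reducing table2 to its unique ID pairs first is what keeps table1's rows from being duplicated and keeps table2's val column out of the result.

```r
# Example data as used elsewhere in this thread
table1 <- data.frame(
  ID_1 = c("0_1", "0_2", "0_3", "0_4", "0_5"),
  ID_2 = c("1_1", "1_2", "1_3", "1_4", "1_5"),
  val = c(1, 2, 3, 4, 5)
)
table2 <- data.frame(
  ID_1 = c("0_2", "0_4", "0_6", "0_7", "0_8", "0_9", "0_10"),
  ID_2 = c("1_2", "1_4", "1_6", "1_7", "1_8", "1_9", "1_10"),
  val = c(1, 2, 3, 4, 5, 6, 7)
)

# Keep only the distinct ID pairs of table2 (hypothetical helper object)
keys2 <- unique(table2[, c("ID_1", "ID_2")])

# Inner merge on both ID columns: only rows of table1 whose pair exists in table2
matched1 <- merge(table1, keys2, by = c("ID_1", "ID_2"))
print(matched1)
```

Note that merge() sorts the result by the key columns by default, so the row order can differ from the original table; pass sort = FALSE if that matters.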

Answer 2

Score: 2

Simple reprex:

table1 <- data.frame(
  "ID_1" = c("0_1", "0_2", "0_3", "0_4", "0_5"),
  "ID_2" = c("1_1", "1_2", "1_3", "1_4", "1_5"),
  val = c(1, 2, 3, 4, 5)
)
table2 <- data.frame(
  "ID_1" = c("0_2", "0_4", "0_6", "0_7", "0_8", "0_9", "0_10"),
  "ID_2" = c("1_2", "1_4", "1_6", "1_7", "1_8", "1_9", "1_10"),
  val = c(1, 2, 3, 4, 5, 6, 7)
)

A solution using base R:

ids1 <- which(interaction(table1[,c("ID_1", "ID_2")]) %in% 
               interaction(table2[,c("ID_1", "ID_2")]))
ids2 <- which(interaction(table2[,c("ID_1", "ID_2")]) %in%
                interaction(table1[,c("ID_1", "ID_2")]))
overlap1 <- table1[ids1,]
overlap2 <- table2[ids2,]
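
One thing to watch with interaction(): by default it keeps every possible combination of levels, which can get expensive when the ID columns have many distinct values. A possible alternative, sketched here rather than taken from the answer, builds a plain character key with paste(). The separator "|" is an assumption: it must not occur inside the IDs, otherwise distinct pairs could collide. The names key1 and key2 are made up for the example.

```r
# Example data as used elsewhere in this thread
table1 <- data.frame(
  ID_1 = c("0_1", "0_2", "0_3", "0_4", "0_5"),
  ID_2 = c("1_1", "1_2", "1_3", "1_4", "1_5"),
  val = c(1, 2, 3, 4, 5)
)
table2 <- data.frame(
  ID_1 = c("0_2", "0_4", "0_6", "0_7", "0_8", "0_9", "0_10"),
  ID_2 = c("1_2", "1_4", "1_6", "1_7", "1_8", "1_9", "1_10"),
  val = c(1, 2, 3, 4, 5, 6, 7)
)

# One character key per row; assumes "|" never appears in the ID values
key1 <- paste(table1$ID_1, table1$ID_2, sep = "|")
key2 <- paste(table2$ID_1, table2$ID_2, sep = "|")

# Keep the rows whose ID pair also occurs in the other table
overlap1 <- table1[key1 %in% key2, ]
overlap2 <- table2[key2 %in% key1, ]
print(overlap1)
print(overlap2)
```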
