2023-05-17 18:31:41

Creating Pairs by Group and Keeping the Group IDs

Question

Below is an example of what I would like:

From this data frame:

data <- data.frame(id_group = c("A", "A", "B", "B", "B", "C", "C", "C"), id_entity = c(1, 2, 2, 3, 4, 2, 5, 1), nb_members = c(2, 2, 3, 3, 3, 3, 3, 3))


  id_group id_entity nb_members
        A         1          2
        A         2          2
        B         2          3
        B         3          3
        B         4          3
        C         2          3
        C         5          3
        C         1          3

With the following code, I manage to obtain the pairwise connections:

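library(dplyr)  # provides the %>% pipe
m <- crossprod(table(data[-3]))  # entity-by-entity counts of shared groups
m[upper.tri(m, diag = TRUE)] <- 0  # zero the diagonal and upper triangle so each pair is counted once
t_data <- as.data.frame.table(m)
t_data <- t_data %>% subset(id_entity.1 != id_entity)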

id_entity id_entity.1 Freq
         2           1    2
         3           1    0
         4           1    0
         5           1    1
         1           2    0
         3           2    1
         4           2    1
         5           2    1
         1           3    0
         2           3    0
         4           3    1
         5           3    0
         1           4    0
         2           4    0
         3           4    0
         5           4    0
         1           5    0
         2           5    0
         3           5    0
         4           5    0
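
For context, Freq is the number of groups that each pair of entities shares: table() builds a group-by-entity incidence table, and crossprod() multiplies its transpose by itself. A minimal sketch of that step on the same data follows; the names inc and co are only illustrative and do not appear in the code above.

```r
data <- data.frame(
  id_group  = c("A", "A", "B", "B", "B", "C", "C", "C"),
  id_entity = c(1, 2, 2, 3, 4, 2, 5, 1)
)

# Incidence table: one row per group, one column per entity (1 = member).
inc <- table(data)

# t(inc) %*% inc: cell [i, j] counts the groups that contain both i and j.
co <- crossprod(inc)

co["2", "1"]  # entities 1 and 2 share two groups (A and C)
co["3", "1"]  # entities 1 and 3 share no group
```

The diagonal of co holds each entity's own group count, which is why the code above zeroes it out together with the upper triangle.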

However, I would like to keep the information about the group IDs in which the connections are made:

id_entity id_entity.1 Freq   id_group
         2           1    2    A,C
         3           1    0    NA
         4           1    0    NA
         5           1    1    C
         1           2    0    NA
         3           2    1    B
         4           2    1    B
         5           2    1    C
         1           3    0    NA
         2           3    0    NA
         4           3    1    B
         5           3    0    NA
         1           4    0    NA
         2           4    0    NA
         3           4    0    NA
         5           4    0    NA
         1           5    0    NA
         2           5    0    NA
         3           5    0    NA
         4           5    0    NA

Thank you very much for your help!

Answer 1

Score: 1

I accomplished this by making a table that mirrors the table used in crossprod(), but that has the group letters where there are non-zero values in the table of frequencies. Then, you can use the values of id_entity and id_entity.1 to find the appropriate columns of the letter table, and paste together the values that those two columns have in common. Finally, you can replace the letter values with NA when the frequency count is zero.

library(dplyr)
d <- data.frame(id_group = c("A", "A", "B", "B", "B", "C", "C", "C"), id_entity = c(1, 2, 2, 3, 4, 2, 5, 1), nb_members = c(2, 2, 3, 3, 3, 3, 3, 3))

tab <- table(d[-3])  # group-by-entity membership counts
tab2 <- apply(tab, 2, function(x) ifelse(x == 1, rownames(tab), ""))  # same shape, group letter where the entity is a member
m <- crossprod(table(d[-3]))
m[upper.tri(m, diag = TRUE)] <- 0
t_data <- as.data.frame.table(m)
t_data <- t_data %>% subset(id_entity.1 != id_entity)

# For each pair, intersect the two entities' letter columns and paste the shared letters
t_data$pairs <- apply(t_data, 1, function(x) paste(intersect(tab2[, x[1]], tab2[, x[2]]), collapse = ","))
t_data$pairs <- gsub("^\\,", "", t_data$pairs)  # drop the leading comma left by a shared ""
t_data$pairs <- ifelse(t_data$Freq == 0, NA, t_data$pairs)
t_data
#>    id_entity id_entity.1 Freq pairs
#> 2          2           1    2   A,C
#> 3          3           1    0  <NA>
#> 4          4           1    0  <NA>
#> 5          5           1    1     C
#> 6          1           2    0  <NA>
#> 8          3           2    1     B
#> 9          4           2    1     B
#> 10         5           2    1     C
#> 11         1           3    0  <NA>
#> 12         2           3    0  <NA>
#> 14         4           3    1     B
#> 15         5           3    0  <NA>
#> 16         1           4    0  <NA>
#> 17         2           4    0  <NA>
#> 18         3           4    0  <NA>
#> 20         5           4    0  <NA>
#> 21         1           5    0  <NA>
#> 22         2           5    0  <NA>
#> 23         3           5    0  <NA>
#> 24         4           5    0  <NA>

Created on 2023-05-17 with reprex v2.0.2
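
One caveat with the letter-table lookup: because absent memberships are stored as "", two entities that share groups on either side of a group neither belongs to (say A and C but not B) would produce "A,,C", which the leading-comma gsub() does not clean up. The sample data has no such pair, so the output above is unaffected. A possible variant, offered only as a sketch, stores the actual group labels per entity and intersects those instead. The names groups_by_entity and shared_groups are hypothetical helpers introduced here, not part of the original answer.

```r
d <- data.frame(
  id_group  = c("A", "A", "B", "B", "B", "C", "C", "C"),
  id_entity = c(1, 2, 2, 3, 4, 2, 5, 1)
)

# Pair table built as in the answer, using base subsetting instead of the pipe.
m <- crossprod(table(d))
m[upper.tri(m, diag = TRUE)] <- 0
t_data <- as.data.frame.table(m)
t_data <- t_data[t_data$id_entity != t_data$id_entity.1, ]

# Hypothetical lookup: a character vector of group labels for each entity.
groups_by_entity <- split(as.character(d$id_group), d$id_entity)

# Hypothetical helper: comma-separated groups shared by a and b, NA if none.
shared_groups <- function(a, b) {
  common <- intersect(groups_by_entity[[a]], groups_by_entity[[b]])
  if (length(common) == 0) NA_character_ else paste(sort(common), collapse = ",")
}

t_data$id_group <- mapply(
  shared_groups,
  as.character(t_data$id_entity),
  as.character(t_data$id_entity.1),
  USE.NAMES = FALSE
)

# Rows whose count was zeroed out (upper triangle) get NA, as in the answer.
t_data$id_group[t_data$Freq == 0] <- NA
t_data
```

Since no empty placeholders are involved, no comma clean-up is needed, and an entity that appears more than once in the same group would not break the x == 1 test used above.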
