Creating Pairs by Group and Keeping the Group IDs
Question
Below is an example of what I would like to do.
From this data frame:
data<-data.frame(id_group=c("A","A","B","B","B","C","C", "C"),id_entity=c(1,2,2,3,4,2,5,1),nb_members=c(2,2,3,3,3,3,3,3))
id_group id_entity nb_members
A 1 2
A 2 2
B 2 3
B 3 3
B 4 3
C 2 3
C 5 3
C 1 3
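With the following code I manage to obtain the pairwise connections:
library(dplyr)  # provides %>%
m <- crossprod(table(data[-3]))  # entity-by-entity count of shared groups
m[upper.tri(m, diag = TRUE)] <- 0  # zero out the diagonal and the upper triangle
t_data <- as.data.frame.table(m)
t_data <- t_data %>% subset(id_entity.1 != id_entity)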
id_entity id_entity.1 Freq
2 1 2
3 1 0
4 1 0
5 1 1
1 2 0
3 2 1
4 2 1
5 2 1
1 3 0
2 3 0
4 3 1
5 3 0
1 4 0
2 4 0
3 4 0
5 4 0
1 5 0
2 5 0
3 5 0
4 5 0
However, I would also like to keep the IDs of the groups in which each connection occurs:
id_entity id_entity.1 Freq id_group
2 1 2 A,C
3 1 0 NA
4 1 0 NA
5 1 1 C
1 2 0 NA
3 2 1 B
4 2 1 B
5 2 1 C
1 3 0 NA
2 3 0 NA
4 3 1 B
5 3 0 NA
1 4 0 NA
2 4 0 NA
3 4 0 NA
5 4 0 NA
1 5 0 NA
2 5 0 NA
3 5 0 NA
4 5 0 NA
Thank you very much for your help!
Answer 1
Score: 1
I accomplished this by building a table that mirrors the one passed to crossprod(), but holds the group letter wherever the frequency table has a non-zero value. You can then use id_entity and id_entity.1 to look up the matching columns of the letter table and paste together the values that the two columns share. Where the frequency count is zero, replace the result with NA.
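To make the idea concrete, here is a small sketch of the two tables for a single pair (entities 1 and 2). It uses the same toy data as the question; the variable name `shared` and the use of `setdiff()` to drop the empty-string placeholders are illustrative choices for this sketch, not part of the original answer.

```r
# Same toy data as in the question (nb_members is not needed here)
d <- data.frame(
  id_group  = c("A", "A", "B", "B", "B", "C", "C", "C"),
  id_entity = c(1, 2, 2, 3, 4, 2, 5, 1)
)

# Incidence table: groups in rows, entities in columns, cells are 0/1
tab <- table(d)

# Letter table: same shape, holding the group label wherever the cell is 1
tab2 <- apply(tab, 2, function(x) ifelse(x == 1, rownames(tab), ""))

tab2[, "1"]  # groups that contain entity 1
tab2[, "2"]  # groups that contain entity 2

# Groups shared by entities 1 and 2, without the "" placeholders
shared <- setdiff(intersect(tab2[, "1"], tab2[, "2"]), "")
paste(shared, collapse = ",")
```

The columns of `tab2` are named after the entity IDs, which is why the full code can index them with the values of id_entity and id_entity.1.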
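library(dplyr)
d <- data.frame(id_group = c("A", "A", "B", "B", "B", "C", "C", "C"), id_entity = c(1, 2, 2, 3, 4, 2, 5, 1), nb_members = c(2, 2, 3, 3, 3, 3, 3, 3))
# Incidence table: groups in rows, entities in columns
tab <- table(d[-3])
# Letter table: the group label where the entity belongs to the group, "" otherwise
tab2 <- apply(tab, 2, function(x) ifelse(x == 1, rownames(tab), ""))
# Pairwise counts of shared groups; keep only the lower triangle
m <- crossprod(tab)
m[upper.tri(m, diag = TRUE)] <- 0
t_data <- as.data.frame.table(m)
t_data <- t_data %>% subset(id_entity.1 != id_entity)
# x[1] and x[2] are the two entity labels, used as column names of tab2
t_data$pairs <- apply(t_data, 1, function(x) paste(intersect(tab2[, x[1]], tab2[, x[2]]), collapse = ","))
# Drop the leading comma left when the empty string is among the shared values
t_data$pairs <- gsub("^\\,", "", t_data$pairs)
# Pairs with no shared group get NA
t_data$pairs <- ifelse(t_data$Freq == 0, NA, t_data$pairs)
t_data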
#> id_entity id_entity.1 Freq pairs
#> 2 2 1 2 A,C
#> 3 3 1 0 <NA>
#> 4 4 1 0 <NA>
#> 5 5 1 1 C
#> 6 1 2 0 <NA>
#> 8 3 2 1 B
#> 9 4 2 1 B
#> 10 5 2 1 C
#> 11 1 3 0 <NA>
#> 12 2 3 0 <NA>
#> 14 4 3 1 B
#> 15 5 3 0 <NA>
#> 16 1 4 0 <NA>
#> 17 2 4 0 <NA>
#> 18 3 4 0 <NA>
#> 20 5 4 0 <NA>
#> 21 1 5 0 <NA>
#> 22 2 5 0 <NA>
#> 23 3 5 0 <NA>
#> 24 4 5 0 <NA>
Created on 2023-05-17 with reprex v2.0.2
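The letter-table approach relies on each entity appearing at most once per group (the `x == 1` test) and on stripping a leading comma afterwards; if two entities both lacked a group that sits between two shared ones, the empty-string placeholder could leave a stray comma in the middle. As a hedged alternative, not the answerer's method, the same information can be sketched with a base R self-join on id_group. The names `pairs` and `res` are made up for this sketch, and note that it only returns pairs that actually share a group, so the zero-frequency rows are absent.

```r
# Same toy data as in the question (nb_members is not needed here)
d <- data.frame(
  id_group  = c("A", "A", "B", "B", "B", "C", "C", "C"),
  id_entity = c(1, 2, 2, 3, 4, 2, 5, 1)
)

# Self-join on the group: one row per (group, entity, entity.1) combination
pairs <- merge(d, d, by = "id_group", suffixes = c("", ".1"))

# Keep each unordered pair once, oriented like the lower triangle of m
pairs <- pairs[pairs$id_entity > pairs$id_entity.1, ]

# One row per pair: list the distinct groups the two entities share
res <- aggregate(id_group ~ id_entity + id_entity.1, data = pairs,
                 FUN = function(g) paste(sort(unique(g)), collapse = ","))

# Number of shared groups, derived from the collapsed label string
res$Freq <- lengths(strsplit(res$id_group, ","))
res
```

If the zero-frequency rows are needed as well, one option is to left-join `res` onto the full grid of entity pairs (for example the t_data frame built above) and fill the missing Freq values with 0.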