Combine matrix row / column names in R
Question
I have multiple matrices reflecting bipartite / affiliation networks at different time points. These matrices have a lot of overlap in their incumbents, but also a lot of differences. For further analysis, however, I need them to be the same dimensions and have the same actors per row/column, so I need to combine row and column names somehow.
The final matrices will be around 8000 by 200, but each individual matrix is around 2000 by 150. Here is an example of two matrices and what I want the result to look like:
adj1 <- matrix(0, 3, 5)
colnames(adj1) <- c("g1", "g2", "g3", "g5", "g6")
rownames(adj1) <- c("Tim", "John", "Sarah")
adj2 <- matrix(0, 4, 2)
colnames(adj2) <- c("g1", "g4")
rownames(adj2) <- c("Tim", "Mary", "John", "Paolo")
combined_adj <- matrix(0,5,6)
colnames(combined_adj) <- c("g1","g2","g3","g4","g5","g6")
rownames(combined_adj) <- c("John","Mary","Paolo","Sarah","Tim")
Ideally, the new cells should read "NA" or "10", and rows and columns would be ordered alphabetically. The initial values in each matrix need to be kept. I am at a loss as to what to do here and would appreciate any help!
Answer 1
Score: 3
You can use `merge` and specify that you want to use `row.names` for merging as well.
combined_adj <- merge(
  x = adj1,
  y = adj2,
  by = c("row.names", intersect(colnames(adj1), colnames(adj2))),
  all = TRUE
)
combined_adj
  Row.names g1 g2 g3 g5 g6 g4
1      John  0  0  0  0  0  0
2      Mary  0 NA NA NA NA  0
3     Paolo  0 NA NA NA NA  0
4     Sarah  0  0  0  0  0 NA
5       Tim  0  0  0  0  0  0
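This turns it into a data.frame, so you will need to convert it back to a matrix if required: move the `Row.names` column into the row names, drop it, and call `as.matrix()`.
row.names(combined_adj) <- combined_adj[, 1]   # use the key column as row names
combined_adj <- combined_adj[, -1]             # drop the key column
combined_adj <- as.matrix(combined_adj)        # back to a numeric matrix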
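The question also asks for alphabetically ordered rows and columns, and allows either `NA` or `10` in the newly created cells. The following is a minimal sketch of that last step, built on the same `merge` call. The names `merged`, `combined_mat`, and `fill` are illustrative assumptions and not part of the original answer.

```r
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))

# Full outer join on the row names plus every shared column
merged <- merge(adj1, adj2,
                by = c("row.names", intersect(colnames(adj1), colnames(adj2))),
                all = TRUE)

# Move the key column into the row names and convert to a numeric matrix
combined_mat <- as.matrix(merged[, -1])
rownames(combined_mat) <- as.character(merged[, 1])

# Order rows and columns alphabetically
combined_mat <- combined_mat[sort(rownames(combined_mat)),
                             sort(colnames(combined_mat))]

# Optional (assumption): mark cells that were absent from an input with 10
fill <- 10
combined_mat[is.na(combined_mat)] <- fill
```

Skip the last two lines to keep `NA` as the marker for cells that did not exist in one of the inputs.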
Edit: Merge multiple matrices
We use `Reduce` to apply it over all matrices. First, however, we convert each matrix to a data.frame and create a `row_names` column to simplify things.
# create sample data
adj1 <- matrix(
  0, 3, 5,
  dimnames = list(c("Tim", "John", "Sarah"),
                  c("g1", "g2", "g3", "g5", "g6"))
)
adj2 <- matrix(
  0, 4, 2,
  dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                  c("g1", "g4"))
)
adj3 <- matrix(
  0, 3, 3,
  dimnames = list(c("Tim2", "Mary2", "John"), c("g1", "g4", "g7"))
)
# create a list
list_matrices <- list(adj1, adj2, adj3)
# convert to data.frames and create a column with the row names
list_matrices <- lapply(list_matrices, function(mat) {
  mat <- as.data.frame(mat)
  mat$row_names <- row.names(mat)
  mat
})
# successively combine them: merge 1 and 2, then merge the result with 3, and so on
res <- Reduce(function(mat1, mat2) merge(mat1, mat2, all = TRUE), x = list_matrices)
res
  g1 row_names g4 g2 g3 g5 g6 g7
1  0      John  0  0  0  0  0  0
2  0      Mary  0 NA NA NA NA NA
3  0     Mary2  0 NA NA NA NA  0
4  0     Paolo  0 NA NA NA NA NA
5  0     Sarah NA  0  0  0  0 NA
6  0       Tim  0  0  0  0  0 NA
7  0      Tim2  0 NA NA NA NA  0
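One caveat, as far as I can tell: `merge()` matches on every shared column, so an actor whose value for a shared group differs between two tables (including `NA` in one and a number in the other) ends up on two rows. If the goal is, as the question describes, to keep each time point as its own matrix while giving all of them identical dimensions, one base R sketch is to build the union of the names and copy each matrix into a blank template. The helper name `pad_to_union` and its `fill` argument are hypothetical and not from the answer above.

```r
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))
mats <- list(adj1, adj2)

# Alphabetically sorted union of all actors and all groups
all_rows <- sort(unique(unlist(lapply(mats, rownames))))
all_cols <- sort(unique(unlist(lapply(mats, colnames))))

# Hypothetical helper: copy one matrix into a template filled with `fill`
pad_to_union <- function(mat, rows, cols, fill = NA_real_) {
  out <- matrix(fill, nrow = length(rows), ncol = length(cols),
                dimnames = list(rows, cols))
  out[rownames(mat), colnames(mat)] <- mat  # index by name, keeps original values
  out
}

padded <- lapply(mats, pad_to_union, rows = all_rows, cols = all_cols)
```

Each element of `padded` has the same row and column names in the same order. Pass `fill = 10` to use `10` instead of `NA` for the new cells. Because this only allocates one template per matrix and assigns by name, it should remain cheap at the sizes mentioned in the question.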
Answer 2
Score: 1
This could be one solution. However, I am assuming that, where a value exists, it is always the same for a given combination of row name and column name. It also relies on the tidyverse (`dplyr`, `tidyr`, and `tibble`):
library(tidyverse)

list_adj <- list(adj1, adj2)

df.adj <- NULL
for (adj in list_adj) {
  # turn the matrix into a tibble, keeping the row names as a column
  df.adj.temp <- adj %>% as_tibble(rownames = "row_names")
  if (is.null(df.adj)) {
    df.adj <- df.adj.temp
  } else {
    # join on row_names plus every column the two tables share
    c.colnames.join.by <- intersect(colnames(df.adj), colnames(df.adj.temp))
    df.adj <- df.adj %>%
      full_join(df.adj.temp, by = c.colnames.join.by) %>%
      mutate(across(.cols = -row_names, .fns = \(x) replace_na(x, 10)))
  }
}

df.adj %>%
  arrange(row_names) %>%                      # ordering rows
  select(all_of(sort(colnames(df.adj)))) %>%  # ordering columns
  column_to_rownames(var = "row_names") %>%
  as.matrix()
# output
      g1 g2 g3 g4 g5 g6
John   0  0  0  0  0  0
Mary   0 10 10  0 10 10
Paolo  0 10 10  0 10 10
Sarah  0  0  0 10  0  0
Tim    0  0  0  0  0  0
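The assumption stated in this answer, that a given row name and column name pair never carries two different values, can be checked before joining. Below is a base R sketch under that reading. The function name `find_conflicts` is hypothetical, and `adj2b` is a made-up variant of `adj2` used only to trigger a conflict.

```r
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))

# Hypothetical helper: return every (row, column) pair that has more than
# one distinct value across the supplied matrices
find_conflicts <- function(mats) {
  long <- do.call(rbind, lapply(mats, function(m) {
    # long format with columns Var1 (row name), Var2 (column name), Freq (value)
    as.data.frame(as.table(m), stringsAsFactors = FALSE)
  }))
  long <- unique(long)  # identical cells collapse to a single row
  keys <- long[c("Var1", "Var2")]
  long[duplicated(keys) | duplicated(keys, fromLast = TRUE), ]
}

no_conflicts <- find_conflicts(list(adj1, adj2))

# Made-up example: Tim/g1 is 0 in adj1 but 1 in adj2b
adj2b <- adj2
adj2b["Tim", "g1"] <- 1
conflicts <- find_conflicts(list(adj1, adj2b))
```

An empty result means the matrices agree wherever they overlap. Any rows returned are the cells that would need a rule for resolving the disagreement before combining.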
# Answer 3
**Score**: 0
Here is a base R option with `reshape`:
df <- unique(
  rbind(
    as.data.frame(as.table(adj1)),
    as.data.frame(as.table(adj2))
  )
)
reshape(
  df,
  direction = "wide",
  idvar = "Var1",
  timevar = "Var2"
)
which gives
    Var1 Freq.g1 Freq.g2 Freq.g3 Freq.g5 Freq.g6 Freq.g4
1    Tim       0       0       0       0       0       0
2   John       0       0       0       0       0       0
3  Sarah       0       0       0       0       0      NA
17  Mary       0      NA      NA      NA      NA       0
19 Paolo       0      NA      NA      NA      NA       0
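The `reshape` result is a data.frame with a `Freq.` prefix on every column and rows in order of first appearance. The following sketch shows one way to turn it into the alphabetically ordered matrix the question asks for. The variable names `wide` and `mat` are my own choices, not part of the answer.

```r
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))

df <- unique(rbind(as.data.frame(as.table(adj1)),
                   as.data.frame(as.table(adj2))))
wide <- reshape(df, direction = "wide", idvar = "Var1", timevar = "Var2")

# Drop the id column, then restore plain row and column names
mat <- as.matrix(wide[-1])
rownames(mat) <- as.character(wide$Var1)
colnames(mat) <- sub("^Freq\\.", "", colnames(mat))

# Order rows and columns alphabetically
mat <- mat[sort(rownames(mat)), sort(colnames(mat))]
```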
Or, we use `xtabs`:
mat <- `class<-`(xtabs(Freq ~ ., df) * NA, "matrix")
mat[as.matrix(df[-3])] <- df$Freq
which gives
> mat
       Var2
Var1    g1 g2 g3 g5 g6 g4
  Tim    0  0  0  0  0  0
  John   0  0  0  0  0  0
  Sarah  0  0  0  0  0 NA
  Mary   0 NA NA NA NA  0
  Paolo  0 NA NA NA NA  0
attr(,"call")
xtabs(formula = Freq ~ ., data = df)
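The printed result still carries the `call` attribute left over from `xtabs`. A variation that sidesteps this, sketched here as an assumption rather than as the answerer's method, allocates a plain `NA` matrix from the sorted names and fills it with the same two-column name index. `rows`, `cols`, and `idx` are illustrative names.

```r
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))

df <- unique(rbind(as.data.frame(as.table(adj1)),
                   as.data.frame(as.table(adj2))))

# Sorted union of row and column names taken from the long table
rows <- sort(unique(as.character(df$Var1)))
cols <- sort(unique(as.character(df$Var2)))

# Plain NA template, so no table class or call attribute is involved
mat <- matrix(NA_real_, length(rows), length(cols),
              dimnames = list(rows, cols))

# A two-column character matrix indexes cells by (row name, column name)
idx <- cbind(as.character(df$Var1), as.character(df$Var2))
mat[idx] <- df$Freq
```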