May 6, 2023, 18:11:35

Combine matrix row / column names in R

Question

I have multiple matrices reflecting bipartite / affiliation networks at different time points. These matrices have a lot of overlap in their incumbents, but also a lot of differences. For further analysis, however, I need them to be the same dimensions and have the same actors per row/column, so I need to combine row and column names somehow.

The final matrices will be around 8000 by 200, but each individual matrix is around 2000 by 150. Here is an example of two matrices and what I want the result to look like:

adj1 <- matrix(0, 3, 5)
colnames(adj1) <- c("g1", "g2", "g3", "g5", "g6")
rownames(adj1) <- c("Tim", "John", "Sarah")
adj2 <- matrix(0, 4, 2)
colnames(adj2) <- c("g1", "g4")
rownames(adj2) <- c("Tim", "Mary", "John", "Paolo")
combined_adj <- matrix(0,5,6)
colnames(combined_adj) <- c("g1","g2","g3","g4","g5","g6")
rownames(combined_adj) <- c("John","Mary","Paolo","Sarah","Tim")

Ideally, the new cells should read "NA" or "10", and rows and columns should be ordered alphabetically. The initial values in each matrix need to be kept. I am at a loss as to what to do here and would appreciate any help!

Answer 1

Score: 3

You can use merge and specify that you want to use row.names for merging as well.

combined_adj <- merge(x = adj1,
      y = adj2,
      by = c('row.names', 
             intersect(colnames(adj1), 
                       colnames(adj2))
             ), 
      all = TRUE
)
combined_adj
  Row.names g1 g2 g3 g5 g6 g4
1      John  0  0  0  0  0  0
2      Mary  0 NA NA NA NA  0
3     Paolo  0 NA NA NA NA  0
4     Sarah  0  0  0  0  0 NA
5       Tim  0  0  0  0  0  0

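This turns it into a data.frame, so you will need to convert it back to a matrix if required.

# move the merged names back into the row names and drop that column
row.names(combined_adj) <- combined_adj[,1]
combined_adj <- combined_adj[,-1]
# convert the all-numeric data.frame back to a matrix
combined_adj <- as.matrix(combined_adj)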

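Edit: Merge multiple matrices

We use Reduce to apply it over all matrices. First, however, we convert each matrix to a data.frame and create a column with the row names to simplify things.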

# create sample data
adj1 <- matrix(
  0, 3, 5,
  dimnames = list(c("Tim", "John", "Sarah"), 
                  c("g1", "g2", "g3", "g5", "g6"))
)
adj2 <- matrix(
  0, 4, 2, 
  dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                  c("g1", "g4"))
)
adj3 <- matrix(
  0, 3, 3, 
  dimnames = list(c("Tim2", "Mary2", "John"), c("g1", "g4", 'g7'))
)
# create a list
list_matrices <- list(adj1, adj2, adj3)
# convert to data frames and create a column with row.names
list_matrices <- lapply(list_matrices, function(mat){
  mat <- as.data.frame(mat)
  mat$row_names <- row.names(mat)
  mat
})
# successively combine them: merge 1 and 2, then merge the result with 3, and so on
res <- Reduce(function(mat1, mat2) merge(mat1, mat2, all = TRUE), x = list_matrices)
res
  g1 row_names g4 g2 g3 g5 g6 g7
1  0      John  0  0  0  0  0  0
2  0      Mary  0 NA NA NA NA NA
3  0     Mary2  0 NA NA NA NA  0
4  0     Paolo  0 NA NA NA NA NA
5  0     Sarah NA  0  0  0  0 NA
6  0       Tim  0  0  0  0  0 NA
7  0      Tim2  0 NA NA NA NA  0
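
The merged result above is still a data.frame, with the names in a row_names column and the columns in merge order. As a minimal sketch of getting from there to the alphabetically ordered matrix the question asks for, the hypothetical helper `merged_to_matrix` below (not part of the original answer) moves the names back into the dimnames and sorts both dimensions. It assumes every column other than row_names is numeric. Note also that `merge()` joins on all shared columns here, so it only yields one row per name when overlapping cells hold the same value in every matrix.

```r
# Rebuild the three example matrices from the answer above.
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))
adj3 <- matrix(0, 3, 3, dimnames = list(c("Tim2", "Mary2", "John"),
                                        c("g1", "g4", "g7")))

# Same pipeline as above: data frames with an explicit row_names column,
# merged one after another.
dfs <- lapply(list(adj1, adj2, adj3), function(mat) {
  df <- as.data.frame(mat)
  df$row_names <- row.names(df)
  df
})
res <- Reduce(function(a, b) merge(a, b, all = TRUE), dfs)

# Hypothetical helper (an assumption for illustration): turn the merged
# data.frame into a matrix with alphabetically sorted rows and columns.
# `fill` optionally replaces the NA cells, e.g. fill = 10.
merged_to_matrix <- function(df, id_col = "row_names", fill = NA) {
  value_cols <- sort(setdiff(names(df), id_col))   # sorted column names
  ord <- order(df[[id_col]])                       # sorted row order
  out <- as.matrix(df[ord, value_cols, drop = FALSE])
  rownames(out) <- df[[id_col]][ord]
  out[is.na(out)] <- fill
  out
}

combined <- merged_to_matrix(res)
dim(combined)
```

Calling `merged_to_matrix(res, fill = 10)` instead would give the "10" variant mentioned in the question.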

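Answer 2

Score: 1

This could be one solution. However, I am assuming that the information that does exist in these cells is always the same for the same combination of row name and column name. In addition, it relies on the tidyverse (dplyr, plus tidyr and tibble for replace_na and column_to_rownames):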

require(tidyverse)
list_adj <- list(
  adj1, adj2
)
df.adj <- NULL
for (adj in list_adj) {
  df.adj.temp <- adj %>% as_tibble(rownames = "row_names")
  
  if (is.null(df.adj)) {
    df.adj <- df.adj.temp
  } else {
    c.colnames.join.by <- c(intersect(colnames(df.adj), colnames(df.adj.temp)))
    
    df.adj <- df.adj %>% 
      full_join(df.adj.temp, by = c.colnames.join.by) %>%
      mutate(across(.cols = - row_names, .fns = \(x) replace_na(x, 10)))
  }
}
df.adj %>% 
  arrange(row_names) %>% # ordering rows
  select(all_of(sort(colnames(df.adj)))) %>% # ordering columns
  column_to_rownames(var = "row_names") %>% 
  as.matrix()

Output

      g1 g2 g3 g5 g6 g4
John   0  0  0  0  0  0
Mary   0 10 10 10 10  0
Paolo  0 10 10 10 10  0
Sarah  0  0  0  0  0 10
Tim    0  0  0  0  0  0
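
The assumption stated in this answer matters because the join uses every shared column as a key: if the same row/column combination held different values in two matrices, those rows would not match and the name would appear twice. As a rough sketch of how one might check this before joining, the hypothetical base R helper `find_conflicts` below (not part of the original answer) lists the shared cells whose values differ. The small matrices `m1` and `m2` are made up for illustration.

```r
# Hypothetical helper (an assumption for illustration): list the cells that
# two named matrices have in common but where their values disagree.
find_conflicts <- function(m1, m2) {
  rows <- intersect(rownames(m1), rownames(m2))
  cols <- intersect(colnames(m1), colnames(m2))
  a <- m1[rows, cols, drop = FALSE]
  b <- m2[rows, cols, drop = FALSE]
  # A cell conflicts when both values are present and unequal.
  differs <- !is.na(a) & !is.na(b) & a != b
  idx <- which(differs, arr.ind = TRUE)
  data.frame(
    row = rows[idx[, 1]],
    col = cols[idx[, 2]],
    value1 = a[idx],
    value2 = b[idx],
    stringsAsFactors = FALSE
  )
}

# Made-up example: "Tim" / "g1" is 0 in m1 but 1 in m2.
m1 <- matrix(c(1, 0, 0, 1), 2, 2,
             dimnames = list(c("John", "Tim"), c("g1", "g2")))
m2 <- matrix(c(1, 0, 0, 1), 2, 2,
             dimnames = list(c("Tim", "Mary"), c("g1", "g4")))
conflicts <- find_conflicts(m1, m2)
conflicts
```

An empty result means the two matrices agree wherever they overlap, which is the situation the join above relies on.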


Answer 3

Score: 0

Here is a base R option with reshape:

df <- unique(
    rbind(
        as.data.frame(as.table(adj1)),
        as.data.frame(as.table(adj2))
    )
)
reshape(
    df,
    direction = "wide",
    idvar = "Var1",
    timevar = "Var2"
)

which gives

    Var1 Freq.g1 Freq.g2 Freq.g3 Freq.g5 Freq.g6 Freq.g4
1    Tim       0       0       0       0       0       0
2   John       0       0       0       0       0       0
3  Sarah       0       0       0       0       0      NA
17  Mary       0      NA      NA      NA      NA       0
19 Paolo       0      NA      NA      NA      NA       0

Or, we use xtabs:

mat <- `class<-`(xtabs(Freq ~ ., df) * NA, "matrix")
mat[as.matrix(df[-3])] <- df$Freq

which gives

> mat
       Var2
Var1    g1 g2 g3 g5 g6 g4
  Tim    0  0  0  0  0  0
  John   0  0  0  0  0  0
  Sarah  0  0  0  0  0 NA
  Mary   0 NA NA NA NA  0
  Paolo  0 NA NA NA NA  0
attr(,"call")
xtabs(formula = Freq ~ ., data = df)
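
The xtabs variant works because a matrix can be indexed by row and column names. The same idea can be sketched without the long-format data frame: allocate one matrix over the sorted union of all names and copy each input into it by name. The helper `combine_by_names` below is a hypothetical illustration, not the answer author's code. It assumes every input has unique row and column names, and where inputs overlap, later matrices overwrite earlier ones.

```r
# Hypothetical helper (an assumption for illustration): combine any number of
# named matrices into one matrix over the union of their row/column names.
combine_by_names <- function(mats, fill = NA_real_) {
  all_rows <- sort(unique(unlist(lapply(mats, rownames))))
  all_cols <- sort(unique(unlist(lapply(mats, colnames))))
  out <- matrix(fill, length(all_rows), length(all_cols),
                dimnames = list(all_rows, all_cols))
  for (m in mats) {
    # Character indexing places each block at the matching rows and columns.
    out[rownames(m), colnames(m)] <- m
  }
  out
}

# The two example matrices from the question.
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))
adj1["Tim", "g2"] <- 1  # one non-zero cell, to see that values are kept

combined_adj <- combine_by_names(list(adj1, adj2))
combined_adj
```

Rows and columns come out in alphabetical order, cells present in no input stay NA, and passing `fill = 10` would give the "10" variant from the question. Because it only allocates one matrix and assigns blocks into it, this should scale comfortably to the sizes mentioned in the question.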
