How to combine a list of data frames into a single data frame using R?

Question

I have a list of data frames:

A1 = data.frame(name = c("a1", "a3", "a5"), cor = c(1, 0.99, 0.93))
A2 = data.frame(name = c("a2", "a3", "a4"), cor = c(1, 0.94, 0.94))
A3 = data.frame(name = c("a3", "a1", "a2", "a6"), cor = c(1, 0.99, 0.94, 0.91))
myList = list(A1, A2, A3)

Each data frame holds the calculated correlation coefficients (CC) between one variable and several others.

For instance:

In A1, the CC between a1 and a1 is 1, between a1 and a3 is 0.99, and between a1 and a5 is 0.93.

In A2, the CC between a2 and a2 is 1, between a2 and a3 is 0.94, and between a2 and a4 is 0.94.

I want to combine these individual data frames into a single complete one, like the following:

corMatrix
     a1   a2   a3   a4   a5   a6
a1 1.00 0.00 0.99 0.00 0.93 0.00
a2 0.00 1.00 0.94 0.94 0.00 0.00
a3 0.99 0.94 1.00 0.00 0.00 0.91
a4 0.00 0.94 0.00 1.00 0.00 0.00
a5 0.93 0.00 0.00 0.00 1.00 0.00
a6 0.00 0.00 0.91 0.00 0.00 1.00

This corMatrix data frame contains all the correlation information from the data frames above. If the correlation between two variables is unknown, 0 is used as their CC value, as for variables a1 and a2.

How can I do this?

Thanks a lot.

Answer 1

Score: 2

I believe this does what you're looking for, although it may not be the best way of doing this:

A1 = data.frame(name = c("a1", "a3", "a5"), cor = c(1, 0.99, 0.93))
A2 = data.frame(name = c("a2", "a3", "a4"), cor = c(1, 0.94, 0.94))
A3 = data.frame(name = c("a3", "a1", "a2", "a6"), cor = c(1, 0.99, 0.94, 0.91))
myList = list(A1, A2, A3)

names(myList) = c("a1", "a2", "a3")
myMatrix = dplyr::bind_rows(myList, .id = "name2") |>
  dplyr::mutate(name2 = factor(name2, levels = c("a1", "a2", "a3", "a4", "a5", "a6")),
                name = factor(name, levels = c("a1", "a2", "a3", "a4", "a5", "a6"))) |>
  tidyr::complete(name2, name, fill = list(cor = 0)) |>
  tidyr::pivot_wider(names_from = name2, values_from = cor) |>
  tibble::column_to_rownames("name") |>
  as.matrix()
diag(myMatrix) <- 1
myMatrix[upper.tri(myMatrix)] <- t(myMatrix)[upper.tri(myMatrix)]

which returns:

     a1   a2   a3   a4   a5   a6
a1 1.00 0.00 0.99 0.00 0.93 0.00
a2 0.00 1.00 0.94 0.94 0.00 0.00
a3 0.99 0.94 1.00 0.00 0.00 0.91
a4 0.00 0.94 0.00 1.00 0.00 0.00
a5 0.93 0.00 0.00 0.00 1.00 0.00
a6 0.00 0.00 0.91 0.00 0.00 1.00
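
One caveat about the last line of the code, offered as a hedged aside rather than a correction of the answer: it copies the lower triangle over the upper one. That works for this data because every known pair shows up in the lower triangle, but a pair recorded only in the upper triangle would be overwritten with 0. A minimal sketch on a hypothetical 2 x 2 toy matrix (the names m, lost and merged are illustrative, not from the answer), assuming 0 always means "unknown":

```r
# Toy matrix: the only known value for the (a1, a2) pair sits in the upper triangle.
m <- matrix(0, nrow = 2, ncol = 2,
            dimnames = list(c("a1", "a2"), c("a1", "a2")))
m["a1", "a2"] <- 0.5

# Copying the lower triangle over the upper one discards that value.
lost <- m
lost[upper.tri(lost)] <- t(lost)[upper.tri(lost)]

# Alternative: wherever an entry is 0 (treated as unknown), take the mirrored entry.
merged <- ifelse(m == 0, t(m), m)
```

Here lost["a1", "a2"] ends up as 0, while merged keeps 0.5 on both sides of the diagonal.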

The general idea is that you:

  • name the list so you know which variable each set of correlations belongs to (for a longer list this could be done programmatically with paste())
  • combine all the list elements into a single data frame
  • use factors to spell out all possible variable names (again, this could be done programmatically if required)
  • use complete() to add 0 for the missing combinations
  • switch to a matrix, set the diagonal to 1, and make it symmetric across the diagonal
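
The steps above can also be sketched programmatically in base R, without typing the variable names by hand. This is a minimal sketch, not the answer's method: it assumes the first row of each data frame is the variable's correlation with itself (true for the example data), and the names vars and corMatrix are chosen here for illustration.

```r
# Example input from the question.
A1 <- data.frame(name = c("a1", "a3", "a5"), cor = c(1, 0.99, 0.93))
A2 <- data.frame(name = c("a2", "a3", "a4"), cor = c(1, 0.94, 0.94))
A3 <- data.frame(name = c("a3", "a1", "a2", "a6"), cor = c(1, 0.99, 0.94, 0.91))
myList <- list(A1, A2, A3)

# Assumption: the first row of each data frame names the variable it belongs to.
names(myList) <- vapply(myList, function(d) as.character(d$name[1]), character(1))

# Every variable that appears anywhere in the list, in sorted order.
vars <- sort(unique(unlist(lapply(myList, function(d) as.character(d$name)))))

# Start from all zeros (unknown), then fill each known pair in both directions,
# so the result is symmetric by construction.
corMatrix <- matrix(0, nrow = length(vars), ncol = length(vars),
                    dimnames = list(vars, vars))
for (v in names(myList)) {
  d <- myList[[v]]
  partners <- as.character(d$name)
  corMatrix[v, partners] <- d$cor
  corMatrix[partners, v] <- d$cor
}

# A variable's correlation with itself is 1, even if it was never listed.
diag(corMatrix) <- 1
corMatrix
```

Filling both corMatrix[v, partners] and corMatrix[partners, v] avoids a separate symmetrization step, at the cost of silently overwriting if two data frames disagree about the same pair.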

Answer 2

Score: 2

In base R you could do:

a <- do.call(rbind, Map(cbind, name1 = c('a1', 'a2', 'a3'), myList))
b <- unique(rbind(a, setNames(a[c(2, 1, 3)], names(a))))
xtabs(cor~., b)

    name
name1   a1   a2   a3   a4   a5   a6
   a1 1.00 0.00 0.99 0.00 0.93 0.00
   a2 0.00 1.00 0.94 0.94 0.00 0.00
   a3 0.99 0.94 1.00 0.00 0.00 0.91
   a4 0.00 0.94 0.00 0.00 0.00 0.00
   a5 0.93 0.00 0.00 0.00 0.00 0.00
   a6 0.00 0.00 0.91 0.00 0.00 0.00
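
In the table above the diagonal entries for a4, a5 and a6 are 0, because no data frame lists those variables against themselves, whereas the desired corMatrix has 1 there. A minimal sketch of one way to close that gap, assuming a plain numeric matrix is acceptable as the result (the names tab and m are illustrative):

```r
# Example input from the question.
A1 <- data.frame(name = c("a1", "a3", "a5"), cor = c(1, 0.99, 0.93))
A2 <- data.frame(name = c("a2", "a3", "a4"), cor = c(1, 0.94, 0.94))
A3 <- data.frame(name = c("a3", "a1", "a2", "a6"), cor = c(1, 0.99, 0.94, 0.91))
myList <- list(A1, A2, A3)

# Same steps as the answer: tag each data frame with its variable,
# add the mirrored pairs, drop duplicates, and cross-tabulate.
a <- do.call(rbind, Map(cbind, name1 = c("a1", "a2", "a3"), myList))
b <- unique(rbind(a, setNames(a[c(2, 1, 3)], names(a))))
tab <- xtabs(cor ~ ., b)

# Convert the xtabs table to a plain matrix, then set every self-correlation to 1.
m <- matrix(tab, nrow = nrow(tab), dimnames = unname(dimnames(tab)))
diag(m) <- 1
m
```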

huangapple
  • Posted on April 4, 2023, 04:38:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75923595.html