Calculate the mean of a specific vector in each element of a list in R and convert the result to a data.frame

Question

I have a very large list with more than 2000 elements. Each element is a data frame with two columns (x and y), and the number of rows differs from one element to the next.

Example:

new_list <- list(data.frame(x = c(1, 2, 3),
                            y = c(3, 4, 5)),
                 data.frame(x = c(3, 2, 2, 2, 3, 8),
                            y = c(5, 2, 3, 5, 6, 7)),
                 data.frame(x = c(3, 2, 2, 1, 1),
                            y = c(5, 2, 3, 3, 2)))

I would like to average only the x vectors in this list to get something like this:

df_mean <- data.frame(x = c(2, 3.333, 1.8))

Answer 1

Score: 5

You can use sapply to compute colMeans over the columns of each data frame whose names contain "x":

data.frame(x = sapply(new_list, \(x) colMeans(x[grepl('x', names(x))])))
#>          x
#> 1 2.000000
#> 2 3.333333
#> 3 1.800000

@nicola suggested a better option like this (thanks!):

data.frame(x = sapply(new_list, \(x) mean(x$x)))
#>          x
#> 1 2.000000
#> 2 3.333333
#> 3 1.800000

Created on 2023-02-06 with reprex v2.0.2
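
If it helps, here is a minimal base R sketch of the same idea using vapply instead of sapply. This is a swap I am suggesting rather than part of the answer above: vapply checks that every element yields exactly one number, which can be reassuring with 2000+ elements. The names `x_means` and `d` are illustrative only.

```r
# Example data from the question
new_list <- list(
  data.frame(x = c(1, 2, 3), y = c(3, 4, 5)),
  data.frame(x = c(3, 2, 2, 2, 3, 8), y = c(5, 2, 3, 5, 6, 7)),
  data.frame(x = c(3, 2, 2, 1, 1), y = c(5, 2, 3, 3, 2))
)

# vapply() errors if any element does not return a single numeric value,
# whereas sapply() would silently change the shape of its result
x_means <- vapply(new_list, function(d) mean(d$x), numeric(1))

# Wrap the numeric vector in a one-column data frame
df_mean <- data.frame(x = x_means)
df_mean
```

If some data frames might contain missing values in x, `mean(d$x, na.rm = TRUE)` would be the usual adjustment.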

Answer 2

Score: 2

Using purrr's map_dfr together with dplyr's summarise:

library(purrr)
library(dplyr)
map_dfr(new_list, ~ .x %>% 
    summarise(x = mean(x)))
#>          x
#> 1 2.000000
#> 2 3.333333
#> 3 1.800000
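
When only one number per list element is needed, a possible variation is purrr's map_dbl, which returns a plain numeric vector and does not need dplyr. This is a sketch under the assumption that every element has a numeric column named x. The `\(d)` lambda needs R 4.1 or later, and `x_means` and `d` are illustrative names.

```r
library(purrr)

# Example data from the question
new_list <- list(
  data.frame(x = c(1, 2, 3), y = c(3, 4, 5)),
  data.frame(x = c(3, 2, 2, 2, 3, 8), y = c(5, 2, 3, 5, 6, 7)),
  data.frame(x = c(3, 2, 2, 1, 1), y = c(5, 2, 3, 3, 2))
)

# map_dbl() requires each call to return a single double and fails otherwise
x_means <- map_dbl(new_list, \(d) mean(d$x))

# Convert the vector to the one-column data frame asked for in the question
df_mean <- data.frame(x = x_means)
df_mean
```

As far as I know, map_dfr has been marked as superseded since purrr 1.0.0, so a vector-returning map like this may also age better.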

Answer 3

Score: 2

Good answer by Quinten. I usually prefer to follow the KISS principle. Here is a format that I find syntactically simpler:

len <- length(new_list)
# [[1]] selects the first column by position, which is x in this example
sapply(1:len, function(z) mean(new_list[[z]][[1]]))
#> [1] 2.000000 3.333333 1.800000
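
One caveat with the index-based form: `1:len` evaluates to `c(1, 0)` when the list is empty, and `[[1]]` depends on x being the first column. The sketch below is a possible hardening, not the answerer's code. It uses seq_along and selects the column by name, and the helper name `mean_x` is hypothetical.

```r
# Example data from the question
new_list <- list(
  data.frame(x = c(1, 2, 3), y = c(3, 4, 5)),
  data.frame(x = c(3, 2, 2, 2, 3, 8), y = c(5, 2, 3, 5, 6, 7)),
  data.frame(x = c(3, 2, 2, 1, 1), y = c(5, 2, 3, 3, 2))
)

# Hypothetical helper: mean of column "x" for each data frame in lst
mean_x <- function(lst) {
  # seq_along() yields integer(0) for an empty list, so nothing is indexed
  vapply(seq_along(lst), function(i) mean(lst[[i]][["x"]]), numeric(1))
}

mean_x(new_list)   # one mean per data frame
mean_x(list())     # empty input gives an empty numeric vector
```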

Answer 4

Score: 2

In this relatively simple case, I think I would prefer the more concise solution suggested by @quinten. However, if you need to calculate more statistics on the nested data frames, you could consider something like this:

library(tidyverse)

tibble(data = new_list) |> 
  rowwise() |> 
  summarise(
    x = mean(data$x)
  )
#> # A tibble: 3 × 1
#>       x
#>   <dbl>
#> 1  2   
#> 2  3.33
#> 3  1.8

Alternatively:

tibble(data = new_list) |> 
  rowwise() |> 
  summarise(
    data |> 
      summarise(x = mean(x))
  )
#> # A tibble: 3 × 1
#>       x
#>   <dbl>
#> 1  2   
#> 2  3.33
#> 3  1.8
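
To illustrate the "more statistics" case mentioned above, here is a small sketch that extends the first rowwise variant to several summaries at once. The output column names `x_mean`, `y_mean` and `n` and the result name `stats` are my own choices for illustration.

```r
library(dplyr)
library(tibble)

# Example data from the question
new_list <- list(
  data.frame(x = c(1, 2, 3), y = c(3, 4, 5)),
  data.frame(x = c(3, 2, 2, 2, 3, 8), y = c(5, 2, 3, 5, 6, 7)),
  data.frame(x = c(3, 2, 2, 1, 1), y = c(5, 2, 3, 3, 2))
)

# Under rowwise(), `data` refers to the single data frame in the current row,
# so ordinary functions can be applied to its columns
stats <- tibble(data = new_list) |>
  rowwise() |>
  summarise(
    x_mean = mean(data$x),
    y_mean = mean(data$y),
    n      = nrow(data)
  )
stats
```

Each row of the result corresponds to one element of the list, in the original order.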

Answer 5

Score: 1

You can also enframe the list and do a mean by group:

library(dplyr)  # 1.1.0 or higher, needed for the .by argument
library(tibble) # enframe()
library(tidyr)  # unnest()
enframe(new_list) %>% 
  unnest(value) %>% 
  summarise(x = mean(x), .by = name)

#   name     x
#1     1  2   
#2     2  3.33
#3     3  1.8 

Answer 6

Score: 0

Using data.table's rbindlist:

data.table::rbindlist(new_list, idcol = TRUE)[, .(x = mean(x)), .id][, 2]
#>           x
#> 1: 2.000000
#> 2: 3.333333
#> 3: 1.800000
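
The one-liner packs three steps together, so here is a sketch that spells them out: stack the data frames with an id column, take a grouped mean, then keep only the x column. The id column name `"id"` and the variable names `stacked`, `by_id` and `df_mean` are assumptions for illustration.

```r
library(data.table)

# Example data from the question
new_list <- list(
  data.frame(x = c(1, 2, 3), y = c(3, 4, 5)),
  data.frame(x = c(3, 2, 2, 2, 3, 8), y = c(5, 2, 3, 5, 6, 7)),
  data.frame(x = c(3, 2, 2, 1, 1), y = c(5, 2, 3, 3, 2))
)

# Step 1: stack all data frames; "id" records which list element each row came from
stacked <- rbindlist(new_list, idcol = "id")

# Step 2: mean of x within each original list element
by_id <- stacked[, .(x = mean(x)), by = id]

# Step 3: keep only the x column and convert to a plain data.frame
df_mean <- as.data.frame(by_id[, .(x)])
df_mean
```

Stacking copies all rows into one table, so for a very large list the per-element approaches in the earlier answers may use less memory.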

Posted by huangapple on 2023-02-07 00:51:07
Source: https://go.coder-hub.com/75364264.html