How can I calculate multiple weighted means in each column?

Question

I am trying to find a way to calculate weighted means for multiple years for each country. My data is currently formatted like:

df <- data.frame(
  YEAR_CALENDAR=c(2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021, 2021),
  age_group=c("15-29", "30-44", "45-59", "60-74", "Over 75", "15-29", "30-44", "45-59", "60-74", "Over 75"),
  a=c(3.85, 3.66, 3.76, 2.70, 3.10, 4.32, 4.64, 3.67, 3.45, 4.56),
  b=c(3.56, 3.67, 3.72, 3.89, 4.23, 4.28, 4.27, 3.12, 3.46, 3.97),
  weights=rep(c(0.3333784699, 0.2890995261, 0.2161137441, 0.1203791469, 0.04150304671), 2))

I would like to have a data frame with the weighted average of Country a and Country b in each year, using the coefficients in the weights column. The resulting data frame would have columns for Year, Country a, Country b, with the values being the weighted average of that country in that year.

My approach was to multiply the a and b columns by the weights column, and then add up the five rows for each year to get the weighted average. The first step, the multiplication, fails with a "non-numeric argument to binary operator" error. The code is below:

new_df <- df %>%
  mutate(across(3:4, .*df$weights))

I need a fix for my approach, or help getting the result through a different method.
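
A possible fix for the multiplication step, as a minimal sketch: across() expects a function or a purrr-style lambda as its second argument, whereas `.*df$weights` is an expression that gets evaluated right away, which is probably where the error comes from. The sketch assumes dplyr 1.0 or later is installed and reuses the data frame from the question.

```r
library(dplyr)

# Data from the question
df <- data.frame(
  YEAR_CALENDAR = c(2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021, 2021),
  age_group = c("15-29", "30-44", "45-59", "60-74", "Over 75",
                "15-29", "30-44", "45-59", "60-74", "Over 75"),
  a = c(3.85, 3.66, 3.76, 2.70, 3.10, 4.32, 4.64, 3.67, 3.45, 4.56),
  b = c(3.56, 3.67, 3.72, 3.89, 4.23, 4.28, 4.27, 3.12, 3.46, 3.97),
  weights = rep(c(0.3333784699, 0.2890995261, 0.2161137441,
                  0.1203791469, 0.04150304671), 2))

# ~ .x * weights is a lambda: .x is the current column (a, then b),
# and weights is looked up as a column of df
new_df <- df %>%
  mutate(across(a:b, ~ .x * weights))
```

Selecting the columns by name (a:b) rather than by position (3:4) keeps the call working if the column order changes.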

EDIT:
Okay, the multiplication part works! Thank you for the help. Is there a way to add up the values across the age groups for each country and year, so that I end up with one value per country per year?
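
The multiply-then-add idea can be sketched in one step by grouping on the year and summing inside summarise(). This is a sketch under the same assumptions as the question (dplyr attached, data as posted). Note that the posted weights add up to slightly more than 1 (about 1.0005), so the sum of products is divided by sum(weights) here to get a proper weighted mean.

```r
library(dplyr)

# Data from the question
df <- data.frame(
  YEAR_CALENDAR = c(2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021, 2021),
  age_group = c("15-29", "30-44", "45-59", "60-74", "Over 75",
                "15-29", "30-44", "45-59", "60-74", "Over 75"),
  a = c(3.85, 3.66, 3.76, 2.70, 3.10, 4.32, 4.64, 3.67, 3.45, 4.56),
  b = c(3.56, 3.67, 3.72, 3.89, 4.23, 4.28, 4.27, 3.12, 3.46, 3.97),
  weights = rep(c(0.3333784699, 0.2890995261, 0.2161137441,
                  0.1203791469, 0.04150304671), 2))

# One row per year: sum of value * weight, normalised by the total weight
yearly <- df %>%
  group_by(YEAR_CALENDAR) %>%
  summarise(across(a:b, ~ sum(.x * weights) / sum(weights)))
```

Dropping the division gives the plain sum of the weighted values, which would only equal the weighted mean if the weights summed to exactly 1.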

Many thanks!

Answer 1

Score: 0

You can simplify things by using the base R weighted.mean function inside summarize:

library(dplyr)

# One row per year; weighted.mean() divides by the sum of the weights in each group
df %>% 
  group_by(YEAR_CALENDAR) %>% 
  summarise(across(a:b, ~ weighted.mean(.x, weights)))
#> # A tibble: 2 x 3
#>   YEAR_CALENDAR     a     b
#>           <dbl> <dbl> <dbl>
#> 1          2020  3.61  3.69
#> 2          2021  4.18  3.92
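
For comparison, the same numbers can be obtained without dplyr. The following is a base R sketch, not part of the original answer: it splits the data by year and applies weighted.mean() to each piece. The names by_year and result are illustrative only.

```r
# Data from the question
df <- data.frame(
  YEAR_CALENDAR = c(2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021, 2021),
  age_group = c("15-29", "30-44", "45-59", "60-74", "Over 75",
                "15-29", "30-44", "45-59", "60-74", "Over 75"),
  a = c(3.85, 3.66, 3.76, 2.70, 3.10, 4.32, 4.64, 3.67, 3.45, 4.56),
  b = c(3.56, 3.67, 3.72, 3.89, 4.23, 4.28, 4.27, 3.12, 3.46, 3.97),
  weights = rep(c(0.3333784699, 0.2890995261, 0.2161137441,
                  0.1203791469, 0.04150304671), 2))

# Split into one data frame per year
by_year <- split(df, df$YEAR_CALENDAR)

# Compute the weighted mean of each country column within each year,
# then stack the per-year rows back into one data frame
result <- do.call(rbind, lapply(by_year, function(d) {
  data.frame(
    YEAR_CALENDAR = d$YEAR_CALENDAR[1],
    a = weighted.mean(d$a, d$weights),
    b = weighted.mean(d$b, d$weights)
  )
}))
```

This version names the columns a and b explicitly, so it would need another line per country if more columns were added; the across() form in the answer scales more easily.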

huangapple
  • Posted on March 7, 2023, 03:21:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75654979.html