Calculate the row-wise weighted sum for a set of columns
Question
I have, say, the following data frame:
> library(tidyverse)
> dd <- tibble(a = rep(1,10), b = rep(1,10), c = rep(1,10))
> dd
# A tibble: 10 × 3
a b c
<dbl> <dbl> <dbl>
1 1 1 1
2 1 1 1
3 1 1 1
4 1 1 1
5 1 1 1
6 1 1 1
7 1 1 1
8 1 1 1
9 1 1 1
10 1 1 1
and a vector of weights:
> weight <- c(1, 5, 10)
> weight
[1] 1 5 10
When I want to calculate the row-wise weighted sum over all columns of the data frame, I do this:
> dd %>% mutate(m = rowSums(map2_dfc(dd, weight,`*`)))
# A tibble: 10 × 4
a b c m
<dbl> <dbl> <dbl> <dbl>
1 1 1 1 16
2 1 1 1 16
3 1 1 1 16
4 1 1 1 16
5 1 1 1 16
6 1 1 1 16
7 1 1 1 16
8 1 1 1 16
9 1 1 1 16
10 1 1 1 16
But I don't know how to calculate the row-wise weighted sum for a subset of the columns. I tried the code below, but it gives messy results:
> dd %>% rowwise() %>% mutate(m = rowwise(map2_dfc(c_across(b:c), weight[2:3],`*`)))
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
# A tibble: 10 × 4
# Rowwise:
a b c m$...1 $...2
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 5 10
2 1 1 1 5 10
3 1 1 1 5 10
4 1 1 1 5 10
5 1 1 1 5 10
6 1 1 1 5 10
7 1 1 1 5 10
8 1 1 1 5 10
9 1 1 1 5 10
10 1 1 1 5 10
Can someone please give me a hint as to how to approach this problem?
Answer 1
Score: 4
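This is matrix multiplication. Your original is equivalent to as.matrix(dd) %*% weight. For a subset inside mutate, select the weights that match the chosen columns (weight[2:3] for columns b:c) and take the first column of the product so that m is a plain vector rather than a one-column matrix:
dd %>% mutate(m = ((across(b:c) %>% as.matrix()) %*% weight[2:3])[, 1])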
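To make the matrix-multiplication view concrete, here is a minimal base R sketch of the same calculation outside of mutate. It uses a plain data.frame instead of a tibble so that it runs without any packages; the names cols, w and m are illustrative and do not come from the answer.

```r
# Same data as in the question, as a base R data frame
dd <- data.frame(a = rep(1, 10), b = rep(1, 10), c = rep(1, 10))
weight <- c(1, 5, 10)

cols <- c("b", "c")                    # columns to include in the weighted sum
w <- weight[match(cols, names(dd))]    # weights at the same positions: 5 and 10

# (10 x 2 matrix) %*% (length-2 vector) yields a 10 x 1 matrix;
# drop() removes the redundant dimension so m is an ordinary vector
m <- drop(as.matrix(dd[cols]) %*% w)
```

Each element of m is 1 * 5 + 1 * 10 = 15, which matches the output shown in the second answer.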
Answer 2
Score: 2
Using tidyverse methods, we can create a named vector for 'weight', loop across the columns 'b' to 'c', subset the 'weight' value based on the column name (cur_column()), multiply, and get the rowSums:
library(dplyr)
names(weight) <- names(dd)
dd %>%
  mutate(m = rowSums(across(b:c, ~ .x * weight[cur_column()])))
Output:
# A tibble: 10 × 4
a b c m
<dbl> <dbl> <dbl> <dbl>
1 1 1 1 15
2 1 1 1 15
3 1 1 1 15
4 1 1 1 15
5 1 1 1 15
6 1 1 1 15
7 1 1 1 15
8 1 1 1 15
9 1 1 1 15
10 1 1 1 15
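The idea of keying the weights by column name can also be sketched in base R. The helper weighted_row_sum below is hypothetical (it is not part of dplyr or of the answer), and dd2 is made-up data with distinct values so that a mismatch between columns and weights would be visible.

```r
# Hypothetical helper: weighted row sum using a *named* weight vector.
# The names of w decide which columns are used, so order cannot get mixed up.
weighted_row_sum <- function(df, w) {
  stopifnot(!is.null(names(w)), all(names(w) %in% names(df)))
  drop(as.matrix(df[names(w)]) %*% w)
}

# Made-up illustration data (not from the question)
dd2 <- data.frame(a = 1:3, b = c(2, 0, 1), c = c(1, 1, 3))

res <- weighted_row_sum(dd2, c(b = 5, c = 10))
# Row by row: 2*5 + 1*10, 0*5 + 1*10, 1*5 + 3*10, i.e. the values 20, 10, 35
```

Because the columns are looked up by name, passing c(c = 10, b = 5) gives the same result.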
Or, if we want to use rowwise (not recommended, as it is slower):
dd %>%
  rowwise() %>%
  mutate(m = sum(c_across(b:c) * weight[2:3])) %>%
  ungroup()
Or use crossprod:
dd %>%
  mutate(m = crossprod(t(pick(b:c)), weight[2:3])[, 1])
Or with base R:
dd$m <- rowSums(dd[2:3] * weight[2:3][col(dd[2:3])])
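The base R one-liner relies on col() to repeat each weight down its column. The sketch below unpacks that step and compares it with sweep(), which is a swapped-in alternative and not part of the answer; the intermediate names x, w, idx, expanded, m_col and m_sweep are illustrative.

```r
dd <- data.frame(a = rep(1, 10), b = rep(1, 10), c = rep(1, 10))
weight <- c(1, 5, 10)

x <- dd[2:3]          # the columns b and c
w <- weight[2:3]      # their weights, 5 and 10

idx <- col(x)         # 10 x 2 integer matrix: column 1 is all 1s, column 2 all 2s
expanded <- w[idx]    # length-20 vector in column order: ten 5s, then ten 10s

# Multiplying the data frame by this vector scales each column by its weight
m_col <- rowSums(x * expanded)

# Alternative: sweep() applies w across the columns (MARGIN = 2) of a matrix
m_sweep <- rowSums(sweep(as.matrix(x), 2, w, `*`))
```

Both m_col and m_sweep hold the value 15 for every row here.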