How to apply the same formula to groups
Question
My dataset, Unitatsconsum_2021, looks like this:
structure(list(NUMERO = structure(c(21, 22, 22, 22, 23, 23, 23,
24, 24, 25, 25, 25, 25, 26, 27, 28), format.stata = "%12.0g"),
unitats_consum = c(2, 2, 2, 2, 2, 2, 1.9, 1.5, 1.5, 2.5,
2.5, 2.5, 2.2, 1, 1, 2), edat = c(17, 51, 17, 14, 44, 36,
3, 67, 63, 35, 48, 17, 13, 73, 67, 73), membresllar = c(3L,
3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 4L, 4L, 4L, 4L, 1L, 1L, 3L
)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -16L), groups = structure(list(NUMERO = structure(c(21,
22, 23, 24, 25, 26, 27, 28), format.stata = "%12.0g"), .rows = structure(list(
1L, 2:4, 5:7, 8:9, 10:13, 14L, 15L, 16L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -8L), .drop = TRUE))
I want to calculate a new variable, unitats_consum, which should equal 1 + 0.5 * ((number of observations with edat > 13) - 1) + 0.3 * (number of observations with edat <= 13).
The result should be the same for every row that shares the same NUMERO, which is the identifier. So far I have tried the following:
Unitatsconsum_2021 <- Unitatsconsum_2021 %>%
  group_by(NUMERO) %>%
  mutate(unitats_consum = (1 +
    0.5 * (ifelse(edat > 13, membresllar - 1, 0)) +
    0.3 * (ifelse(edat <= 13, membresllar, 0))))
The desired output is:
So, in the code, membresllar should instead be the number of observations where edat > 13 and where edat <= 13, respectively, within each NUMERO.
Answer 1
Score: 1
This does not match your output for two rows, but I believe it is what you are looking for:
library(dplyr)

# sum() over a logical vector counts the TRUE values within each NUMERO group
Unitatsconsum_2021 <- Unitatsconsum_2021 %>%
  group_by(NUMERO) %>%
  mutate(
    unitats_consum = 1 + 0.5 * (sum(edat > 13) - 1) + 0.3 * sum(edat <= 13)
  )
Unitatsconsum_2021
# # A tibble: 16 × 4
# # Groups:   NUMERO [8]
#    NUMERO unitats_consum  edat membresllar
#     <dbl>          <dbl> <dbl>       <int>
#  1     21            1      17           3
#  2     22            2      51           3
#  3     22            2      17           3
#  4     22            2      14           3
#  5     23            1.8    44           3
#  6     23            1.8    36           3
#  7     23            1.8     3           3
#  8     24            1.5    67           2
#  9     24            1.5    63           2
# 10     25            2.3    35           4
# 11     25            2.3    48           4
# 12     25            2.3    17           4
# 13     25            2.3    13           4
# 14     26            1      73           1
# 15     27            1      67           1
# 16     28            1      73           3
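For NUMERO 21 the result should be 1: although membresllar is 3, the data contain only one row for that household, with edat > 13, so 1 + 0.5 * (1 - 1) + 0.3 * 0 = 1. The same applies to NUMERO 28.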
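To make the per-group counting explicit, here is a minimal base R sketch of the same calculation, offered as an illustration rather than a replacement for the dplyr version. It rebuilds only the NUMERO and edat columns from the question; the names households, over_13 and up_to_13 are hypothetical and chosen for this example.

```r
# Illustrative data: the NUMERO and edat columns from the question.
# The data frame name `households` is an assumption made for this sketch.
households <- data.frame(
  NUMERO = c(21, 22, 22, 22, 23, 23, 23, 24, 24, 25, 25, 25, 25, 26, 27, 28),
  edat   = c(17, 51, 17, 14, 44, 36, 3, 67, 63, 35, 48, 17, 13, 73, 67, 73)
)

# ave() applies FUN within each NUMERO group and returns a vector as long as
# the input, so every row of a household receives the same group-level count.
over_13  <- ave(as.numeric(households$edat > 13),  households$NUMERO, FUN = sum)
up_to_13 <- ave(as.numeric(households$edat <= 13), households$NUMERO, FUN = sum)

# Same formula as in the answer, now built from the two count vectors.
households$unitats_consum <- 1 + 0.5 * (over_13 - 1) + 0.3 * up_to_13

# Inspect one household: NUMERO 25 has three rows with edat > 13 and one
# row with edat <= 13.
households[households$NUMERO == 25, ]
```

The attempt in the question differs because ifelse() is evaluated row by row, so each row gets a value based on its own edat. Summing a logical condition within the group gives one count that is shared by all rows of that NUMERO.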