英文:
Using accumulate and/or cumsum properly for projections
问题
我需要投影df1中每行的Z、W、Y值。为此,我将使用df2中的B的增长率。
df1:
X Y year Z W V
abc ab 2020 0.1 0.7 1.3
abc cd 2020 0.2 0.8 1.4
efg ef 2020 0.3 0.9 1.5
efg gh 2020 0.4 1 1.6
df2:
year B
2021 0.05
2022 0.063
2023 0.049
2024 0.061
2025 0.057
我理解可以通过交叉连接来解决这个问题。然而,在此之前,我需要确保前一个值用于下一次计算。
我期望得到类似这样的结果:
从df1中提取:
X Y year Z W V
abc ab 2020 0.1 0.7 1.3
df3: (已编辑)
year B C D E X Y
2020 0 0.1000 0.7000 1.3000 abc ab
2021 0.05 0.1050 0.7350 1.3650 abc ab
2022 0.063 0.1116 0.7813 1.4510 abc ab
2023 0.049 0.1171 0.8196 1.5221 abc ab
2024 0.061 0.1242 0.8696 1.6149 abc ab
2025 0.057 0.1313 0.9192 1.7070 abc ab
df3的公式 (已编辑)
1 year B C D E
2 2020 0 0.1 0.7 1.3
3 2021 0.05 =+C2*($B$3+1) =+D2*($B$3+1) =+E2*($B$3+1)
4 2022 0.063 =+C3*($B$4+1) =+D3*($B$4+1) =+E3*($B$4+1)
5 2023 0.049 =+C4*($B$5+1) =+D4*($B$5+1) =+E4*($B$5+1)
6 2024 0.061 =+C5*($B$6+1) =+D5*($B$6+1) =+E5*($B$6+1)
7 2025 0.057 =+C6*($B$7+1) =+D6*($B$7+1) =+E6*($B$7+1)
要获得df3,我尝试过以下方法:
df3 <- df2 %>%
mutate(example2 = accumulate(B, ~ 0.1 * ( .x +1)))
df3 <- df2 %>%
mutate(example2 = cumsum(accumulate(B, ~ 0.1 * ( .x +1))))
结果仍然不符合预期。
最后,我需要将其传递给交叉连接和跨函数,类似于以下方式:
df3 <- mutate(cross_join(df1[-3], df2), across(Z:V)*(H), H = NULL)
英文:
I need to project each Z, W, Y value of each row in df1. To do this, I will use the growth rates in B from df2.
df1:
X Y year Z W V
abc ab 2020 0.1 0.7 1.3
abc cd 2020 0.2 0.8 1.4
efg ef 2020 0.3 0.9 1.5
efg gh 2020 0.4 1 1.6
df2:
year B
2021 0.05
2022 0.063
2023 0.049
2024 0.061
2025 0.057
I understand that a cross join might solve the problem. However, before that I need to be sure that the precious value is used for the next calculation.
I expect something like this:
from df1:
X Y year Z W V
abc ab 2020 0.1 0.7 1.3
df3: (edited)
year B C D E X Y
2020 0 0.1000 0.7000 1.3000 abc ab
2021 0.05 0.1050 0.7350 1.3650 abc ab
2022 0.063 0.1116 0.7813 1.4510 abc ab
2023 0.049 0.1171 0.8196 1.5221 abc ab
2024 0.061 0.1242 0.8696 1.6149 abc ab
2025 0.057 0.1313 0.9192 1.7070 abc ab
Formulas for df3 (edited)
1 year B C D E
2 2020 0 0.1 0.7 1.3
3 2021 0.05 =+C2*($B$3+1) =+D2*($B$3+1) =+E2*($B$3+1)
4 2022 0.063 =+C3*($B$4+1) =+D3*($B$4+1) =+E3*($B$4+1)
5 2023 0.049 =+C4*($B$5+1) =+D4*($B$5+1) =+E4*($B$5+1)
6 2024 0.061 =+C5*($B$6+1) =+D5*($B$6+1) =+E5*($B$6+1)
7 2025 0.057 =+C6*($B$7+1) =+D6*($B$7+1) =+E6*($B$7+1)
To get df3, I have tried:
df3 <- df2 %>% mutate(example2 = accumulate(B, ~ 0.1 * ( .x +1)))
df3 <- df2 %>% mutate(example2 = cumsum(accumulate(B, ~ 0.1 * ( .x +1))))
The outcome is still not the expected.
At the end, I would need to pass it into the cross join and across functions, something like this:
df3 <- mutate(cross_join(df1[-3], df2), across(Z:V)*(H), H = NULL)
答案1
得分: 2
以下是翻译好的内容:
首先,我们对df2进行了一些操作。可能有更简单的方法(但不幸的是,总是有):
# 首先,我们对df2进行一些操作
# 1. 我们添加了一个名为2020的列
# 2. 我们将利率转换为1 + 利率(以使累积乘积函数起作用)
# 3. 我们使用cumprod函数将B列转换为B + 1的累积乘积
df2 <- df2 %>%
add_row(year = 2020, B = 0, .before = 1) %>%
mutate(B = cumprod(B + 1))
# 从df1中删除年份,因为它是不必要的,而且我们以后肯定还得删除它
df1 <- df1 %>%
select(-year)
# df3
cross_join(df1, df2) %>%
mutate(C = Z * B,
D = W * B,
E = V * B) %>%
select(year, B, C, D, E, X, Y)
# 一个tibble:24 × 7
year B C D E X Y
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 2020 1 0.1 0.7 1.3 abc ab
2 2021 1.05 0.105 0.735 1.37 abc ab
3 2022 1.12 0.112 0.781 1.45 abc ab
4 2023 1.17 0.117 0.820 1.52 abc ab
5 2024 1.24 0.124 0.870 1.61 abc ab
6 2025 1.31 0.131 0.919 1.71 abc ab
7 2020 1 0.2 0.8 1.4 abc cd
8 2021 1.05 0.21 0.84 1.47 abc cd
9 2022 1.12 0.223 0.893 1.56 abc cd
10 2023 1.17 0.234 0.937 1.64 abc cd
# ℹ 还有14行
更新:更简单的方法:
df2 %>%
add_row(year = 2020, B = 0, .before = 1) %>%
cross_join(df1) %>%
group_by(X, Y) %>%
mutate(across(c(Z, W, V), ~ . * cumprod(B + 1))) %>%
rename("C" = Z, "D" = W, "E" = V) %>%
ungroup()
希望这可以帮助您理解代码的内容。
英文:
Here is a first take on doing it. There's likely a simpler way (but alas, there always is):
# first, we do a few things to df2
# 1. we add a 2020 column
# 2. we transform the interest rate to 1 + the interest rate (to enable the cumulative product function to work
# 3. we use the cumprod function to turn the B column into the cumulative products of B + 1
df2 <- df2 %>%
add_row(year = 2020, B = 0, .before = 1) %>%
mutate(B = cumprod(B + 1))
# remove the year from df1, because it's unnecessary, and we'd have to remove it later anyway
df1 <- df1 %>%
select(-year)
# df3
cross_join(df1, df2) %>%
mutate(C = Z * B,
D = W * B,
E = V * B) %>%
select(year, B, C, D, E, X, Y)
# A tibble: 24 × 7
year B C D E X Y
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 2020 1 0.1 0.7 1.3 abc ab
2 2021 1.05 0.105 0.735 1.37 abc ab
3 2022 1.12 0.112 0.781 1.45 abc ab
4 2023 1.17 0.117 0.820 1.52 abc ab
5 2024 1.24 0.124 0.870 1.61 abc ab
6 2025 1.31 0.131 0.919 1.71 abc ab
7 2020 1 0.2 0.8 1.4 abc cd
8 2021 1.05 0.21 0.84 1.47 abc cd
9 2022 1.12 0.223 0.893 1.56 abc cd
10 2023 1.17 0.234 0.937 1.64 abc cd
# ℹ 14 more rows
Update: simpler way:
df2 %>%
add_row(year = 2020, B = 0, .before = 1) %>%
cross_join(df1) %>%
group_by(X, Y) %>%
mutate(across(c(Z, W, V), ~ . * cumprod(B + 1))) %>%
rename("C" = Z, "D" = W, "E" = V) %>%
ungroup()
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论