使用累加函数accumulate和/或cumsum来正确进行投影。

huangapple go评论86阅读模式
英文:

Using accumulate and/or cumsum properly for projections

问题

我需要投影df1中每行的Z、W、Y值。为此,我将使用df2中的B的增长率。

  1. df1:
  2. X Y year Z W V
  3. abc ab 2020 0.1 0.7 1.3
  4. abc cd 2020 0.2 0.8 1.4
  5. efg ef 2020 0.3 0.9 1.5
  6. efg gh 2020 0.4 1 1.6
  7. df2:
  8. year B
  9. 2021 0.05
  10. 2022 0.063
  11. 2023 0.049
  12. 2024 0.061
  13. 2025 0.057

我理解可以通过交叉连接来解决这个问题。然而,在此之前,我需要确保前一个值用于下一次计算。

我期望得到类似这样的结果:

  1. df1中提取
  2. X Y year Z W V
  3. abc ab 2020 0.1 0.7 1.3
  4. df3: (已编辑)
  5. year B C D E X Y
  6. 2020 0 0.1000 0.7000 1.3000 abc ab
  7. 2021 0.05 0.1050 0.7350 1.3650 abc ab
  8. 2022 0.063 0.1116 0.7813 1.4510 abc ab
  9. 2023 0.049 0.1171 0.8196 1.5221 abc ab
  10. 2024 0.061 0.1242 0.8696 1.6149 abc ab
  11. 2025 0.057 0.1313 0.9192 1.7070 abc ab
  12. df3的公式 (已编辑)
  13. 1 year B C D E
  14. 2 2020 0 0.1 0.7 1.3
  15. 3 2021 0.05 =+C2*($B$3+1) =+D2*($B$3+1) =+E2*($B$3+1)
  16. 4 2022 0.063 =+C3*($B$4+1) =+D3*($B$4+1) =+E3*($B$4+1)
  17. 5 2023 0.049 =+C4*($B$5+1) =+D4*($B$5+1) =+E4*($B$5+1)
  18. 6 2024 0.061 =+C5*($B$6+1) =+D5*($B$6+1) =+E5*($B$6+1)
  19. 7 2025 0.057 =+C6*($B$7+1) =+D6*($B$7+1) =+E6*($B$7+1)

要获得df3,我尝试过以下方法:

  1. df3 <- df2 %>%
  2. mutate(example2 = accumulate(B, ~ 0.1 * ( .x +1)))
  3. df3 <- df2 %>%
  4. mutate(example2 = cumsum(accumulate(B, ~ 0.1 * ( .x +1))))

结果仍然不符合预期。

最后,我需要将其传递给交叉连接和跨函数,类似于以下方式:

  1. df3 <- mutate(cross_join(df1[-3], df2), across(Z:V)*(H), H = NULL)
英文:

I need to project each Z, W, Y value of each row in df1. To do this, I will use the growth rates in B from df2.

  1. df1:
  2. X Y year Z W V
  3. abc ab 2020 0.1 0.7 1.3
  4. abc cd 2020 0.2 0.8 1.4
  5. efg ef 2020 0.3 0.9 1.5
  6. efg gh 2020 0.4 1 1.6
  7. df2:
  8. year B
  9. 2021 0.05
  10. 2022 0.063
  11. 2023 0.049
  12. 2024 0.061
  13. 2025 0.057

I understand that a cross join might solve the problem. However, before that I need to be sure that the precious value is used for the next calculation.

I expect something like this:

  1. from df1:
  2. X Y year Z W V
  3. abc ab 2020 0.1 0.7 1.3
  4. df3: (edited)
  5. year B C D E X Y
  6. 2020 0 0.1000 0.7000 1.3000 abc ab
  7. 2021 0.05 0.1050 0.7350 1.3650 abc ab
  8. 2022 0.063 0.1116 0.7813 1.4510 abc ab
  9. 2023 0.049 0.1171 0.8196 1.5221 abc ab
  10. 2024 0.061 0.1242 0.8696 1.6149 abc ab
  11. 2025 0.057 0.1313 0.9192 1.7070 abc ab
  12. Formulas for df3 (edited)
  13. 1 year B C D E
  14. 2 2020 0 0.1 0.7 1.3
  15. 3 2021 0.05 =+C2*($B$3+1) =+D2*($B$3+1) =+E2*($B$3+1)
  16. 4 2022 0.063 =+C3*($B$4+1) =+D3*($B$4+1) =+E3*($B$4+1)
  17. 5 2023 0.049 =+C4*($B$5+1) =+D4*($B$5+1) =+E4*($B$5+1)
  18. 6 2024 0.061 =+C5*($B$6+1) =+D5*($B$6+1) =+E5*($B$6+1)
  19. 7 2025 0.057 =+C6*($B$7+1) =+D6*($B$7+1) =+E6*($B$7+1)

To get df3, I have tried:

  1. df3 &lt;- df2 %&gt;% mutate(example2 = accumulate(B, ~ 0.1 * ( .x +1)))
  2. df3 &lt;- df2 %&gt;% mutate(example2 = cumsum(accumulate(B, ~ 0.1 * ( .x +1))))

The outcome is still not the expected.

At the end, I would need to pass it into the cross join and across functions, something like this:

  1. df3 &lt;- mutate(cross_join(df1[-3], df2), across(Z:V)*(H), H = NULL)

答案1

得分: 2

以下是翻译好的内容:

首先,我们对df2进行了一些操作。可能有更简单的方法(但不幸的是,总是有):

  1. # 首先,我们对df2进行一些操作
  2. # 1. 我们添加了一个名为2020的列
  3. # 2. 我们将利率转换为1 + 利率(以使累积乘积函数起作用)
  4. # 3. 我们使用cumprod函数将B列转换为B + 1的累积乘积
  5. df2 <- df2 %>%
  6. add_row(year = 2020, B = 0, .before = 1) %>%
  7. mutate(B = cumprod(B + 1))
  8. # 从df1中删除年份,因为它是不必要的,而且我们以后肯定还得删除它
  9. df1 <- df1 %>%
  10. select(-year)
  11. # df3
  12. cross_join(df1, df2) %>%
  13. mutate(C = Z * B,
  14. D = W * B,
  15. E = V * B) %>%
  16. select(year, B, C, D, E, X, Y)
  17. # 一个tibble:24 × 7
  18. year B C D E X Y
  19. <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
  20. 1 2020 1 0.1 0.7 1.3 abc ab
  21. 2 2021 1.05 0.105 0.735 1.37 abc ab
  22. 3 2022 1.12 0.112 0.781 1.45 abc ab
  23. 4 2023 1.17 0.117 0.820 1.52 abc ab
  24. 5 2024 1.24 0.124 0.870 1.61 abc ab
  25. 6 2025 1.31 0.131 0.919 1.71 abc ab
  26. 7 2020 1 0.2 0.8 1.4 abc cd
  27. 8 2021 1.05 0.21 0.84 1.47 abc cd
  28. 9 2022 1.12 0.223 0.893 1.56 abc cd
  29. 10 2023 1.17 0.234 0.937 1.64 abc cd
  30. # ℹ 还有14行

更新:更简单的方法:

  1. df2 %>%
  2. add_row(year = 2020, B = 0, .before = 1) %>%
  3. cross_join(df1) %>%
  4. group_by(X, Y) %>%
  5. mutate(across(c(Z, W, V), ~ . * cumprod(B + 1))) %>%
  6. rename("C" = Z, "D" = W, "E" = V) %>%
  7. ungroup()

希望这可以帮助您理解代码的内容。

英文:

Here is a first take on doing it. There's likely a simpler way (but alas, there always is):

  1. # first, we do a few things to df2
  2. # 1. we add a 2020 column
  3. # 2. we transform the interest rate to 1 + the interest rate (to enable the cumulative product function to work
  4. # 3. we use the cumprod function to turn the B column into the cumulative products of B + 1
  5. df2 &lt;- df2 %&gt;%
  6. add_row(year = 2020, B = 0, .before = 1) %&gt;%
  7. mutate(B = cumprod(B + 1))
  8. # remove the year from df1, because it&#39;s unnecessary, and we&#39;d have to remove it later anyway
  9. df1 &lt;- df1 %&gt;%
  10. select(-year)
  11. # df3
  12. cross_join(df1, df2) %&gt;%
  13. mutate(C = Z * B,
  14. D = W * B,
  15. E = V * B) %&gt;%
  16. select(year, B, C, D, E, X, Y)
  17. # A tibble: 24 &#215; 7
  18. year B C D E X Y
  19. &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt;
  20. 1 2020 1 0.1 0.7 1.3 abc ab
  21. 2 2021 1.05 0.105 0.735 1.37 abc ab
  22. 3 2022 1.12 0.112 0.781 1.45 abc ab
  23. 4 2023 1.17 0.117 0.820 1.52 abc ab
  24. 5 2024 1.24 0.124 0.870 1.61 abc ab
  25. 6 2025 1.31 0.131 0.919 1.71 abc ab
  26. 7 2020 1 0.2 0.8 1.4 abc cd
  27. 8 2021 1.05 0.21 0.84 1.47 abc cd
  28. 9 2022 1.12 0.223 0.893 1.56 abc cd
  29. 10 2023 1.17 0.234 0.937 1.64 abc cd
  30. # ℹ 14 more rows

Update: simpler way:

  1. df2 %&gt;%
  2. add_row(year = 2020, B = 0, .before = 1) %&gt;%
  3. cross_join(df1) %&gt;%
  4. group_by(X, Y) %&gt;%
  5. mutate(across(c(Z, W, V), ~ . * cumprod(B + 1))) %&gt;%
  6. rename(&quot;C&quot; = Z, &quot;D&quot; = W, &quot;E&quot; = V) %&gt;%
  7. ungroup()

huangapple
  • 本文由 发表于 2023年7月10日 15:15:42
  • 转载请务必保留本文链接:https://go.coder-hub.com/76651465.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定