使用累加函数accumulate和/或cumsum来正确进行投影。

huangapple go评论62阅读模式
英文:

Using accumulate and/or cumsum properly for projections

问题

我需要投影df1中每行的Z、W、Y值。为此,我将使用df2中的B的增长率。

df1:
X	Y	year	Z	W	V
abc	ab	2020	0.1	0.7	1.3
abc	cd	2020	0.2	0.8	1.4
efg	ef	2020	0.3	0.9	1.5
efg	gh	2020	0.4	1	1.6

df2:
year	B	    
2021	0.05	
2022	0.063	
2023	0.049	
2024	0.061	
2025	0.057	

我理解可以通过交叉连接来解决这个问题。然而,在此之前,我需要确保前一个值用于下一次计算。

我期望得到类似这样的结果:

从df1中提取
X	Y	year	Z	   W	    V
abc	ab	2020	0.1	   0.7	  1.3

df3: (已编辑)
year	B	    C	    D	    E	    X	Y
2020	0	    0.1000	0.7000	1.3000	abc	ab 
2021	0.05	0.1050	0.7350	1.3650	abc	ab 
2022	0.063	0.1116	0.7813	1.4510	abc	ab 
2023	0.049	0.1171	0.8196	1.5221	abc	ab 
2024	0.061	0.1242	0.8696	1.6149	abc	ab 
2025	0.057	0.1313	0.9192	1.7070	abc	ab  

df3的公式 (已编辑)
1    year	B	    C	            D	            E
2    2020	0	    0.1	            0.7	            1.3
3    2021	0.05	=+C2*($B$3+1)	=+D2*($B$3+1)	=+E2*($B$3+1)
4    2022	0.063	=+C3*($B$4+1)	=+D3*($B$4+1)	=+E3*($B$4+1)
5    2023	0.049	=+C4*($B$5+1)	=+D4*($B$5+1)	=+E4*($B$5+1)
6    2024	0.061	=+C5*($B$6+1)	=+D5*($B$6+1)	=+E5*($B$6+1)
7    2025	0.057	=+C6*($B$7+1)	=+D6*($B$7+1)	=+E6*($B$7+1)

要获得df3,我尝试过以下方法:

df3 <- df2 %>%
  mutate(example2 = accumulate(B, ~ 0.1 * ( .x +1)))

df3 <- df2 %>%
  mutate(example2 = cumsum(accumulate(B, ~ 0.1 * ( .x +1))))

结果仍然不符合预期。

最后,我需要将其传递给交叉连接和跨函数,类似于以下方式:

df3 <- mutate(cross_join(df1[-3], df2), across(Z:V)*(H), H = NULL)
英文:

I need to project each Z, W, Y value of each row in df1. To do this, I will use the growth rates in B from df2.

df1:
X	 Y	year 	Z	   W	  V
abc	ab 	2020	0.1	 0.7	1.3
abc	cd 	2020	0.2	 0.8	1.4
efg	ef 	2020	0.3	 0.9	1.5
efg	gh 	2020	0.4	 1	    1.6
df2:
year	B	    
2021	0.05	
2022	0.063	
2023	0.049	
2024	0.061	
2025	0.057	

I understand that a cross join might solve the problem. However, before that I need to be sure that the precious value is used for the next calculation.

I expect something like this:

from df1:
X	 Y	year 	Z	   W	  V
abc	ab 	2020	0.1	   0.7	  1.3
df3: (edited)
year	B	    C	    D	    E	    X	Y
2020	0	    0.1000	0.7000	1.3000	abc	ab 
2021	0.05	0.1050	0.7350	1.3650	abc	ab 
2022	0.063	0.1116	0.7813	1.4510	abc	ab 
2023	0.049	0.1171	0.8196	1.5221	abc	ab 
2024	0.061	0.1242	0.8696	1.6149	abc	ab 
2025	0.057	0.1313	0.9192	1.7070	abc	ab  
Formulas for df3 (edited)
1    year	B	    C	            D	            E
2    2020	0	    0.1	            0.7	            1.3
3    2021	0.05	=+C2*($B$3+1)	=+D2*($B$3+1)	=+E2*($B$3+1)
4    2022	0.063	=+C3*($B$4+1)	=+D3*($B$4+1)	=+E3*($B$4+1)
5    2023	0.049	=+C4*($B$5+1)	=+D4*($B$5+1)	=+E4*($B$5+1)
6    2024	0.061	=+C5*($B$6+1)	=+D5*($B$6+1)	=+E5*($B$6+1)
7    2025	0.057	=+C6*($B$7+1)	=+D6*($B$7+1)	=+E6*($B$7+1)

To get df3, I have tried:

df3 &lt;- df2 %&gt;% mutate(example2 = accumulate(B, ~ 0.1 * ( .x +1)))
df3 &lt;- df2 %&gt;% mutate(example2 = cumsum(accumulate(B, ~ 0.1 * ( .x +1))))

The outcome is still not the expected.

At the end, I would need to pass it into the cross join and across functions, something like this:

df3 &lt;- mutate(cross_join(df1[-3], df2), across(Z:V)*(H), H = NULL)

答案1

得分: 2

以下是翻译好的内容:

首先,我们对df2进行了一些操作。可能有更简单的方法(但不幸的是,总是有):

# 首先,我们对df2进行一些操作
# 1. 我们添加了一个名为2020的列
# 2. 我们将利率转换为1 + 利率(以使累积乘积函数起作用)
# 3. 我们使用cumprod函数将B列转换为B + 1的累积乘积
df2 <- df2 %>%
  add_row(year = 2020, B = 0, .before = 1) %>%
  mutate(B = cumprod(B + 1))

# 从df1中删除年份,因为它是不必要的,而且我们以后肯定还得删除它
df1 <- df1 %>%
    select(-year)

# df3
cross_join(df1, df2) %>%
    mutate(C = Z * B,
        D = W * B,
        E = V * B) %>%
    select(year, B, C, D, E, X, Y)

# 一个tibble:24 × 7
    year     B     C     D     E X     Y    
   <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
 1  2020  1    0.1   0.7    1.3  abc   ab   
 2  2021  1.05 0.105 0.735  1.37 abc   ab   
 3  2022  1.12 0.112 0.781  1.45 abc   ab   
 4  2023  1.17 0.117 0.820  1.52 abc   ab   
 5  2024  1.24 0.124 0.870  1.61 abc   ab   
 6  2025  1.31 0.131 0.919  1.71 abc   ab   
 7  2020  1    0.2   0.8    1.4  abc   cd   
 8  2021  1.05 0.21  0.84   1.47 abc   cd   
 9  2022  1.12 0.223 0.893  1.56 abc   cd   
10  2023  1.17 0.234 0.937  1.64 abc   cd   
# ℹ 还有14行

更新:更简单的方法:

df2 %>%
    add_row(year = 2020, B = 0, .before = 1) %>%
    cross_join(df1) %>%
    group_by(X, Y) %>%
    mutate(across(c(Z, W, V), ~ . * cumprod(B + 1))) %>%
    rename("C" = Z, "D" = W, "E" = V) %>%
    ungroup()

希望这可以帮助您理解代码的内容。

英文:

Here is a first take on doing it. There's likely a simpler way (but alas, there always is):

# first, we do a few things to df2
# 1. we add a 2020 column
# 2. we transform the interest rate to 1 + the interest rate (to enable the cumulative product function to work
# 3. we use the cumprod function to turn the B column into the cumulative products of B + 1
df2 &lt;- df2 %&gt;% 
add_row(year = 2020, B = 0, .before = 1) %&gt;%
mutate(B = cumprod(B + 1))
# remove the year from df1, because it&#39;s unnecessary, and we&#39;d have to remove it later anyway
df1 &lt;- df1 %&gt;%
select(-year)
# df3
cross_join(df1, df2) %&gt;%
mutate(C = Z * B,
D = W * B,
E = V * B) %&gt;%
select(year, B, C, D, E, X, Y)
# A tibble: 24 &#215; 7
year     B     C     D     E X     Y    
&lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt;
1  2020  1    0.1   0.7    1.3  abc   ab   
2  2021  1.05 0.105 0.735  1.37 abc   ab   
3  2022  1.12 0.112 0.781  1.45 abc   ab   
4  2023  1.17 0.117 0.820  1.52 abc   ab   
5  2024  1.24 0.124 0.870  1.61 abc   ab   
6  2025  1.31 0.131 0.919  1.71 abc   ab   
7  2020  1    0.2   0.8    1.4  abc   cd   
8  2021  1.05 0.21  0.84   1.47 abc   cd   
9  2022  1.12 0.223 0.893  1.56 abc   cd   
10  2023  1.17 0.234 0.937  1.64 abc   cd   
# ℹ 14 more rows

Update: simpler way:

df2 %&gt;%
add_row(year = 2020, B = 0, .before = 1) %&gt;%
cross_join(df1) %&gt;%
group_by(X, Y) %&gt;%
mutate(across(c(Z, W, V), ~ . * cumprod(B + 1))) %&gt;%
rename(&quot;C&quot; = Z, &quot;D&quot; = W, &quot;E&quot; = V) %&gt;%
ungroup()

huangapple
  • 本文由 发表于 2023年7月10日 15:15:42
  • 转载请务必保留本文链接:https://go.coder-hub.com/76651465.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定