May 22, 2023, 21:44:44

Apply a growth rate to a given initial value conditionally within a data frame

Question

Apologies if it's a basic question, but as a beginner I simply cannot figure this one out. I tried to find a similar problem but was unsuccessful.

I have the following data frame, in which the growth rate differs for every product and region combination and is fixed for 10 years (I left_join it from another data frame).

I only have sales for the current year and want to project 10 years ahead using the given growth rate, to produce the results shown in sales_expected.

product	region	year	growth	sales	sales_expected
productA	regionA	2023	0.1	100	100
productA	regionA	2024	0.1		110
productA	regionA	2025	0.1		121
productA	regionA	2026	0.1		133
productA	regionA	2027	0.1		146
productA	regionA	2028	0.1		161
productA	regionA	2029	0.1		177
productA	regionA	2030	0.1		195
productA	regionA	2031	0.1		214
productA	regionA	2032	0.1		236
productA	regionB	2023	0.06	100	100
productA	regionB	2024	0.06		106
productA	regionB	2025	0.06		112
...	...	...	...

Within my pipeline I tried to use accumulate from the purrr package:

mutate(sales_expected = accumulate(sales, fill_in))

This is combined with a fill_in function, which looks for the next NA in the sales column and applies the growth rate:


fill_in <- function(prev, new, growth) {
  growth_rate <- 0.1
  if_else(!is.na(new), new, prev * (1 + growth_rate))
}

This works, but I cannot figure out how to set growth_rate conditionally inside this function: it should be picked up from the growth column instead of being typed in manually. The hard-coded value obviously breaks as soon as new product and region combinations with different growth rates are added to the data frame.

Thanks so much for your help!

Answer 1

Score: 0

You can do this with group_by() and cumprod(). cumprod(growth + 1) compounds the growth rates; I used lag() to set the growth for the first year to 0, so that the first row stays equal to the baseline.

dat |> 
  group_by(product, region) |> 
  mutate(sales_projected = sales[1] * cumprod(lag(growth, default = 0)+1))

# A tibble: 13 × 7
# Groups:   product, region [2]
   product  region   year growth sales sales_expected sales_projected
   <chr>    <chr>   <int>  <dbl> <int>          <int>           <dbl>
 1 productA regionA  2023   0.1    100            100            100 
 2 productA regionA  2024   0.1     NA            110            110 
 3 productA regionA  2025   0.1     NA            121            121 
 4 productA regionA  2026   0.1     NA            133            133.
 5 productA regionA  2027   0.1     NA            146            146.
 6 productA regionA  2028   0.1     NA            161            161.
 7 productA regionA  2029   0.1     NA            177            177.
 8 productA regionA  2030   0.1     NA            195            195.
 9 productA regionA  2031   0.1     NA            214            214.
10 productA regionA  2032   0.1     NA            236            236.
11 productA regionB  2023   0.06   100            100            100 
12 productA regionB  2024   0.06    NA            106            106 
13 productA regionB  2025   0.06    NA            112            112.
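
The answer does not show how dat was built, so here is a minimal self-contained sketch of the cumprod approach. The tibble below is a hypothetical reconstruction of the question's table, and the column name multiplier is introduced only for illustration. It assumes the rows are already sorted by year within each group and that the first row of each group holds the known sales value.

```r
library(dplyr)

# Hypothetical reconstruction of the question's data: two product/region
# groups, with sales known only for the first year of each group.
dat <- tibble(
  product = "productA",
  region  = rep(c("regionA", "regionB"), times = c(10, 3)),
  year    = c(2023:2032, 2023:2025),
  growth  = rep(c(0.1, 0.06), times = c(10, 3)),
  sales   = c(100, rep(NA, 9), 100, rep(NA, 2))
)

projected <- dat |>
  group_by(product, region) |>
  mutate(
    # lag() shifts growth down by one row, so the base year gets a factor of 1
    multiplier      = cumprod(1 + lag(growth, default = 0)),
    # sales[1] is the base-year value of the current group
    sales_projected = sales[1] * multiplier
  ) |>
  ungroup()

# Rounded projections, for comparison with sales_expected in the question
print(round(projected$sales_projected))
```

If the rows might not be in year order, adding arrange(product, region, year) before group_by() would make the sales[1] assumption safer.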

If you want to use the accumulate functions, I think you need accumulate2(), like this:

dat |> 
  group_by(product, region) |> 
  mutate(sales_expected = accumulate2(sales, growth[-1], \(x,y,g) x*(1+g)))

# A tibble: 13 × 7
# Groups:   product, region [2]
   product  region   year growth sales sales_expected sales_projected
   <chr>    <chr>   <int>  <dbl> <int>          <int>           <dbl>
 1 productA regionA  2023   0.1    100            100            100 
 2 productA regionA  2024   0.1     NA            110            110 
 3 productA regionA  2025   0.1     NA            121            121 
 4 productA regionA  2026   0.1     NA            133            133.
 5 productA regionA  2027   0.1     NA            146            146.
 6 productA regionA  2028   0.1     NA            161            161.
 7 productA regionA  2029   0.1     NA            177            177.
 8 productA regionA  2030   0.1     NA            195            195.
 9 productA regionA  2031   0.1     NA            214            214.
10 productA regionA  2032   0.1     NA            236            236.
11 productA regionB  2023   0.06   100            100            100 
12 productA regionB  2024   0.06    NA            106            106 
13 productA regionB  2025   0.06    NA            112            112.
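
To stay closer to the fill_in function from the question, the growth rate can be passed in as the third argument that accumulate2() supplies. The following is a sketch under assumptions, not the answerer's exact code: the small tibble and the column name sales_filled are hypothetical, and the result is wrapped in unlist() because, as far as I know, accumulate2() returns a list rather than an atomic vector (the wrapper is harmless either way).

```r
library(dplyr)
library(purrr)

# Hypothetical data: two groups of three years, sales known for the first year
dat <- tibble(
  product = "productA",
  region  = rep(c("regionA", "regionB"), each = 3),
  year    = rep(2023:2025, times = 2),
  growth  = rep(c(0.1, 0.06), each = 3),
  sales   = c(100, NA, NA, 100, NA, NA)
)

# Same rule as fill_in() in the question, but the rate is an argument:
# keep an observed value, otherwise grow the previous value.
fill_in <- function(prev, new, growth) {
  if (!is.na(new)) new else prev * (1 + growth)
}

result <- dat |>
  group_by(product, region) |>
  mutate(
    # growth[-1] is one element shorter than sales, as accumulate2() requires
    sales_filled = unlist(accumulate2(sales, growth[-1], fill_in))
  ) |>
  ungroup()

print(result$sales_filled)
```

Unlike the cumprod version, this keeps any later non-missing sales value and continues compounding from it, which matches the intent of the original fill_in.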
