Apply a growth rate to a given initial value conditionally within a data frame
Question
Apologies if it's a basic question, but as a beginner I simply cannot figure this one out. I tried to find a similar problem but was unsuccessful.
I have the following data frame, in which the growth rate differs for every product and region combination and is fixed for 10 years (I `left_join` it from another data frame).
I only have sales for the current year and want to project 10 years ahead using the given growth rate, to produce the results shown in `sales_expected`.
product | region | year | growth | sales | sales_expected |
---|---|---|---|---|---|
productA | regionA | 2023 | 0.1 | 100 | 100 |
productA | regionA | 2024 | 0.1 | | 110 |
productA | regionA | 2025 | 0.1 | | 121 |
productA | regionA | 2026 | 0.1 | | 133 |
productA | regionA | 2027 | 0.1 | | 146 |
productA | regionA | 2028 | 0.1 | | 161 |
productA | regionA | 2029 | 0.1 | | 177 |
productA | regionA | 2030 | 0.1 | | 195 |
productA | regionA | 2031 | 0.1 | | 214 |
productA | regionA | 2032 | 0.1 | | 236 |
productA | regionB | 2023 | 0.06 | 100 | 100 |
productA | regionB | 2024 | 0.06 | | 106 |
productA | regionB | 2025 | 0.06 | | 112 |
... | ... | ... | ... |
Within my pipeline I tried to use `accumulate` from the purrr package:
mutate(sales_expected = accumulate(sales, fill_in))
in combination with a `fill_in` function, which looks for the next `NA` in the `sales` column and applies the growth rate:
fill_in <- function(prev, new, growth) {
  growth_rate <- 0.1
  if_else(!is.na(new), new, prev * (1 + growth_rate))
}
This works, but I cannot figure out how to set `growth_rate` inside this function conditionally, i.e. it should be taken from the `growth` column rather than typed in manually. A hard-coded rate obviously does not work when new product and region combinations, each with a different growth rate, are added to the data frame.
Thanks so much for your help!
Answer 1
Score: 0
You could do it with `group_by` and `cumprod`. `cumprod(growth + 1)` compounds the growth rates; I used `lag` to set the growth for the first year to 0, so that the first year equals the baseline.
dat |>
group_by(product, region) |>
mutate(sales_projected = sales[1] * cumprod(lag(growth, default = 0)+1))
# A tibble: 13 × 7
# Groups: product, region [2]
product region year growth sales sales_expected sales_projected
<chr> <chr> <int> <dbl> <int> <int> <dbl>
1 productA regionA 2023 0.1 100 100 100
2 productA regionA 2024 0.1 NA 110 110
3 productA regionA 2025 0.1 NA 121 121
4 productA regionA 2026 0.1 NA 133 133.
5 productA regionA 2027 0.1 NA 146 146.
6 productA regionA 2028 0.1 NA 161 161.
7 productA regionA 2029 0.1 NA 177 177.
8 productA regionA 2030 0.1 NA 195 195.
9 productA regionA 2031 0.1 NA 214 214.
10 productA regionA 2032 0.1 NA 236 236.
11 productA regionB 2023 0.06 100 100 100
12 productA regionB 2024 0.06 NA 106 106
13 productA regionB 2025 0.06 NA 112 112.
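For anyone who wants to reproduce this, here is a minimal self-contained sketch. The construction of `dat` is an assumption reconstructed from the table in the question (the original post does not show how it was built), and the `arrange()` step is an extra safeguard so that the base year is the first row of each group.

```r
library(dplyr)

# Hypothetical reconstruction of `dat`: only the base year (2023) has sales.
dat <- data.frame(
  product = "productA",
  region  = rep(c("regionA", "regionB"), times = c(10, 3)),
  year    = c(2023:2032, 2023:2025),
  growth  = rep(c(0.1, 0.06), times = c(10, 3)),
  sales   = c(100, rep(NA, 9), 100, rep(NA, 2))
)

projected <- dat |>
  group_by(product, region) |>
  arrange(year, .by_group = TRUE) |>   # make sure the base year comes first
  mutate(
    # lag() shifts growth down by one row, so the base year gets factor 1
    sales_projected = sales[1] * cumprod(lag(growth, default = 0) + 1)
  ) |>
  ungroup()

print(projected, n = 13)
```

Because each group is multiplied by its own `growth` values, new product and region combinations need no change to the code.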
If you want to use the `accumulate` functions, I think you need `accumulate2()`, like this:
dat |>
group_by(product, region) |>
mutate(sales_expected = accumulate2(sales, growth[-1], \(x,y,g) x*(1+g)))
# A tibble: 13 × 7
# Groups: product, region [2]
product region year growth sales sales_expected sales_projected
<chr> <chr> <int> <dbl> <int> <int> <dbl>
1 productA regionA 2023 0.1 100 100 100
2 productA regionA 2024 0.1 NA 110 110
3 productA regionA 2025 0.1 NA 121 121
4 productA regionA 2026 0.1 NA 133 133.
5 productA regionA 2027 0.1 NA 146 146.
6 productA regionA 2028 0.1 NA 161 161.
7 productA regionA 2029 0.1 NA 177 177.
8 productA regionA 2030 0.1 NA 195 195.
9 productA regionA 2031 0.1 NA 214 214.
10 productA regionA 2032 0.1 NA 236 236.
11 productA regionB 2023 0.06 100 100 100
12 productA regionB 2024 0.06 NA 106 106
13 productA regionB 2025 0.06 NA 112 112.
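If you would rather keep the shape of the `fill_in` helper from the question, a sketch along the following lines might work. The rewritten `fill_in` and the small `dat` are illustrative assumptions, not code from the original post. `accumulate2()` passes the running value, the next element of `sales`, and the matching element of `growth[-1]` to the function, so the rate comes from the column instead of being hard-coded. The `unlist()` call is a precaution in case the installed purrr version returns a list rather than a simplified vector.

```r
library(dplyr)
library(purrr)

# Hypothetical variant of the asker's fill_in(): keep a known sales value,
# otherwise grow the previous value by that row's growth rate.
fill_in <- function(prev, new, growth) {
  if (is.na(new)) prev * (1 + growth) else new
}

# Small illustrative data set (an assumption, shortened to three years).
dat <- tibble(
  product = "productA",
  region  = rep(c("regionA", "regionB"), each = 3),
  year    = rep(2023:2025, times = 2),
  growth  = rep(c(0.1, 0.06), each = 3),
  sales   = rep(c(100, NA, NA), times = 2)
)

result <- dat |>
  group_by(product, region) |>
  mutate(
    # growth[-1] is one element shorter than sales, as accumulate2() requires
    sales_expected = unlist(accumulate2(sales, growth[-1], fill_in))
  ) |>
  ungroup()

print(result)
```

Unlike the `cumprod` version, this keeps any later year that already has an observed `sales` value and continues compounding from there.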