How to multiply the coefficients for each group and then calculate the percentage of the original value in R
Question
I have a dataset like the toy dataset below:
s <- structure(list(
  month_id = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA),
  iek_max_discount = c(0.5356, 0.5256, 0.5456, 0.559, 0.569, 0.589)
), class = "data.frame", row.names = c(NA, -6L))
and another dataset with coefficients:
cf <- structure(list(
  MDM_Key = 1:2,
  ML.coef = c(1.46, 1.67)
), class = "data.frame", row.names = c(NA, -2L))
iek_max_discount is a percentage stored as a fraction, that is, 0.5356 means 53.56%.
For each MDM_Key (the grouping variable, from cf), I need to multiply its coefficient by the corresponding iek_max_discount value (from the s dataset) for each month that follows the current month.
For example, suppose the current month is May (month_id = 202305). For MDM_Key = 1, we take the coefficient 1.46 and multiply it by the June value (month_id = 202306): 1.46 * 53.56 = 78.1976, taken as 78.
The resulting value is also a percentage. Next, calculate 78% of the initial sale_count, which is available only for the current month: 19161 / 100 * 78 = 14945.58, taken as 14945.
This result is then added to 19161, and the sum, 34106, is inserted into the June row.
The same is done for July (month_id = 202307). Multiply by the coefficient: 1.46 * 54.56 = 79.6576, taken as 79. Calculate the proportion: 19161 / 100 * 79 = 15137.19, taken as 15137.
Adding this value to 19161 gives 34298, which is inserted into the July row.
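The arithmetic described above can be reproduced in a few lines of base R. This is only a sketch for MDM_Key = 1: the variable names are illustrative, and the use of `floor()` is an assumption inferred from the whole-number figures quoted in the question (78, 14945, and so on).

```r
# Worked example for MDM_Key = 1, using the numbers quoted in the question.
coef      <- 1.46                  # ML.coef for MDM_Key = 1
base_sale <- 19161                 # sale_count in the current month (202305)
months    <- c(202306L, 202307L)   # months to fill in
discount  <- c(0.5356, 0.5456)     # iek_max_discount for those months

# Assumption: decimals are dropped at each step, as the question's figures suggest
percent <- floor(coef * discount * 100)       # 78, 79
prop    <- floor(base_sale / 100 * percent)   # 14945, 15137
final   <- base_sale + prop                   # 34106, 34298

data.frame(month_id = months, percent, prop, final)
```

If the decimals should be kept instead, dropping the two `floor()` calls gives the unrounded values.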
For example, the desired output for MDM_Key = 1:
month_id  MDM_Key  sale_count  iek_max_discount  percent  prop   final
202306    1        NA          0.5356            78       14945  34106
202305    1        19161       0.5256            NA       NA     NA
202307    1        NA          0.5456            79       15137  34298
The same procedure is then applied to MDM_Key = 2 (and to any other MDM_Key present).
What is the best and easiest way to do such a sequence of arithmetic operations?
Answer 1
Score: 1
Assuming that, for each *MDM_Key*, the first month is always the one that has a *sale_count*:
library(dplyr)  # the .by argument needs dplyr >= 1.1.0
merge(s, cf) %>%
  # sort so that the month holding sale_count comes first within each key
  arrange(MDM_Key, month_id) %>%
  # blank the coefficient on the current-month row so its results stay NA
  mutate(ML.coef = if_else(!is.na(sale_count), NA, ML.coef),
         percent = iek_max_discount * ML.coef * 100,
         prop = (percent / 100) * sale_count[1],
         final = sale_count[1] + prop, .by = MDM_Key)
  MDM_Key month_id sale_count iek_max_discount ML.coef percent     prop    final
1       1   202305      19161           0.5256      NA      NA       NA       NA
2       1   202306         NA           0.5356    1.46 78.1976 14983.44 34144.44
3       1   202307         NA           0.5456    1.46 79.6576 15263.19 34424.19
4       2   202305      17726           0.5590      NA      NA       NA       NA
5       2   202306         NA           0.5690    1.67 95.0230 16843.78 34569.78
6       2   202307         NA           0.5890    1.67 98.3630 17435.83 35161.83
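Because this answer relies on `sale_count[1]`, it may be worth checking the stated assumption before running it. The following base R sketch is one possible check, not part of the original answer; `s_sorted`, `first_sale`, and `bad_keys` are illustrative names.

```r
# Toy data from the question
s <- data.frame(
  month_id = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA),
  iek_max_discount = c(0.5356, 0.5256, 0.5456, 0.559, 0.569, 0.589)
)

# Sort the same way the answer does, then take the first sale_count per key
s_sorted <- s[order(s$MDM_Key, s$month_id), ]
first_sale <- tapply(s_sorted$sale_count, s_sorted$MDM_Key, function(x) x[1])

# Keys whose earliest month has no sale_count would silently yield NA results
bad_keys <- names(first_sale)[is.na(first_sale)]
if (length(bad_keys) > 0) {
  stop("No sale_count in the first month for MDM_Key: ",
       paste(bad_keys, collapse = ", "))
}
first_sale
```

On the toy data the check passes and `first_sale` holds one base value per key.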
# Answer 2
**Score**: 1
``` r
library(dplyr)

# the month that holds the known sale_count
cur_month <- 202305L

s %>%
  right_join(cf, by = "MDM_Key") %>%
  mutate(percent = floor(if_else(month_id == cur_month,
                                 NA, ML.coef * iek_max_discount * 100)),
         prop = floor(percent * sale_count[month_id == cur_month] / 100),
         final = sale_count[month_id == cur_month] + prop,
         .by = MDM_Key)
#>   month_id MDM_Key sale_count iek_max_discount ML.coef percent  prop final
#> 1   202306       1         NA           0.5356    1.46      78 14945 34106
#> 2   202305       1      19161           0.5256    1.46      NA    NA    NA
#> 3   202307       1         NA           0.5456    1.46      79 15137 34298
#> 4   202305       2      17726           0.5590    1.67      NA    NA    NA
#> 5   202306       2         NA           0.5690    1.67      95 16839 34565
#> 6   202307       2         NA           0.5890    1.67      98 17371 35097
```
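The answer above hard-codes `cur_month`. As a small sketch, it could instead be derived from the data, under the assumption (not stated in the question) that the current month is the latest month with a recorded `sale_count`.

```r
# Toy data from the question
s <- data.frame(
  month_id = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA),
  iek_max_discount = c(0.5356, 0.5256, 0.5456, 0.559, 0.569, 0.589)
)

# Assumed rule: the current month is the latest month that has a sale_count
cur_month <- max(s$month_id[!is.na(s$sale_count)])
cur_month
```

This yields a single month for the whole dataset, so it only fits if every MDM_Key shares the same current month.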
Answer 3
Score: 0
It was a little hard to understand what you were trying to do. Here is my best guess:
# merge them together to have the coefficient on the same row as its key
scf <- merge(s, cf, by = "MDM_Key")
# split into a list of data frames, one per unique MDM_Key
s_by_key <- split(scf, scf$MDM_Key)
# loop through each element of the list
for (i in seq_along(s_by_key)) {
  grp <- s_by_key[[i]]
  # trunc to remove the decimals
  grp$percent <- trunc(grp$iek_max_discount * grp$ML.coef * 100)
  # get the row of the most recent month that has sales
  months_with_sales <- which(!is.na(grp$sale_count))
  latest_month_with_sales <- months_with_sales[which.max(grp$month_id[months_with_sales])]
  last_sales <- grp[latest_month_with_sales, "sale_count"]
  grp$prop <- trunc(last_sales / 100 * grp$percent)
  grp$final <- last_sales + grp$prop
  # and set that row to NA to be like the desired output
  grp[latest_month_with_sales, c("percent", "prop", "final")] <- NA
  s_by_key[[i]] <- grp
}
output <- do.call(rbind, s_by_key)
output
#     MDM_Key month_id sale_count iek_max_discount ML.coef percent  prop final
#1.1       1   202306         NA           0.5356    1.46      78 14945 34106
#1.2       1   202305      19161           0.5256    1.46      NA    NA    NA
#1.3       1   202307         NA           0.5456    1.46      79 15137 34298
#2.4       2   202305      17726           0.5590    1.67      NA    NA    NA
#2.5       2   202306         NA           0.5690    1.67      95 16839 34565
#2.6       2   202307         NA           0.5890    1.67      98 17371 35097
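The loop above could also be written without splitting, using `ave()` to spread each key's base value over its rows. This is a sketch rather than the answerer's method, and it assumes each MDM_Key has exactly one row with a non-missing `sale_count`; the `base` column and `is_cur` are illustrative names.

```r
# Toy data from the question
s <- data.frame(
  month_id = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA),
  iek_max_discount = c(0.5356, 0.5256, 0.5456, 0.559, 0.569, 0.589)
)
cf <- data.frame(MDM_Key = 1:2, ML.coef = c(1.46, 1.67))

scf <- merge(s, cf, by = "MDM_Key")

# Assumption: one non-NA sale_count per key; repeat it across the key's rows
scf$base <- ave(scf$sale_count, scf$MDM_Key,
                FUN = function(x) x[!is.na(x)][1])

# Rows that already hold a sale_count are the current month and stay NA
is_cur <- !is.na(scf$sale_count)
scf$percent <- ifelse(is_cur, NA,
                      trunc(scf$iek_max_discount * scf$ML.coef * 100))
scf$prop  <- trunc(scf$base / 100 * scf$percent)
scf$final <- scf$base + scf$prop
scf
```

The vectorized form avoids the per-group bookkeeping of the loop, at the cost of the stricter one-row-per-key assumption.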