How to multiply the coefficients for each group and then calculate the percentage of the original value in R

Question

I have a dataset like the toy dataset below:

    s <- structure(list(
      month_id = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
      MDM_Key = c(1L, 1L, 1L, 2L, 2L, 2L),
      sale_count = c(NA, 19161L, NA, 17726L, NA, NA),
      iek_max_discount = c(0.5356, 0.5256, 0.5456, 0.559, 0.569, 0.589)
    ), class = "data.frame", row.names = c(NA, -6L))

and another dataset with coefficients:

    cf <- structure(list(
      MDM_Key = 1:2,
      ML.coef = c(1.46, 1.67)
    ), class = "data.frame", row.names = c(NA, -2L))

`iek_max_discount` is a percentage stored as a fraction, so 0.5356 means 53.56%.

For each `MDM_Key` (the grouping variable, from `cf`), I need to multiply its coefficient by the corresponding `iek_max_discount` value of each month after the current month (from the `s` dataset).

For example, suppose the current month is May (`month_id = 202305`). For `MDM_Key = 1`, we take the coefficient 1.46 and multiply it by the June value (`month_id = 202306`): 1.46 * 53.56 = 78 (decimals dropped).

The result is also a percentage. Next, calculate 78% of the initial `sale_count`, which is available only for the current month: 19161 / 100 * 78 = 14945 (decimals dropped again).

This amount is then added to 19161, and the sum (34106) is inserted into the June row.

July (`month_id = 202307`) is handled the same way: 1.46 * 54.56 = 79, and the proportion is 19161 / 100 * 79 = 15137.

This value is added to 19161, and the sum (34298) is inserted into the July row.
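
The arithmetic above can be checked with a minimal base R sketch for `MDM_Key = 1`. It assumes the fractional part is truncated at each step, which is what the figures 78 and 14945 suggest; the variable names `coef`, `base_sales` and `discount` are illustrative only and do not come from the datasets.

```r
# Worked example for MDM_Key = 1 (values copied from s and cf above).
coef       <- 1.46     # ML.coef for MDM_Key 1
base_sales <- 19161    # sale_count of the current month, 202305
discount   <- c(`202306` = 0.5356, `202307` = 0.5456)  # iek_max_discount

# Assumption: decimals are truncated at each step, as 78 and 14945 suggest.
percent <- trunc(coef * discount * 100)        # coefficient times discount, in percent
prop    <- trunc(base_sales / 100 * percent)   # that percentage of the base sales
final   <- base_sales + prop                   # value to insert into the future month

data.frame(month_id = names(discount), percent, prop, final)
```

If rounding rather than truncation is intended, `trunc()` would be swapped for `round()`, and the results would change accordingly.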

Desired output for `MDM_Key = 1`:

    month_id MDM_Key sale_count iek_max_discount percent  prop final
      202306       1                      0.5356      78 14945 34106
      202305       1      19161           0.5256
      202307       1                      0.5456      79 15137 34298

The same procedure is then applied to `MDM_Key = 2` (and to any other `MDM_Key` present).

What is the best and easiest way to perform this sequence of arithmetic operations?

Answer 1

Score: 1

Assuming the first month of each *MDM_Key* always has a *sale_count*:

    library(dplyr)  # .by requires dplyr >= 1.1.0

    merge(s, cf) %>%
      # sort so that the first row of each MDM_Key is its earliest month
      arrange(MDM_Key, month_id) %>%
      mutate(
        # no coefficient on the row that already has a sale_count
        ML.coef = if_else(!is.na(sale_count), NA, ML.coef),
        percent = iek_max_discount * ML.coef * 100,
        # sale_count[1] is the base value of the group's first month
        prop = (percent / 100) * sale_count[1],
        final = sale_count[1] + prop,
        .by = MDM_Key
      )

      MDM_Key month_id sale_count iek_max_discount ML.coef percent     prop    final
    1       1   202305      19161           0.5256      NA      NA       NA       NA
    2       1   202306         NA           0.5356    1.46 78.1976 14983.44 34144.44
    3       1   202307         NA           0.5456    1.46 79.6576 15263.19 34424.19
    4       2   202305      17726           0.5590      NA      NA       NA       NA
    5       2   202306         NA           0.5690    1.67 95.0230 16843.78 34569.78
    6       2   202307         NA           0.5890    1.67 98.3630 17435.83 35161.83

The values are not rounded here, so they differ slightly from the desired output in the question.
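
Answer 1 depends on `sale_count[1]`, so it only works if the earliest month of every `MDM_Key` actually has a `sale_count`. A minimal base R sketch of a check for that assumption follows; the helper names `s_sorted`, `first_row` and `keys_missing_base` are hypothetical, and only the three relevant columns of `s` are reproduced.

```r
# Toy data from the question (only the columns needed for the check).
s <- data.frame(
  month_id   = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key    = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA)
)

# Sort by key and month, then keep the earliest row of each MDM_Key.
s_sorted  <- s[order(s$MDM_Key, s$month_id), ]
first_row <- s_sorted[!duplicated(s_sorted$MDM_Key), ]

# Keys whose earliest month has no sale_count would break sale_count[1].
keys_missing_base <- first_row$MDM_Key[is.na(first_row$sale_count)]

if (length(keys_missing_base) > 0) {
  stop("No sale_count in the first month for MDM_Key: ",
       paste(keys_missing_base, collapse = ", "))
}
```

Running this before the pipeline turns a silent column of `NA` results into an explicit error.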
Answer 2

Score: 1

``` r
library(dplyr)  # .by requires dplyr >= 1.1.0

cur_month <- as.integer(202305)

s %>%
  right_join(cf, by = "MDM_Key") %>%
  mutate(
    # percentage for every month except the current one, decimals dropped
    percent = floor(if_else(month_id == cur_month,
                            NA, ML.coef * iek_max_discount * 100)),
    # that percentage of the current month's sale_count
    prop = floor(percent * sale_count[month_id == cur_month] / 100),
    final = sale_count[month_id == cur_month] + prop,
    .by = MDM_Key
  )
#>   month_id MDM_Key sale_count iek_max_discount ML.coef percent  prop final
#> 1   202306       1         NA           0.5356    1.46      78 14945 34106
#> 2   202305       1      19161           0.5256    1.46      NA    NA    NA
#> 3   202307       1         NA           0.5456    1.46      79 15137 34298
#> 4   202305       2      17726           0.5590    1.67      NA    NA    NA
#> 5   202306       2         NA           0.5690    1.67      95 16839 34565
#> 6   202307       2         NA           0.5890    1.67      98 17371 35097
```
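
Answer 2 hard-codes `cur_month`. As a hedged alternative, the sketch below derives it from the data, assuming that the current month is simply the latest month with a recorded `sale_count` and that it is the same for all keys. Neither assumption is stated in the question.

```r
# Toy data from the question (only the columns needed here).
s <- data.frame(
  month_id   = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key    = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA)
)

# Assumption: the current month is the latest month that has a sale_count.
cur_month <- max(s$month_id[!is.na(s$sale_count)])
cur_month
```

With the toy data this gives 202305, the value used in the answer.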

Answer 3

Score: 0

It was a little hard to understand what you were trying to do. Here is my best guess:

    # merge them together to have the key in the same row
    scf <- merge(s, cf, by = "MDM_Key")

    # split into a list of data frames, one per unique MDM_Key
    s_by_key <- split(scf, scf$MDM_Key)

    # loop through each element of the list
    for (i in seq_along(s_by_key)) {
      # trunc to remove the decimals
      s_by_key[[i]]$percent <- trunc(s_by_key[[i]]$iek_max_discount * s_by_key[[i]]$ML.coef * 100)

      # get the row of the most recent month that has sales
      months_with_sales <- which(!is.na(s_by_key[[i]]$sale_count))
      latest_month_with_sales <- months_with_sales[which.max(s_by_key[[i]]$month_id[months_with_sales])]
      last_sales <- s_by_key[[i]][latest_month_with_sales, "sale_count"]

      s_by_key[[i]]$prop <- trunc(last_sales / 100 * s_by_key[[i]]$percent)
      s_by_key[[i]]$final <- last_sales + s_by_key[[i]]$prop

      # and set that row to NA to be like the desired output
      s_by_key[[i]][latest_month_with_sales, c("percent", "prop", "final")] <- NA
    }

    output <- do.call(rbind, s_by_key)
    output
    #     MDM_Key month_id sale_count iek_max_discount ML.coef percent  prop final
    # 1.1       1   202306         NA           0.5356    1.46      78 14945 34106
    # 1.2       1   202305      19161           0.5256    1.46      NA    NA    NA
    # 1.3       1   202307         NA           0.5456    1.46      79 15137 34298
    # 2.4       2   202305      17726           0.5590    1.67      NA    NA    NA
    # 2.5       2   202306         NA           0.5690    1.67      95 16839 34565
    # 2.6       2   202307         NA           0.5890    1.67      98 17371 35097
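
The same steps can also be written without an explicit loop. The following is a minimal base R sketch, not part of the answer above; it assumes exactly one row per `MDM_Key` has a `sale_count`, and the helper names `base` and `base_sales` are hypothetical.

```r
s <- data.frame(
  month_id         = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key          = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count       = c(NA, 19161L, NA, 17726L, NA, NA),
  iek_max_discount = c(0.5356, 0.5256, 0.5456, 0.559, 0.569, 0.589)
)
cf <- data.frame(MDM_Key = 1:2, ML.coef = c(1.46, 1.67))

scf <- merge(s, cf, by = "MDM_Key")

# Base rows: the ones with a recorded sale_count.
# Assumption: exactly one such row per MDM_Key.
base       <- scf[!is.na(scf$sale_count), c("MDM_Key", "sale_count")]
base_sales <- base$sale_count[match(scf$MDM_Key, base$MDM_Key)]

# Same arithmetic as in the loop, applied to whole columns at once.
scf$percent <- trunc(scf$iek_max_discount * scf$ML.coef * 100)
scf$prop    <- trunc(base_sales / 100 * scf$percent)
scf$final   <- base_sales + scf$prop

# Blank out the derived columns on the base rows themselves.
scf[!is.na(scf$sale_count), c("percent", "prop", "final")] <- NA
scf
```

`match()` looks up each row's base `sale_count` by key, which replaces the per-group bookkeeping done inside the loop.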

huangapple
  • Published on May 6, 2023, 19:03:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76188516.html