2023-05-06 19:03:48

How to multiply the coefficients for each group and then calculate the percentage of the original value in R

Question

I have a dataset like the toy dataset below:

    s = structure(list(month_id = c(202306L, 202305L, 202307L, 202305L,
        202306L, 202307L), MDM_Key = c(1L, 1L, 1L, 2L, 2L, 2L), sale_count = c(NA,
        19161L, NA, 17726L, NA, NA), iek_max_discount = c(0.5356, 0.5256,
        0.5456, 0.559, 0.569, 0.589)), class = "data.frame", row.names = c(NA,
        -6L))

and another dataset with coefficients:

    cf = structure(list(MDM_Key = 1:2, ML.coef = c(1.46, 1.67)), class = "data.frame", row.names = c(NA,
        -2L))

`iek_max_discount` is a percentage stored as a fraction, so 0.5356 means 53.56%.

For each `MDM_Key` (the grouping variable), I need to multiply its coefficient from `cf` by the corresponding `iek_max_discount` value in `s` for each month after the current month.

For example, the current month is May (`month_id = 202305`). For `MDM_Key = 1`, we take the coefficient 1.46 and multiply it by the June value (`month_id = 202306`): `1.46 * 53.56 = 78` (78.1976 with the decimals dropped).

The result is also a percentage. Next, calculate 78% of the initial `sale_count`, which is available only for the current month: `19161 / 100 * 78 = 14945`.

That result is then added to 19161, and the sum (34106) is inserted into June.

We do the same for July (`month_id = 202307`): multiply by the coefficient, `1.46 * 54.56 = 79`, then calculate the proportion, `19161 / 100 * 79 = 15137`.

We add that value to 19161 and insert the result (34298) into July.
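The arithmetic described above can be checked with a few lines of base R. This is only a sketch for `MDM_Key = 1`; the variable names (`coef`, `discount`, `base_sales`) are introduced here for illustration, and `trunc()` is an assumption about how the decimals are meant to be dropped.

```r
# Worked example for MDM_Key = 1, with May (202305) as the current month.
coef       <- 1.46                 # ML.coef for MDM_Key = 1
months     <- c(202306L, 202307L)  # the months after the current one
discount   <- c(0.5356, 0.5456)    # iek_max_discount for June and July
base_sales <- 19161                # sale_count observed in May

# Coefficient times the discount expressed in percent, decimals dropped.
percent <- trunc(coef * discount * 100)
# The part of the May sale_count that corresponds to that percentage.
prop <- trunc(base_sales / 100 * percent)
# The value to be written into each future month.
final <- base_sales + prop

data.frame(month_id = months, percent, prop, final)
```

The `final` column should agree with the 34106 and 34298 quoted in the question.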

For example, the desired output for `MDM_Key = 1`:

    month_id  MDM_Key  sale_count  iek_max_discount  percent   prop  final
    202306          1                        0.5356       78  14945  34106
    202305          1       19161            0.5256
    202307          1                        0.5456       79  15137  34298

The same procedure is then repeated for `MDM_Key = 2` (and any other `MDM_Key`, if present).

What is the best and easiest way to do such a sequence of arithmetic operations?

Answer 1

Score: 1

Assuming there is always a first month that has a *sale_count* per *MDM_Key*:

    library(dplyr)

    merge(s, cf) %>%
      arrange(MDM_Key, month_id) %>%
      mutate(ML.coef = if_else(!is.na(sale_count), NA, ML.coef),
             percent = iek_max_discount * ML.coef * 100,
             prop = (percent / 100) * sale_count[1],
             final = sale_count[1] + prop, .by = MDM_Key)

      MDM_Key month_id sale_count iek_max_discount ML.coef percent     prop    final
    1       1   202305      19161           0.5256      NA      NA       NA       NA
    2       1   202306         NA           0.5356    1.46 78.1976 14983.44 34144.44
    3       1   202307         NA           0.5456    1.46 79.6576 15263.19 34424.19
    4       2   202305      17726           0.5590      NA      NA       NA       NA
    5       2   202306         NA           0.5690    1.67 95.0230 16843.78 34569.78
    6       2   202307         NA           0.5890    1.67 98.3630 17435.83 35161.83
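The output above keeps the decimals, so `final` comes out as 34144.44 rather than the 34106 asked for, and `sale_count[1]` relies on the current month sorting first within each group. One possible variant is sketched below. It assumes dplyr 1.1.0 or later (for `.by`) and exactly one non-missing `sale_count` per `MDM_Key`; the column `base_sales` is a name introduced here, not part of the original answer.

```r
library(dplyr)

s <- data.frame(
  month_id = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA),
  iek_max_discount = c(0.5356, 0.5256, 0.5456, 0.559, 0.569, 0.589)
)
cf <- data.frame(MDM_Key = 1:2, ML.coef = c(1.46, 1.67))

res <- merge(s, cf) %>%
  arrange(MDM_Key, month_id) %>%
  mutate(
    # the observed sale_count of the group, wherever that row happens to sit
    base_sales = sale_count[!is.na(sale_count)][1],
    # rows that already have sales stay NA; the others get the floored percentage
    percent = if_else(is.na(sale_count),
                      floor(iek_max_discount * ML.coef * 100), NA_real_),
    prop = floor(percent / 100 * base_sales),
    final = base_sales + prop,
    .by = MDM_Key
  )
res
```

Looking up the base value through `!is.na(sale_count)` means the result no longer depends on the `arrange()` call, which is kept only for readable output.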


Answer 2

Score: 1

``` r
library(dplyr)
cur_month <- as.integer(202305)
s %>%
  right_join(cf, by = "MDM_Key") %>%
  mutate(percent = floor(if_else(month_id == cur_month,
                                 NA, ML.coef * iek_max_discount * 100)),
         prop = floor(percent * sale_count[month_id == cur_month] / 100),
         final = sale_count[month_id == cur_month] + prop,
         .by = MDM_Key)

#>   month_id MDM_Key sale_count iek_max_discount ML.coef percent  prop final
#> 1   202306       1         NA           0.5356    1.46      78 14945 34106
#> 2   202305       1      19161           0.5256    1.46      NA    NA    NA
#> 3   202307       1         NA           0.5456    1.46      79 15137 34298
#> 4   202305       2      17726           0.5590    1.67      NA    NA    NA
#> 5   202306       2         NA           0.5690    1.67      95 16839 34565
#> 6   202307       2         NA           0.5890    1.67      98 17371 35097
```
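This answer hard-codes `cur_month` as 202305. Two ways it might be derived instead are sketched below. `month_id_of` is a hypothetical helper introduced for illustration, and treating the latest month with an observed `sale_count` as the current month is an assumption that the question does not state.

```r
# Hypothetical helper: turn a Date into the YYYYMM integer used in month_id.
month_id_of <- function(date) as.integer(format(date, "%Y%m"))

# Option 1: from the calendar. A fixed date is used here so the result is
# reproducible; in a real run this would be month_id_of(Sys.Date()).
cur_month_from_date <- month_id_of(as.Date("2023-05-06"))

# Option 2: from the data, assuming the current month is the latest month
# that has a non-missing sale_count.
s <- data.frame(
  month_id = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA)
)
cur_month_from_data <- max(s$month_id[!is.na(s$sale_count)])
```

Both options give 202305 for the toy data, so either could replace the literal in the pipeline.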

Answer 3

Score: 0

It was a little hard to understand what you were trying to do. Here is my best guess:

    # merge them together to have the key in the same row
    scf <- merge(s, cf, by = "MDM_Key")
    # split into a list of data frames, one per unique MDM_Key
    s_by_key <- split(scf, scf$MDM_Key)
    # loop through each element of the list
    for (i in seq_along(s_by_key)) {
      # trunc to remove the decimals
      s_by_key[[i]]$percent <- trunc(s_by_key[[i]]$iek_max_discount * s_by_key[[i]]$ML.coef * 100)

      # get the row index of the most recent month that has sales
      months_with_sales <- which(!is.na(s_by_key[[i]]$sale_count))
      latest_month_with_sales <- months_with_sales[which.max(s_by_key[[i]]$month_id[months_with_sales])]

      last_sales <- s_by_key[[i]][latest_month_with_sales, "sale_count"]
      s_by_key[[i]]$prop <- trunc(last_sales / 100 * s_by_key[[i]]$percent)
      s_by_key[[i]]$final <- last_sales + s_by_key[[i]]$prop

      # and set those rows to NA to be like the desired output
      s_by_key[[i]][latest_month_with_sales, c("percent", "prop", "final")] <- NA
    }
    output <- do.call(rbind, s_by_key)
    output
    #    MDM_Key month_id sale_count iek_max_discount ML.coef percent  prop final
    #1.1       1   202306         NA           0.5356    1.46      78 14945 34106
    #1.2       1   202305      19161           0.5256    1.46      NA    NA    NA
    #1.3       1   202307         NA           0.5456    1.46      79 15137 34298
    #2.4       2   202305      17726           0.5590    1.67      NA    NA    NA
    #2.5       2   202306         NA           0.5690    1.67      95 16839 34565
    #2.6       2   202307         NA           0.5890    1.67      98 17371 35097
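The same base R approach could probably be written without the explicit loop. The sketch below uses `ave()` to spread each group's observed `sale_count` across its rows. It assumes exactly one non-missing `sale_count` per `MDM_Key`, and the `base_sales` column and `future` vector are names introduced here for illustration.

```r
s <- data.frame(
  month_id = c(202306L, 202305L, 202307L, 202305L, 202306L, 202307L),
  MDM_Key = c(1L, 1L, 1L, 2L, 2L, 2L),
  sale_count = c(NA, 19161L, NA, 17726L, NA, NA),
  iek_max_discount = c(0.5356, 0.5256, 0.5456, 0.559, 0.569, 0.589)
)
cf <- data.frame(MDM_Key = 1:2, ML.coef = c(1.46, 1.67))

scf <- merge(s, cf, by = "MDM_Key")

# Repeat the single observed sale_count of each MDM_Key on every row of its group.
scf$base_sales <- ave(scf$sale_count, scf$MDM_Key,
                      FUN = function(x) x[!is.na(x)][1])

# Rows without sales are the future months that need a value.
future <- is.na(scf$sale_count)
scf$percent <- ifelse(future, trunc(scf$iek_max_discount * scf$ML.coef * 100), NA)
scf$prop <- trunc(scf$base_sales / 100 * scf$percent)
scf$final <- scf$base_sales + scf$prop
scf
```

Because nothing is split and re-bound, the row names stay 1 to 6 instead of the `1.1`, `2.4` style produced by `do.call(rbind, ...)`.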
