如何在R中执行复杂的算术操作,从观测值之间减去指定百分比。

huangapple go评论86阅读模式
英文:

How to subtract the specified percentage between observations to perform complex arithmetic operations in R

问题

我需要为每个mdm组从d1数据集中减去perc列中指定的百分比值,其中perc列= 100(100始终是起始值)。例如,对于mdm=7,perc=100,其中price=77.8。接下来的perc值是99,即比77.8少1%,因此从77.8中减去1%得到77.022。perc=85,这意味着从起始点77.8中减去15% = 66.13,perc=50,这意味着从起始点减去50%。

同样,我需要将百分比相加,例如101,这意味着从价格=77.8中加1%(即78.578),从而形成价格列等等。

此外,具有mdm组的d2数据集中还有elast列的值,该值乘以100的下一个百分比。例如,当perc=99对于mdm=7时,值1.5必须乘以1(100*1.5=101.5),当perc=70时,乘以1.5的剩余百分比,即30*1.5(100-70=30)=45,然后加到100上得到145,依此类推。

最后一步对于d2数据集中的每个mdm是成本价。这意味着从已经形成的price列中减去成本值,例如,对于mdm=7,cost=24从价格(38.9-24=14.9)中减去,此值乘以count列中的值,即在这种情况下为175。这个操作创建了一个新的列profit=14.9\*175=2607

在这个可复制的示例中,我已经为样本填充了price列。在原始数据中,这个表格看起来像这样(实际上是d1数据集中的所需输出)。

初始数据如下:

  1. mdm perc price count
  2. 1 7 50 NA NA
  3. 2 7 60 NA NA
  4. 3 7 70 NA NA
  5. 4 7 80 NA NA
  6. 5 7 85 NA NA
  7. 6 7 90 NA NA
  8. 7 7 95 NA NA
  9. 8 7 96 NA NA
  10. 9 7 97 NA NA
  11. 10 7 98 NA NA
  12. 11 7 99 NA NA
  13. 12 7 100 77.8 100
  14. 13 7 101 NA NA
  15. 14 7 102 NA NA
  16. 15 7 103 NA NA
  17. 16 7 104 NA NA
  18. 17 7 105 NA NA
  19. 18 7 110 NA NA
  20. 19 7 115 NA NA
  21. 20 7 120 NA NA
  22. 21 7 130 NA NA
  23. 22 7 140 NA NA
  24. 23 7 150 NA NA
  25. 24 8 50 NA NA
  26. 25 8 60 NA NA
  27. 26 8 70 NA NA
  28. 27 8 80 NA NA
  29. 28 8 85 NA NA
  30. 29 8 90 NA NA
  31. 30 8 95 NA NA
  32. 31 8 96 NA NA
  33. 32 8 97 NA NA
  34. 33 8 98 NA NA
  35. 34 8 99 NA NA
  36. 35 8 100 77.8 100
  37. 36 8 101 NA NA
  38. 37 8 102 NA NA
  39. 38 8 103 NA NA
  40. 39 8 104 NA NA
  41. 40 8 105 NA NA
  42. 41 8 110 NA NA
  43. 42 8 115 NA NA
  44. 43 8 120 NA NA
  45. 44 8 130 NA NA
  46. 45 8 140 NA NA
  47. 46 8 150 NA NA

感谢您的宝贵帮助。

英文:

I have 2 datasets

  1. d1=structure(list(mdm = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
  2. 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L,
  3. 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
  4. 8L, 8L, 8L, 8L), perc = c(50L, 60L, 70L, 80L, 85L, 90L, 95L,
  5. 96L, 97L, 98L, 99L, 100L, 101L, 102L, 103L, 104L, 105L, 110L,
  6. 115L, 120L, 130L, 140L, 150L, 50L, 60L, 70L, 80L, 85L, 90L, 95L,
  7. 96L, 97L, 98L, 99L, 100L, 101L, 102L, 103L, 104L, 105L, 110L,
  8. 115L, 120L, 130L, 140L, 150L), price = c(38.9, 46.68, 54.46,
  9. 62.24, 66.13, 70.02, 73.91, 74.688, 75.466, 76.244, 77.022, 77.8,
  10. 78.578, 79.356, 80.134, 80.912, 81.69, 85.58, 89.47, 93.36, 101.14,
  11. 108.92, 116.7, 38.9, 46.68, 54.46, 62.24, 66.13, 70.02, 73.91,
  12. 74.688, 75.466, 76.244, 77.022, 77.8, 78.578, 79.356, 80.134,
  13. 80.912, 81.69, 85.58, 89.47, 93.36, 101.14, 108.92, 116.7), count = c(175,
  14. 160, 145, 130, 122.5, 115, 107.5, 106, 104.5, 103, 101.5, 100,
  15. 98.5, 97, 95.5, 94, 92.5, 85, 77.5, 70, 55, 40, 25, 175, 160,
  16. 145, 130, 122.5, 115, 107.5, 106, 104.5, 103, 101.5, 100, 98.5,
  17. 97, 95.5, 94, 92.5, 85, 77.5, 70, 55, 40, 25), profit = c(2607.5,
  18. 3628.8, 4416.7, 4971.2, 5160.925, 5292.3, 5365.325, 5372.928,
  19. 5378.197, 5381.132, 5381.733, 5380, 5375.933, 5369.532, 5360.797,
  20. 5349.728, 5336.325, 5234.3, 5073.925, 4855.2, 4242.7, 3396.8,
  21. 2317.5, 2432.5, 3468.8, 4271.7, 4841.2, 5038.425, 5177.3, 5257.825,
  22. 5266.928, 5273.697, 5278.132, 5280.233, 5280, 5277.433, 5272.532,
  23. 5265.297, 5255.728, 5243.825, 5149.3, 4996.425, 4785.2, 4187.7,
  24. 3356.8, 2292.5)), class = "data.frame", row.names = c(NA, -46L
  25. ))

and second dataset represents the percentage values by which it is necessary to reduce or increase the value of price and count in d1 also d2 contains the cost column

  1. d2=structure(list(mdm = 7:8, elast = c(1.5, 1.5), cost = 24:25), class = "data.frame", row.names = c(NA,
  2. -2L))

I'm having some troubles with complex arithmetic and I need help.
I'll try to describe my question in more detail.

I need for each mdm group to subtract the percentage indicated in perc column from the price value, where the perc column = 100. (100 is always the start value.)
For example for mdm=7, perc=100 where price=77.8.
The next perc value is 99, i.e. less by 1, so subtract 1 percent from 77.8 and get 77.022. perc = 85, this means that from the starting point 77.8 we subtract 15% = 66.13, perc = 50, which means we subtract 50 percent from the starting point.
In a similar way, I need to add percentages up, for example 101, this means that 1 percent up from the price = 77.8 i.e. 78,578, thus forming the price column and so on.

Further, the price value with perc = 100 has a value in the count column, in this example it is also = 100 (but this is not always the case).

I need to take the value from the elast column for each mdm group from d2 dataset and this value is multiplied by the next percentage of 100. For example, where perc = 99 for mdm = 7, the value of 1.5 must be multiplied by 1 (100*1,5=101.5), where the value of perc = 70, then 30 * 1.5 (100-70=30) 30*1,5=45 100+45=145 and so on.

The last step for each mdm in data d2 there is the cost price. This means that from the already formed price column, we must subtract the cost value, for example, for mdm = 7, cost=24 from the price (38.9-24 = 14.9), this value is multiplied by the value in the count column, i.e. in this case 175. This action creates a new column profit=14.9*175=2607

In this reproducible example, the price column is all filled in by me for a sample. In raw data this table looks like this (indeed desired output in d1 dataset)
The initial data looks like this

  1. mdm perc price count
  2. 1 7 50 NA NA
  3. 2 7 60 NA NA
  4. 3 7 70 NA NA
  5. 4 7 80 NA NA
  6. 5 7 85 NA NA
  7. 6 7 90 NA NA
  8. 7 7 95 NA NA
  9. 8 7 96 NA NA
  10. 9 7 97 NA NA
  11. 10 7 98 NA NA
  12. 11 7 99 NA NA
  13. **12 7 100 77.8 100**
  14. 13 7 101 NA NA
  15. 14 7 102 NA NA
  16. 15 7 103 NA NA
  17. 16 7 104 NA NA
  18. 17 7 105 NA NA
  19. 18 7 110 NA NA
  20. 19 7 115 NA NA
  21. 20 7 120 NA NA
  22. 21 7 130 NA NA
  23. 22 7 140 NA NA
  24. 23 7 150 NA NA
  25. 24 8 50 NA NA
  26. 25 8 60 NA NA
  27. 26 8 70 NA NA
  28. 27 8 80 NA NA
  29. 28 8 85 NA NA
  30. 29 8 90 NA NA
  31. 30 8 95 NA NA
  32. 31 8 96 NA NA
  33. 32 8 97 NA NA
  34. 33 8 98 NA NA
  35. 34 8 99 NA NA
  36. **35 8 100 77.8 100**
  37. 36 8 101 NA NA
  38. 37 8 102 NA NA
  39. 38 8 103 NA NA
  40. 39 8 104 NA NA
  41. 40 8 105 NA NA
  42. 41 8 110 NA NA
  43. 42 8 115 NA NA
  44. 43 8 120 NA NA
  45. 44 8 130 NA NA
  46. 45 8 140 NA NA
  47. 46 8 150 NA NA

Thanks for your any valuable help.

答案1

得分: 3

以下是翻译好的部分:

"Here is one way:
First we join both dataframes,
then we define the rules as you describe in detail (therefore it is easy to translate to code :-).
I think most challenging and tricky thinking is to fix the price value at 100% -> in this case price[perc=100]. The rest is described by your fantastic explanation:

  1. "这是一种方法:首先,我们将两个数据框连接起来,然后根据您详细描述的规则进行定义(因此很容易转换成代码 :-)。我认为最具挑战性和复杂的思考是将价格值固定在100% -> 在这种情况下是 `price[perc=100]`。其余部分由您精彩的解释描述:"
  2. ```"
  3. "```"
  4. " mdm perc price count elast cost last_step
  5. <int> <int> <dbl> <dbl> <dbl> <int> <dbl>
  6. 1 7 50 38.9 175 1.5 24 2607.
  7. 2 7 60 46.7 160 1.5 24 3629.
  8. 3 7 70 54.5 145 1.5 24 4417.
  9. 4 7 80 62.2 130 1.5 24 4971.
  10. 5 7 85 66.1 122. 1.5 24 5161.
  11. 6 7 90 70.0 115 1.5 24 5292.
  12. 7 7 95 73.9 108. 1.5 24 5365.
  13. 8 7 96 74.7 106 1.5 24 5373.
  14. 9 7 97 75.5 104. 1.5 24 5378.
  15. 10 7 98 76.2 103 1.5 24 5381."
  16. "# … with 36 more rows
  17. # ℹ Use `print(n = ...)` to see more rows

希望这有助于您的理解。

英文:

Here is one way:
First we join both dataframes,
then we define the rules as you describe in detail (therefore it is easy to translate to code :-).
I think most challenging and tricky thinking is to fix the price value at 100% -> in this case price[perc=100]. The rest is described by your fantastic explanation:

  1. library(dplyr)
  2. df %>%
  3. left_join(d2, by="mdm") %>%
  4. group_by(mdm) %>%
  5. mutate(price = (price[perc==100]/100)*perc,
  6. count = (count[perc==100]+(elast* count[perc==100]-perc)),
  7. last_step = (price-cost)*count)
  1. mdm perc price count elast cost last_step
  2. <int> <int> <dbl> <dbl> <dbl> <int> <dbl>
  3. 1 7 50 38.9 175 1.5 24 2607.
  4. 2 7 60 46.7 160 1.5 24 3629.
  5. 3 7 70 54.5 145 1.5 24 4417.
  6. 4 7 80 62.2 130 1.5 24 4971.
  7. 5 7 85 66.1 122. 1.5 24 5161.
  8. 6 7 90 70.0 115 1.5 24 5292.
  9. 7 7 95 73.9 108. 1.5 24 5365.
  10. 8 7 96 74.7 106 1.5 24 5373.
  11. 9 7 97 75.5 104. 1.5 24 5378.
  12. 10 7 98 76.2 103 1.5 24 5381.
  13. # … with 36 more rows
  14. # ℹ Use `print(n = ...)` to see more rows

答案2

得分: 3

使用 data.table

  1. library(data.table)
  2. setDT(d1)[d2, c("price", "count", "cost") :=
  3. .((price[perc == 100]/100)*perc, count[perc == 100] +
  4. (elast* count[perc == 100]-perc), i.cost), on = .(mdm)]
  5. d1[, last_step := (price - cost) * count]

输出

  1. > head(d1)
  2. mdm perc price count profit cost last_step
  3. 1: 7 50 38.90 200 2607.500 24 2980.00
  4. 2: 7 60 46.68 190 3628.800 24 4309.20
  5. 3: 7 70 54.46 180 4416.700 24 5482.80
  6. 4: 7 80 62.24 170 4971.200 24 6500.80
  7. 5: 7 85 66.13 165 5160.925 24 6951.45
  8. 6: 7 90 70.02 160 5292.300 24 7363.20
英文:

Using data.table

  1. library(data.table)
  2. setDT(d1)[d2, c("price", "count", "cost") :=
  3. .((price[perc == 100]/100)*perc, count[perc == 100] +
  4. (elast* count[perc == 100]-perc), i.cost), on = .(mdm)]
  5. d1[, last_step := (price - cost) * count]

-output

  1. > head(d1)
  2. mdm perc price count profit cost last_step
  3. 1: 7 50 38.90 200 2607.500 24 2980.00
  4. 2: 7 60 46.68 190 3628.800 24 4309.20
  5. 3: 7 70 54.46 180 4416.700 24 5482.80
  6. 4: 7 80 62.24 170 4971.200 24 6500.80
  7. 5: 7 85 66.13 165 5160.925 24 6951.45
  8. 6: 7 90 70.02 160 5292.300 24 7363.20

答案3

得分: 2

以下是代码的翻译结果:

  1. d1 %>%
  2. group_by(mdm) %>%
  3. mutate(price = price[perc==100]*(1-(100-perc)/100)) %>%
  4. ungroup %>%
  5. inner_join(d2, by="mdm") %>%
  6. mutate(count = count[perc==100] + (100-perc)*elast, profit = count*(price-cost)) %>%
  7. select(-c(elast,cost))

无法提供翻译结果,因为这是R语言代码,无需翻译。

英文:

You should be able to produce d1 from the original frame as follows:

  1. d1 %>%
  2. group_by(mdm) %>%
  3. mutate(price = price[perc==100]*(1-(100-perc)/100)) %>%
  4. ungroup %>%
  5. inner_join(d2, by="mdm") %>%
  6. mutate(count = count[perc==100] + (100-perc)*elast, profit = count*(price-cost)) %>%
  7. select(-c(elast,cost))

Input:

  1. d1 = structure(list(mdm = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
  2. 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L,
  3. 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
  4. 8L, 8L, 8L, 8L), perc = c(50L, 60L, 70L, 80L, 85L, 90L, 95L,
  5. 96L, 97L, 98L, 99L, 100L, 101L, 102L, 103L, 104L, 105L, 110L,
  6. 115L, 120L, 130L, 140L, 150L, 50L, 60L, 70L, 80L, 85L, 90L, 95L,
  7. 96L, 97L, 98L, 99L, 100L, 101L, 102L, 103L, 104L, 105L, 110L,
  8. 115L, 120L, 130L, 140L, 150L), price = c(NA, NA, NA, NA, NA,
  9. NA, NA, NA, NA, NA, NA, 77.8, NA, NA, NA, NA, NA, NA, NA, NA,
  10. NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 77.8,
  11. NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), count = c(NA, NA,
  12. NA, NA, NA, NA, NA, NA, NA, NA, NA, 100, NA, NA, NA, NA, NA,
  13. NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
  14. NA, 100, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA,
  15. -46L))

Output:

  1. mdm perc price count profit
  2. 1 7 50 38.900 175.0 2607.500
  3. 2 7 60 46.680 160.0 3628.800
  4. 3 7 70 54.460 145.0 4416.700
  5. 4 7 80 62.240 130.0 4971.200
  6. 5 7 85 66.130 122.5 5160.925
  7. 6 7 90 70.020 115.0 5292.300
  8. 7 7 95 73.910 107.5 5365.325
  9. 8 7 96 74.688 106.0 5372.928
  10. 9 7 97 75.466 104.5 5378.197
  11. 10 7 98 76.244 103.0 5381.132
  12. 11 7 99 77.022 101.5 5381.733
  13. 12 7 100 77.800 100.0 5380.000
  14. 13 7 101 78.578 98.5 5375.933
  15. 14 7 102 79.356 97.0 5369.532
  16. 15 7 103 80.134 95.5 5360.797
  17. 16 7 104 80.912 94.0 5349.728
  18. 17 7 105 81.690 92.5 5336.325
  19. 18 7 110 85.580 85.0 5234.300
  20. 19 7 115 89.470 77.5 5073.925
  21. 20 7 120 93.360 70.0 4855.200
  22. 21 7 130 101.140 55.0 4242.700
  23. 22 7 140 108.920 40.0 3396.800
  24. 23 7 150 116.700 25.0 2317.500
  25. 24 8 50 38.900 175.0 2432.500
  26. 25 8 60 46.680 160.0 3468.800
  27. 26 8 70 54.460 145.0 4271.700
  28. 27 8 80 62.240 130.0 4841.200
  29. 28 8 85 66.130 122.5 5038.425
  30. 29 8 90 70.020 115.0 5177.300
  31. 30 8 95 73.910 107.5 5257.825
  32. 31 8 96 74.688 106.0 5266.928
  33. 32 8 97 75.466 104.5 5273.697
  34. 33 8 98 76.244 103.0 5278.132
  35. 34 8 99 77.022 101.5 5280.233
  36. 35 8 100 77.800 100.0 5280.000
  37. 36 8 101 78.578 98.5 5277.433
  38. 37 8 102 79.356 97.0 5272.532
  39. 38 8 103 80.134 95.5 5265.297
  40. 39 8 104 80.912 94.0 5255.728
  41. 40 8 105 81.690 92.5 5243.825
  42. 41 8 110 85.580 85.0 5149.300
  43. 42 8 115 89.470 77.5 4996.425
  44. 43 8 120 93.360 70.0 4785.200
  45. 44 8 130 101.140 55.0 4187.700
  46. 45 8 140 108.920 40.0 3356.800
  47. 46 8 150 116.700 25.0 2292.500

huangapple
  • 本文由 发表于 2023年2月18日 20:30:18
  • 转载请务必保留本文链接:https://go.coder-hub.com/75493346.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定