Conditional fractioning

Question

Suppose I have the following dataset:

ID Country Sales
1 Austria 6
1 Austria 6
1 Belgium 6
2 Belgium 10
2 Czech 10
3 Denmark 3
3 Germany 3

I want to add another variable that gives each country's share of the sales within its ID, i.e. as a fraction:

ID Country Sales fraction
1 Austria 6 0.666
1 Austria 6 0.666
1 Belgium 6 0.333
2 Belgium 10 0.5
2 Czech 10 0.5
3 Denmark 3 1
3 Denmark 3 1

Any help would be appreciated!

Answer 1

Score: 1

    library(dplyr)

    your_data |>
      # total sales for each (ID, Country) pair
      mutate(country_total = sum(Sales), .by = c(ID, Country)) |>
      # that total as a share of all sales within the ID
      mutate(fraction = country_total / sum(Sales), .by = ID)
    #   ID Country Sales country_total  fraction
    # 1  1 Austria     6            12 0.6666667
    # 2  1 Austria     6            12 0.6666667
    # 3  1 Belgium     6             6 0.3333333
    # 4  2 Belgium    10            10 0.5000000
    # 5  2   Czech    10            10 0.5000000
    # 6  3 Denmark     3             3 0.5000000
    # 7  3 Germany     3             3 0.5000000

Using this sample data:

    your_data <- read.table(text = 'ID Country Sales
    1 Austria 6
    1 Austria 6
    1 Belgium 6
    2 Belgium 10
    2 Czech 10
    3 Denmark 3
    3 Germany 3', header = TRUE)
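
The `.by` argument used above requires dplyr 1.1.0 or later, and the native pipe `|>` requires R 4.1.0 or later. As a minimal sketch for older dplyr versions, assuming the same sample data, the same two-step computation can be written with `group_by()`. The name `result` is only for illustration and is not part of the original answer.

```r
library(dplyr)

# Sample data from the question
your_data <- read.table(text = "ID Country Sales
1 Austria 6
1 Austria 6
1 Belgium 6
2 Belgium 10
2 Czech 10
3 Denmark 3
3 Germany 3", header = TRUE)

result <- your_data |>
  # Step 1: total sales for each (ID, Country) pair
  group_by(ID, Country) |>
  mutate(country_total = sum(Sales)) |>
  # Step 2: regroup by ID only and divide by the ID's total sales
  group_by(ID) |>
  mutate(fraction = country_total / sum(Sales)) |>
  ungroup()

result
```

The second `group_by(ID)` replaces the earlier grouping rather than adding to it, and `ungroup()` at the end avoids carrying the grouping into later steps.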

Answer 2

Score: 1

Alternatively, we could use `add_count` to get the same result:

    library(dplyr)

    df %>%
      add_count(ID, name = 'n') %>%            # number of rows per ID
      add_count(ID, Country, name = 'gn') %>%  # number of rows per (ID, Country)
      mutate(new = gn / n) %>%
      select(-c(n, gn))
    # output
    # A tibble: 7 × 4
    #      ID Country Sales   new
    #   <dbl> <chr>   <dbl> <dbl>
    # 1     1 Austria     6 0.667
    # 2     1 Austria     6 0.667
    # 3     1 Belgium     6 0.333
    # 4     2 Belgium    10 0.5
    # 5     2 Czech      10 0.5
    # 6     3 Denmark     3 0.5
    # 7     3 Germany     3 0.5
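
Note that `add_count` without weights counts rows, so the pipeline above agrees with a sum-based fraction only because `Sales` is constant within each ID in this data. A minimal sketch of a weighted variant, assuming the question's data, uses the `wt` argument of `add_count` to sum `Sales` instead. The names `id_total`, `country_total`, and `weighted` are illustrative and not from the original answer.

```r
library(dplyr)

# Sample data from the question
df <- read.table(text = "ID Country Sales
1 Austria 6
1 Austria 6
1 Belgium 6
2 Belgium 10
2 Czech 10
3 Denmark 3
3 Germany 3", header = TRUE)

weighted <- df %>%
  # wt = Sales makes add_count sum Sales per group instead of counting rows
  add_count(ID, wt = Sales, name = "id_total") %>%
  add_count(ID, Country, wt = Sales, name = "country_total") %>%
  mutate(new = country_total / id_total) %>%
  select(-c(id_total, country_total))

weighted
```

If `Sales` differed between rows of the same ID, this version would still give each country's share of the sales, whereas the row-count version would give its share of the rows.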

Answer 3

Score: 0

Base R:

    # per-(ID, Country) sales total divided by per-ID sales total
    df$fraction <- ave(df$Sales, list(df$ID, df$Country), FUN = sum) /
      ave(df$Sales, df$ID, FUN = sum)
    df
    #   ID Country Sales  fraction
    # 1  1 Austria     6 0.6666667
    # 2  1 Austria     6 0.6666667
    # 3  1 Belgium     6 0.3333333
    # 4  2 Belgium    10 0.5000000
    # 5  2   Czech    10 0.5000000
    # 6  3 Denmark     3 0.5000000
    # 7  3 Germany     3 0.5000000
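
One way to sanity-check the result is to confirm that, within each ID, the fractions of the distinct countries add up to 1. The following is a minimal sketch in base R, assuming the question's data. The names `per_country` and `totals` are illustrative, and the grouping factors are passed to `ave` as separate arguments, which is equivalent to the `list(...)` form above.

```r
# Sample data from the question
df <- read.table(text = "ID Country Sales
1 Austria 6
1 Austria 6
1 Belgium 6
2 Belgium 10
2 Czech 10
3 Denmark 3
3 Germany 3", header = TRUE)

# Same computation as above
df$fraction <- ave(df$Sales, df$ID, df$Country, FUN = sum) /
  ave(df$Sales, df$ID, FUN = sum)

# Keep one row per (ID, Country) so repeated rows are not double-counted
per_country <- unique(df[c("ID", "Country", "fraction")])

# Sum the fractions within each ID; every entry should equal 1
totals <- tapply(per_country$fraction, per_country$ID, sum)
stopifnot(isTRUE(all.equal(as.numeric(totals), rep(1, length(totals)))))
totals
```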

huangapple
  • Posted on June 29, 2023, 21:48:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76581662.html