How to create groups based on cumulative values

Question

I would like to assign groups based on the cumulative sum of a column, computed separately for each ID.

    ID Value Days
    A      0    5
    A   1200   20
    A    300   40
    A   1000   67
    A   1000   90
    A   1000  110
    A    500  115
    B      0    5
    B    500   10
    B    600   15
    B   1000   25
    B   2000   50
    B    900   70
    B   1000   90

Desired output

    ID Group   Days
    A  Doses A   20
    A  Doses B   90
    A  Doses C  115
    B  Doses A   15
    B  Doses B   50
    B  Doses C   70

I would like to accumulate Value within each ID and, for each group, keep the timestamp (Days) at which the cumulative sum first reaches that group's threshold.

The three groups and their cumulative thresholds:

    Doses A => 1000
    Doses B => 3000
    Doses C => 5000
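For reproducibility, the sample data above can be typed in as a data frame. This is only a sketch: the name `df` is chosen because the answers below refer to it, and `cum` is an illustrative helper that shows the per-ID running total the thresholds are compared against.

```r
# Sample data copied from the table in the question.
df <- data.frame(
  ID    = rep(c("A", "B"), each = 7),
  Value = c(0, 1200, 300, 1000, 1000, 1000, 500,
            0, 500, 600, 1000, 2000, 900, 1000),
  Days  = c(5, 20, 40, 67, 90, 110, 115,
            5, 10, 15, 25, 50, 70, 90)
)

# Running total of Value within each ID (base R; df itself is left unchanged).
cum <- ave(df$Value, df$ID, FUN = cumsum)
cum
#> [1]    0 1200 1500 2500 3500 4500 5000    0  500 1100 2100 4100 5000 6000
```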

Answer 1

Score: 2

    library(tidyverse)

    df %>%
      # running total per ID, then count how many thresholds it has reached
      mutate(Value = cumsum(Value),
             Doses = (Value >= 1000) + (Value >= 3000) + (Value >= 5000),
             .by = ID) %>%
      # drop rows that come before the first threshold
      filter(Value >= 1000) %>%
      # keep the first row for each ID/threshold combination
      slice_head(by = c(ID, Doses)) %>%
      mutate(Doses = paste('Doses', c('A', 'B', 'C')[Doses])) %>%
      select(ID, Doses, Days)
    #> # A tibble: 6 × 3
    #>   ID    Doses    Days
    #>   <chr> <chr>   <dbl>
    #> 1 A     Doses A    20
    #> 2 A     Doses B    90
    #> 3 A     Doses C   115
    #> 4 B     Doses A    15
    #> 5 B     Doses B    50
    #> 6 B     Doses C    70

<sup>Created on 2023-06-28 with reprex v2.0.2</sup>
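The sum of three logical comparisons in this answer counts how many thresholds the running total has reached. As a small base R sketch of that idea (the vectors `cum`, `days`, and `thresholds` are illustrative names, filled in by hand from ID A in the question), `findInterval()` gives the same index, and `match()` then picks the first row at each level:

```r
# Cumulative Value and Days for ID A, taken from the question's data.
cum        <- c(0, 1200, 1500, 2500, 3500, 4500, 5000)
days       <- c(5, 20, 40, 67, 90, 110, 115)
thresholds <- c(1000, 3000, 5000)

# Number of thresholds reached, computed two ways.
by_sum  <- (cum >= 1000) + (cum >= 3000) + (cum >= 5000)
by_find <- findInterval(cum, thresholds)
by_find
#> [1] 0 1 1 1 2 2 3

# First row at which each level (1 = A, 2 = B, 3 = C) appears.
first_hit <- match(1:3, by_find)
days[first_hit]
#> [1]  20  90 115
```

If a running total jumps past a threshold in a single step, that level has no row of its own, and `match()` returns `NA` for it.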

Answer 2

Score: 1

Please try the code below.

```r
library(dplyr)

df %>%
  group_by(ID) %>%
  # running total per ID, labelled by the highest threshold reached
  mutate(new = cumsum(Value),
         new2 = case_when(
           new >= 5000 ~ 'C',
           new >= 3000 ~ 'B',
           new >= 1000 ~ 'A'
         )) %>%
  # rows below the first threshold get NA and are dropped
  filter(!is.na(new2)) %>%
  # keep the first row of each ID/label combination
  group_by(ID, new2) %>%
  slice_head(n = 1) %>%
  ungroup() %>%
  select(-c(new, new2))

# output
# A tibble: 6 × 3
#   ID    Value  Days
#   <chr> <dbl> <dbl>
# 1 A      1200    20
# 2 A      1000    90
# 3 A       500   115
# 4 B       600    15
# 5 B      2000    50
# 6 B       900    70
```
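The output above keeps Value and drops the group label, whereas the question asks for ID, Group, and Days. A possible variant that keeps the label is sketched below. It is an adaptation rather than the answerer's own code: the column names `cum` and `Group` and the object `result` are illustrative, and `df` is rebuilt from the question's table so the block runs on its own.

```r
library(dplyr)

# Sample data copied from the question.
df <- data.frame(
  ID    = rep(c("A", "B"), each = 7),
  Value = c(0, 1200, 300, 1000, 1000, 1000, 500,
            0, 500, 600, 1000, 2000, 900, 1000),
  Days  = c(5, 20, 40, 67, 90, 110, 115,
            5, 10, 15, 25, 50, 70, 90)
)

result <- df %>%
  group_by(ID) %>%
  # running total per ID, labelled by the highest threshold reached
  mutate(cum = cumsum(Value),
         Group = case_when(
           cum >= 5000 ~ "Doses C",
           cum >= 3000 ~ "Doses B",
           cum >= 1000 ~ "Doses A"
         )) %>%
  filter(!is.na(Group)) %>%
  # first row at which each label appears within an ID
  group_by(ID, Group) %>%
  slice_head(n = 1) %>%
  ungroup() %>%
  select(ID, Group, Days)

result
```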
