How to create groups based on cumulative values

Question

I would like to assign rows to groups based on the cumulative sum of a column, computed separately for each ID.

ID  Value  Days
A       0     5
A    1200    20
A     300    40
A    1000    67
A    1000    90
A    1000   110
A     500   115
B       0     5
B     500    10
B     600    15
B    1000    25
B    2000    50
B     900    70
B    1000    90

Desired output

ID Group Days
A Doses A 20
A Doses B 90
A Doses C 115
B Doses A 15
B Doses B 50
B Doses C 70

I would like to sum the values cumulatively within each ID, start a new group each time the running total reaches a threshold, and keep the timestamp (Days) of the row where that happens.

Thresholds (creating 3 groups)

Group     Cumulative value
Doses A   >= 1000
Doses B   >= 3000
Doses C   >= 5000
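
To make the request concrete, here is a minimal sketch that enters the sample data by hand and shows the per-ID running total that the thresholds are compared against. The name `df` is chosen to match the answers; `cum_value` and the `CumValue` column are illustrative names that do not appear in the question.

```r
# Sample data from the question, typed in by hand.
df <- data.frame(
  ID    = rep(c("A", "B"), each = 7),
  Value = c(0, 1200, 300, 1000, 1000, 1000, 500,
            0, 500, 600, 1000, 2000, 900, 1000),
  Days  = c(5, 20, 40, 67, 90, 110, 115,
            5, 10, 15, 25, 50, 70, 90)
)

# Running total of Value within each ID (base R, no packages needed).
# `cum_value` is an illustrative name, kept outside df so df stays unchanged.
cum_value <- ave(df$Value, df$ID, FUN = cumsum)

# Show the running total next to the original columns.
cbind(df, CumValue = cum_value)
```

For ID A the running total first reaches 1000 at day 20, 3000 at day 90, and 5000 at day 115, which matches the desired output.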

Answer 1

Score: 2

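library(tidyverse)

# `.by` / `by` require dplyr >= 1.1.0
df %>%
  # running total per ID, then count how many thresholds it has reached (0 to 3)
  mutate(Value = cumsum(Value),
         Doses = (Value >= 1000) + (Value >= 3000) + (Value >= 5000), .by = ID) %>%
  # drop rows below the first threshold (Doses == 0)
  filter(Value >= 1000) %>%
  # keep the first row at which each threshold is reached
  slice_head(by = c(ID, Doses)) %>%
  # map 1, 2, 3 to the labels "Doses A", "Doses B", "Doses C"
  mutate(Doses = paste('Doses', c('A', 'B', 'C')[Doses])) %>%
  select(ID, Doses, Days)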

#> # A tibble: 6 × 3
#>   ID    Doses    Days
#>   <chr> <chr>   <dbl>
#> 1 A     Doses A    20
#> 2 A     Doses B    90
#> 3 A     Doses C   115
#> 4 B     Doses A    15
#> 5 B     Doses B    50
#> 6 B     Doses C    70

<sup>Created on 2023-06-28 with reprex v2.0.2</sup>
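
The sum of logical comparisons in this answer counts how many thresholds the running total has reached. As a hedged aside, base R's `findInterval()` should give the same count; the sketch below checks this on the running total for ID A. The names `thresholds`, `by_sum`, and `by_interval` are illustrative and not part of the answer.

```r
thresholds <- c(1000, 3000, 5000)

# Running total of Value for ID A, taken from the question's data.
cum_value <- c(0, 1200, 1500, 2500, 3500, 4500, 5000)

# Approach used in the answer: each TRUE counts as 1.
by_sum <- (cum_value >= 1000) + (cum_value >= 3000) + (cum_value >= 5000)

# Base R equivalent: number of thresholds that are <= each value.
by_interval <- findInterval(cum_value, thresholds)

# Both give the same counts.
identical(as.integer(by_sum), by_interval)

# Indexing with 0 silently drops the element, so rows below the first
# threshold have to be filtered out before the counts are mapped to labels.
c("A", "B", "C")[by_interval[by_interval > 0]]
```

This also suggests why the `filter()` step comes before the labelling step: a count of 0 has no matching label.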

Answer 2

Score: 1

Please try the code below.

```r
library(dplyr)

df %>%
  group_by(ID) %>%
  mutate(new = cumsum(Value),          # running total per ID
         new2 = case_when(             # highest threshold reached so far
           new >= 5000 ~ 'C',
           new >= 3000 ~ 'B',
           new >= 1000 ~ 'A'
         )) %>%
  filter(!is.na(new2)) %>%             # drop rows below the first threshold
  group_by(ID, new2) %>%
  slice_head(n = 1) %>%                # first row of each ID/threshold group
  ungroup() %>%
  select(-c(new, new2))

# Output

#> # A tibble: 6 × 3
#>   ID    Value  Days
#>   <chr> <dbl> <dbl>
#> 1 A      1200    20
#> 2 A      1000    90
#> 3 A       500   115
#> 4 B       600    15
#> 5 B      2000    50
#> 6 B       900    70
```
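
The output above keeps the original Value column rather than the group label requested in the question. A possible variant of the same approach that keeps the label is sketched below; the column names `cum_value` and `Group` and the object `result` are assumptions for illustration, and the data frame is re-entered so the block runs on its own.

```r
library(dplyr)

# Sample data from the question, typed in by hand.
df <- data.frame(
  ID    = rep(c("A", "B"), each = 7),
  Value = c(0, 1200, 300, 1000, 1000, 1000, 500,
            0, 500, 600, 1000, 2000, 900, 1000),
  Days  = c(5, 20, 40, 67, 90, 110, 115,
            5, 10, 15, 25, 50, 70, 90)
)

result <- df %>%
  group_by(ID) %>%
  mutate(cum_value = cumsum(Value),      # running total per ID
         Group = case_when(              # NA when no threshold is reached
           cum_value >= 5000 ~ "Doses C",
           cum_value >= 3000 ~ "Doses B",
           cum_value >= 1000 ~ "Doses A"
         )) %>%
  filter(!is.na(Group)) %>%
  group_by(ID, Group) %>%
  slice_head(n = 1) %>%                  # first row where each group starts
  ungroup() %>%
  select(ID, Group, Days)                # keep the label instead of Value

result
```

The only changes from the answer are that the label is written out in full and kept in the final `select()`.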

Posted by huangapple on 2023-06-29 05:51:17
Original link: https://go.coder-hub.com/76576923.html