Increase a value if a number in a row changes
Question
I'm trying to use mutate() to build a counter that increases whenever the value in one column (col2) changes and resets to 1 whenever the value in another column (col1) changes, as in the following example:
col1 col2 count
0    1    1 
0    1    1
0    2    2
0    3    3
1    4    1
1    5    2
1    5    2
Resetting on changes in col1 works well, but incrementing on changes in col2 does not. I only get the following result:
col1 col2 count
0    1    1 
0    1    2
0    2    3
0    3    4
1    4    1
1    5    2
1    5    3
This is my working code:
df1 <- df %>%
  group_by(col1, col2) %>%
  mutate(counter = row_number()) %>%
  ungroup()
I already tried this:
df1 <- df %>%
  group_by(col1) %>%
  mutate(counter = row_number()) %>%
  group_by(col2) %>%
  mutate(counter = 'failed_code') %>%
  ungroup()
but functions like if_else or case_when didn't work with the arguments I gave them. How can I implement a counter that increases only when the value of col2 changes and resets to 1 when col1 changes?
Answer 1
Score: 4
Using consecutive_id() (introduced in dplyr 1.1.0), you could do:
library(dplyr, warn.conflicts = FALSE)
dat <- data.frame(
  col1 = c(0, 0, 0, 0, 1, 1, 1),
  col2 = c(1, 1, 2, 3, 4, 5, 5)
) 
dat |> 
  mutate(count = consecutive_id(col2), .by = col1)
#>   col1 col2 count
#> 1    0    1     1
#> 2    0    1     1
#> 3    0    2     2
#> 4    0    3     3
#> 5    1    4     1
#> 6    1    5     2
#> 7    1    5     2
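If you are on a dplyr version older than 1.1.0, where consecutive_id() and the .by argument are not available, something along these lines might work instead. This is only a sketch, not what consecutive_id() does internally: it compares each col2 value with the previous one inside each col1 group using lag() and accumulates the changes with cumsum(). It assumes col2 contains no NA values, since the comparison would then return NA.

```r
library(dplyr, warn.conflicts = FALSE)

dat <- data.frame(
  col1 = c(0, 0, 0, 0, 1, 1, 1),
  col2 = c(1, 1, 2, 3, 4, 5, 5)
)

res <- dat %>%
  group_by(col1) %>%
  # TRUE wherever col2 differs from the previous row of the same group;
  # the first row is compared with itself, so it never counts as a change
  mutate(count = cumsum(col2 != lag(col2, default = first(col2))) + 1) %>%
  ungroup()

res
```

The running sum starts at 0 in every group, hence the + 1 so that each group starts counting at 1.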
Answer 2
Score: 2
With data.table you can use rleid:
df <- structure(list(col1 = c(0L, 0L, 0L, 0L, 1L, 1L, 1L), col2 = c(1L, 
1L, 2L, 3L, 4L, 5L, 5L)), class = "data.frame", row.names = c(NA, 
-7L))
library(data.table)
setDT(df)
df[, count := rleid(col2), by = col1]
df
#   col1 col2 count
#1:    0    1     1
#2:    0    1     1
#3:    0    2     2
#4:    0    3     3
#5:    1    4     1
#6:    1    5     2
#7:    1    5     2
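If you would rather avoid extra packages, the same idea can probably be expressed in base R as well. The sketch below is an illustration under my own assumptions, not part of data.table: run_id is a hypothetical helper name that numbers the runs of identical consecutive values using rle(), and ave() applies it separately to each col1 group.

```r
df <- data.frame(
  col1 = c(0L, 0L, 0L, 0L, 1L, 1L, 1L),
  col2 = c(1L, 1L, 2L, 3L, 4L, 5L, 5L)
)

# Hypothetical helper: give each run of identical consecutive values
# in x its own id (1 for the first run, 2 for the second, ...)
run_id <- function(x) {
  r <- rle(x)
  rep(seq_along(r$lengths), times = r$lengths)
}

# Apply the helper within each col1 group; ave() keeps the original row order
df$count <- ave(df$col2, df$col1, FUN = run_id)
df
```

Note that ave() groups by the value of col1, just like by = col1 above, so rows with the same col1 value are treated as one group even if they are not adjacent.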