Increase a value if a number in a row changes

Question

I'm trying to use mutate() to build a counter column that increases whenever the value in another column (col2) changes and resets to 1 whenever the value in a third column (col1) changes, as in the following example:

col1 col2 count
0    1    1 
0    1    1
0    2    2
0    3    3
1    4    1
1    5    2
1    5    2

The reset when col1 changes works well, but the increment when col2 changes does not. I only get the following result:

col1 col2 count
0    1    1 
0    1    2
0    2    3
0    3    4
1    4    1
1    5    2
1    5    3

This is my working code:

df1 <- df %>%
  group_by(col1, col2) %>%
  mutate(counter = row_number()) %>%
  ungroup()

I already tried this:

df1 <- df %>%
  group_by(col1) %>%
  mutate(counter = row_number()) %>%
  group_by(col2) %>%
  mutate(counter = 'failed_code') %>%
  ungroup()

but functions like if_else or case_when didn't work with the arguments I gave them. How can I implement a counter that increases only when the value of col2 changes and resets to 1 when col1 changes?

Answer 1

Score: 4

Using consecutive_id() (introduced in dplyr 1.1.0) you could do:

library(dplyr, warn.conflicts = FALSE)

dat <- data.frame(
  col1 = c(0, 0, 0, 0, 1, 1, 1),
  col2 = c(1, 1, 2, 3, 4, 5, 5)
)

# consecutive_id() numbers the runs of col2; .by = col1 restarts the count per group
dat |>
  mutate(count = consecutive_id(col2), .by = col1)
#>   col1 col2 count
#> 1    0    1     1
#> 2    0    1     1
#> 3    0    2     2
#> 4    0    3     3
#> 5    1    4     1
#> 6    1    5     2
#> 7    1    5     2
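
If neither dplyr 1.1.0 nor data.table is available, the same per-group counter can probably be built in base R. The following is only a sketch under stated assumptions: `run_id` is a hypothetical helper name (not part of any package), and col2 is assumed to contain no NA values.

```r
# Base R sketch: per-group run counter without dplyr or data.table.
dat <- data.frame(
  col1 = c(0, 0, 0, 0, 1, 1, 1),
  col2 = c(1, 1, 2, 3, 4, 5, 5)
)

# run_id() is a hypothetical helper: it starts at 1 and adds 1
# each time an element differs from the one before it.
# Assumes x has at least one element and no NA values.
run_id <- function(x) cumsum(c(TRUE, x[-1] != x[-length(x)]))

# ave() applies run_id() separately within each value of col1,
# so the counter restarts at 1 for every col1 group.
dat$count <- ave(dat$col2, dat$col1, FUN = run_id)
dat
```

For the example data this should give the count column 1, 1, 2, 3, 1, 2, 2, matching the desired output in the question.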

Answer 2

Score: 2

With data.table you can use rleid:

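df <- structure(list(col1 = c(0L, 0L, 0L, 0L, 1L, 1L, 1L), col2 = c(1L, 
1L, 2L, 3L, 4L, 5L, 5L)), class = "data.frame", row.names = c(NA, 
-7L))
library(data.table)
setDT(df)  # convert df to a data.table by reference
# rleid() numbers consecutive runs of col2; by = col1 restarts the count per group
df[, count := rleid(col2), by = col1]
df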
#   col1 col2 count
#1:    0    1     1
#2:    0    1     1
#3:    0    2     2
#4:    0    3     3
#5:    1    4     1
#6:    1    5     2
#7:    1    5     2
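
One point worth keeping in mind with both rleid and consecutive_id: they number runs of consecutive equal values, not distinct values, so a value that reappears later in the same group gets a new id. A small base R illustration of that difference (the vector `x` and the variable names are made up for this sketch):

```r
x <- c(5, 5, 3, 5)

# Run-based ids: a new id starts whenever the value differs from the previous one.
r <- rle(x)
run_ids <- rep(seq_along(r$lengths), r$lengths)
run_ids
#> [1] 1 1 2 3

# Distinct-value ids, for contrast: the last 5 shares an id with the first two.
value_ids <- match(x, unique(x))
value_ids
#> [1] 1 1 2 1
```

If the counter should instead track distinct values within a group, the match()-style numbering is the one to use; for the data in the question both give the same result because col2 never returns to an earlier value.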

Posted by huangapple on June 5, 2023, 23:59:58
Source: https://go.coder-hub.com/76408161.html