March 10, 2023, 01:13:45

Count rows after value 1 where the rest of the rows are NAs in R

Question

I have this data:

library(tibble)
df <- tibble(a = c(NA, NA, 1, NA, NA, NA, 1, NA, NA))

I want to fill the NAs after each occurrence of 1 with a running count, like this:

df2 <- tibble(a = c(NA, NA, 1, 2, 3, 4, 1, 2, 3))

I have no solution so far. Any ideas?

Answer 1

Score: 2

One possible way to solve this within the tidyverse:

library(dplyr)
df %>%
    # !is.na(a) is TRUE (1) on non-NA rows, so the cumulative sum starts a new group at each non-NA value
    dplyr::mutate(grp = cumsum(!is.na(a))) %>%
    # group by that id
    dplyr::group_by(grp) %>%
    # number the rows within each group; group 0 (the rows before the first 1) stays NA
    dplyr::transmute(a = ifelse(grp != 0, dplyr::row_number(), NA)) %>%
    # release the grouping to prevent unwanted behaviour downstream
    dplyr::ungroup() %>%
    # drop grp if you do not need it in later calculations
    dplyr::select(-grp)
# A tibble: 9 x 1
      a
  <int>
1    NA
2    NA
3     1
4     2
5     3
6     4
7     1
8     2
9     3
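
For readers who want to see the grouping idea in isolation, here is a minimal base R sketch of the same approach. It is not part of the original answer; the names `grp` and `counts` are only illustrative, and it assumes, as the answer does, that every non-NA value marks the start of a new count.

```r
# Example vector from the question
a <- c(NA, NA, 1, NA, NA, NA, 1, NA, NA)

# Each non-NA value starts a new group; the leading NAs fall into group 0
grp <- cumsum(!is.na(a))

# Number the rows within each group (ave applies seq_along per group)
counts <- ave(seq_along(a), grp, FUN = seq_along)

# Blank out group 0, i.e. the rows before the first non-NA value
counts[grp == 0] <- NA

counts
```

Printing `grp` before the last step is a quick way to check that the groups line up with the 1s in the data.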

Answer 2

Score: 1

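Using data.table:

library(data.table)
# group by cumsum(!is.na(a)); NA^TRUE is NA and NA^FALSE is 1,
# so an all-NA group (the leading rows) gets NA instead of a count
setDT(df)[, a2 := seq_len(.N) * NA^(all(is.na(a))), cumsum(!is.na(a))]

Output:

> df
    a a2
1: NA NA
2: NA NA
3:  1  1
4: NA  2
5: NA  3
6: NA  4
7:  1  1
8: NA  2
9: NA  3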
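
The `NA^(...)` factor is the least obvious part of this one-liner, so here is a small sketch of what it does on its own. The vectors `leading` and `counted` are made up for illustration and stand in for one all-NA group and one group that starts with a 1.

```r
# In R, anything raised to the power 0 is 1, including NA;
# NA raised to the power 1 stays NA
na_pow_zero <- NA^0
na_pow_one  <- NA^1

# Logical exponents are coerced to 0/1, so a condition can switch the result
leading <- c(NA, NA)         # hypothetical group with no 1 in it
counted <- c(1, NA, NA, NA)  # hypothetical group starting at a 1

res_leading <- seq_along(leading) * NA^all(is.na(leading))  # all NA
res_counted <- seq_along(counted) * NA^all(is.na(counted))  # 1, 2, 3, 4
```

An explicit `if (all(is.na(a))) NA_real_ else seq_len(.N)` would probably read more clearly in production code; the multiplication trick is just more compact.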

Answer 3

Score: 0

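With sequence():

# positions of the 1s
idx <- which(df$a == 1)
# from the first 1 onward, count 1, 2, ... up to the row before the next 1 (or the end)
df$a[idx[1]:length(df$a)] <- sequence(diff(c(idx, length(df$a) + 1)))
#df$a
#[1] NA NA  1  2  3  4  1  2  3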
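
To make the one-liner easier to follow, the sketch below splits it into its intermediate steps on a plain vector. The variable `run_lengths` and the `length(idx) > 0` guard are additions for illustration, not part of the original answer; the guard is there because `idx[1]` would be `NA` if the data contained no 1 at all.

```r
a <- c(NA, NA, 1, NA, NA, NA, 1, NA, NA)

# Positions of the 1s (which() silently drops the NA comparisons)
idx <- which(a == 1)

# Distance from each 1 to the next one, or to one past the end for the last
run_lengths <- diff(c(idx, length(a) + 1))

# sequence() concatenates 1:n for each run length
filled <- sequence(run_lengths)

# Only overwrite when at least one 1 was found
if (length(idx) > 0) {
  a[idx[1]:length(a)] <- filled
}

a
```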
