Forward fill first instances of NAs in R data.table


Question


I have a data.table with a column like:

c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12)

I would like to fill only the first two NAs following each non-NA value in the column by carrying the last non-NA value forward. The result should be:

c(58,58,58,NA,NA,13,13,13,NA,12,23,23,12)

Any suggestions?

Answer 1

Score: 1

One crude way in base R:
```r
x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)

na_locf_max <- function(x, nmax) {
  # Split the vector into runs: each non-NA value followed by its NAs
  s <- split(x, cumsum(!is.na(x)))

  # Copy the first value into the next nmax positions, then trim back to
  # the original run length. SIMPLIFY = FALSE keeps the result a list even
  # when all runs happen to have the same length.
  l <- mapply(\(x, y) {
    x[seq_len(nmax) + 1] <- x[1]
    length(x) <- y
    x
  }, s, lengths(s), SIMPLIFY = FALSE)

  # Flatten back into a single vector
  unlist(l, use.names = FALSE)
}

na_locf_max(x, nmax = 2)
# [1] 58 58 58 NA NA 13 13 13 NA 12 23 23 12
```
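To see why the splitting step works, it may help to look at the grouping key on its own. This is a minimal sketch reusing the question's vector; the names `grp` and `s` are only for illustration.

```r
x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)

# Each non-NA value bumps the counter, so a value and the NAs that
# follow it share one group id.
grp <- cumsum(!is.na(x))
grp
# [1] 1 1 1 1 1 2 2 2 2 3 4 4 5

# One list element per group: the value first, then its trailing NAs.
s <- split(x, grp)
lengths(s)
# 1 2 3 4 5
# 5 4 1 2 1
```

Any NAs at the very start of the vector would land in group 0, where the first element is itself NA, so they stay NA.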

Answer 2

Score: 0

A data.table solution using rleid for grouping and shift to access the numbers.

library(data.table)

dt[, .(V1, grp = rleid(V1), V1shift = shift(V1, 1)),][
   , .(V1, ifelse(is.na(V1shift), shift(V1shift, 1), V1shift)), by = grp][
   , .(V1, res = ifelse(is.na(V1), V2, V1)),]
    V1 res
 1: 58  58
 2: NA  58
 3: NA  58
 4: NA  NA
 5: NA  NA
 6: 13  13
 7: NA  13
 8: NA  13
 9: NA  NA
10: 12  12
11: 23  23
12: NA  23
13: 12  12
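The chain is easier to follow if its first step is run on its own. This is a small sketch using the same data as this answer, rebuilt with `data.table()` so the block runs by itself; the name `step1` is only for illustration.

```r
library(data.table)

dt <- data.table(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12))

# grp: run id, so each stretch of consecutive NAs forms its own group
# V1shift: the previous row's value (NA for the first row)
step1 <- dt[, .(V1, grp = rleid(V1), V1shift = shift(V1, 1))]

step1$grp
# [1] 1 2 2 2 2 3 4 4 4 5 6 7 8
step1$V1shift
# [1] NA 58 NA NA NA NA 13 NA NA NA 12 23 NA
```

Within each NA group, the first row already sees the preceding value through V1shift, and the second row picks it up through the second shift. That is why this approach fills exactly two NAs; filling a different number would need a different chain.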

Data

dt <- structure(list(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 
23, NA, 12)), row.names = c(NA, -13L), class = c("data.table", 
"data.frame"))

Answer 3

Score: 0


One possible way to solve your problem:

library(data.table)
dt = data.table(V1 = c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12))

dt[, V2 := fifelse(is.na(V1) & rowid(rleid(V1)) <= 2, nafill(V1, "locf"), V1)]

	   V1    V2
 1:    58    58
 2:    NA    58
 3:    NA    58
 4:    NA    NA
 5:    NA    NA
 6:    13    13
 7:    NA    13
 8:    NA    13
 9:    NA    NA
10:    12    12
11:    23    23
12:    NA    23
13:    12    12
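The same idea can be wrapped in a small function so that the number of NAs to fill becomes a parameter. This is only a sketch: `fill_first_n` and its argument `n` are hypothetical names, not part of data.table, and it assumes a numeric column, since `nafill` does the forward fill.

```r
library(data.table)

# Hypothetical helper: forward fill at most n NAs after each non-NA value.
fill_first_n <- function(x, n) {
  # Position of each element within its run; inside a run of NAs this
  # counts 1, 2, 3, ...
  run_pos <- rowid(rleid(x))
  # Take the carried-forward value only for the first n NAs of each run
  fifelse(is.na(x) & run_pos <= n, nafill(x, type = "locf"), x)
}

x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)

fill_first_n(x, 2)
# [1] 58 58 58 NA NA 13 13 13 NA 12 23 23 12
fill_first_n(x, 1)
# [1] 58 58 NA NA NA 13 13 NA NA 12 23 23 12
```

Inside a data.table it could be applied by reference, for example `dt[, V2 := fill_first_n(V1, 2)]`.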

huangapple
  • Posted on June 15, 2023, 20:30:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76482499.html