Forward fill first instances of NAs in R data.table

Question

I have a data.table with a column like:

c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12)

I would like to fill only the first two NAs following each non-NA value in the column, carrying the last non-NA value forward. The result should be:

c(58,58,58,NA,NA,13,13,13,NA,12,23,23,12)

Any suggestions?

Answer 1

Score: 1

One crude way in base R:

```r
x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)

na_locf_max <- function(x, nmax) {
  # Split the vector into runs, each starting at a non-NA value
  s <- split(x, cumsum(!is.na(x)))
  # Copy the first value of each run into the next nmax positions,
  # then trim the run back to its original length
  l <- mapply(\(x, y) {
    x[seq_len(nmax) + 1] <- x[1]
    length(x) <- y
    x
  }, s, lengths(s), SIMPLIFY = FALSE)
  # Flatten the list of runs back into a vector
  unlist(l, use.names = FALSE)
}

na_locf_max(x, nmax = 2)
# [1] 58 58 58 NA NA 13 13 13 NA 12 23 23 12
```
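
The same idea can also be written without splitting the vector. The following is only a sketch, not part of the original answer: `na_locf_limit` is a hypothetical name, and the approach assumes that counting each element's distance from the last non-NA value is an acceptable way to decide what gets filled.

```r
# Hypothetical helper: fill at most `nmax` NAs after each non-NA value.
na_locf_limit <- function(x, nmax) {
  not_na <- !is.na(x)
  # Run id: increases by one at every non-NA value (0 for leading NAs)
  grp <- cumsum(not_na)
  # Position inside each run: 0 for the non-NA head, then 1, 2, ... for the NAs
  pos <- ave(seq_along(x), grp, FUN = seq_along) - 1L
  # Fill only NAs that have a previous value and sit within nmax of it
  fill <- grp > 0 & pos >= 1 & pos <= nmax
  out <- x
  out[fill] <- x[not_na][grp[fill]]
  out
}

x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)
na_locf_limit(x, nmax = 2)
# [1] 58 58 58 NA NA 13 13 13 NA 12 23 23 12
```

Leading NAs fall into run 0 and are left untouched, since there is no earlier value to carry forward.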

Answer 2

Score: 0

A data.table solution using rleid for grouping and shift to access the numbers.

    library(data.table)

    # Step 1: label runs with rleid() and take the previous value with shift()
    dt[, .(V1, grp = rleid(V1), V1shift = shift(V1, 1))][
      # Step 2: within each run, reuse the shifted value for the second row
      , .(V1, ifelse(is.na(V1shift), shift(V1shift, 1), V1shift)), by = grp][
      # Step 3: keep original values, fill NAs from the helper column V2
      , .(V1, res = ifelse(is.na(V1), V2, V1))]
    #     V1 res
    #  1: 58  58
    #  2: NA  58
    #  3: NA  58
    #  4: NA  NA
    #  5: NA  NA
    #  6: 13  13
    #  7: NA  13
    #  8: NA  13
    #  9: NA  NA
    # 10: 12  12
    # 11: 23  23
    # 12: NA  23
    # 13: 12  12

Data

    dt <- structure(list(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12,
    23, NA, 12)), row.names = c(NA, -13L), class = c("data.table",
    "data.frame"))
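
The chain above is written for exactly two fills. A possible generalization of the shift idea to any number of fills is sketched below. It is not from the answer: `fill_limited` is a hypothetical helper, and it assumes `nmax` is at least 1. It relies on `shift()` returning a list when given several lags and on `fcoalesce()` picking the first non-NA value across its arguments.

```r
library(data.table)

# Hypothetical helper: fill at most `nmax` NAs after each non-NA value.
fill_limited <- function(x, nmax = 2L) {
  stopifnot(nmax >= 1L)
  # shift() with a vector of lags returns a list: lag 1, lag 2, ..., lag nmax
  lags <- shift(x, seq_len(nmax))
  # A single lag comes back as a plain vector, so wrap it in a list
  if (!is.list(lags)) lags <- list(lags)
  # Take the first non-NA among x, lag 1, lag 2, ...
  do.call(fcoalesce, c(list(x), lags))
}

dt <- data.table(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12))
dt[, res := fill_limited(V1, nmax = 2L)]
dt$res
# [1] 58 58 58 NA NA 13 13 13 NA 12 23 23 12
```

An NA is filled only if a non-NA value appears among the `nmax` positions before it, which is the same condition as being among the first `nmax` NAs after a value.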

Answer 3

Score: 0

One possible way to solve your problem:

    library(data.table)

    dt = data.table(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12))

    # rowid(rleid(V1)) numbers the rows inside each run, so an NA is filled
    # only when it is the first or second element of its run of NAs
    dt[, V2 := fifelse(is.na(V1) & rowid(rleid(V1)) <= 2, nafill(V1, "locf"), V1)][]
    #     V1 V2
    #  1: 58 58
    #  2: NA 58
    #  3: NA 58
    #  4: NA NA
    #  5: NA NA
    #  6: 13 13
    #  7: NA 13
    #  8: NA 13
    #  9: NA NA
    # 10: 12 12
    # 11: 23 23
    # 12: NA 23
    # 13: 12 12
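
If the limit needs to vary, the expression above could be wrapped in a small function. This is a sketch only: `fill_first_nas` and its `nmax` argument are hypothetical names introduced here, and the function assumes a numeric input, since that is what the answer's example uses with `nafill()`.

```r
library(data.table)

# Hypothetical wrapper around the one-liner above, with the limit as a parameter.
fill_first_nas <- function(x, nmax = 2L) {
  # Position of each element inside its run of identical (or NA) values
  run_pos <- rowid(rleid(x))
  # Fill an NA only when it is among the first nmax elements of its NA run
  fifelse(is.na(x) & run_pos <= nmax, nafill(x, "locf"), x)
}

x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)
fill_first_nas(x, nmax = 2L)
# [1] 58 58 58 NA NA 13 13 13 NA 12 23 23 12
fill_first_nas(x, nmax = 1L)
# [1] 58 58 NA NA NA 13 13 NA NA 12 23 23 12
```

Inside a data.table it would be used as `dt[, V2 := fill_first_nas(V1, 2L)]`.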

Posted by huangapple on June 15, 2023 at 20:30:44
Link to this post: https://go.coder-hub.com/76482499.html