在R数据表中向后填充最后的NA值。

huangapple go评论75阅读模式
英文:

Backwards fill last instances of NAs in R Data Table

问题

library(data.table)

dt <- data.table(V1 = c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12))
dt[, rleid := rleid(V1)]
dt[, num := seq_len(.N), by = rleid]
dt[, max_num := max(num, na.rm = TRUE), by = rleid]

# Fill NAs based on condition
dt[, V2 := ifelse(is.na(V1) & num <= 2, V1, NA), by = rleid]

# Fill NAs forward
dt[, V2 := nafill(V2, type = "locf"), by = rleid]

# Remove the temporary columns
dt[, c("rleid", "num", "max_num") := NULL]

# Result
result <- dt$V2
result

这个代码会更快地填充NA值,得到你期望的结果。

英文:

I have a data.table with a column like:

c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12)

I would like to fill only the two NAs before each non-NA value in the column by carrying backwards the next non-NA value. The result should be :

c(58,NA,NA,13,13,13,NA,12,12,12,23,12,12)

I have managed to do it with :

dt = data.table(V1 = c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12))
dt[, rleid:=rleid(dt$V1)]
dt[, num := seq(.N), rleid]

u=1
arr = c()
for (i in 1:(nrow(dt)-1)){
  if(dt$rleid[i] == dt$rleid[i+1]){
    u=u+1
    next
  }
  else{
    arr = append(arr,u)}
  u=1
}
arr=append(arr,1)

v=c()
for (i in 1:(length(arr))){
  for (j in 1:arr[i]){
    v=append(v,arr[i])
  }
}

dt[, len:=v]
dt[, val:=len-num]
dt[, V2 := fifelse(is.na(V1) &amp; val&lt;=1, nafill(V1, &quot;nocb&quot;), V1)]

This solution takes too long for a big data table. Any suggestions that are faster ?

答案1

得分: 4

以下是翻译好的部分:

A quick and dirty data.table solution:

快速且不太规范的 data.table 解决方案:

dt[, V1b := fcoalesce(c(list(V1), shift(V1, -(1:2))))]

Or simply (as suggested by B. Christian Kamgang)

或者简单地(如B. Christian Kamgang建议的)

dt[, V1b := fcoalesce(shift(V1, -(0:2)))]

  V1   V1b
&lt;num&gt; &lt;num&gt;

1: 58 58
2: NA NA
3: NA NA
4: NA 13
5: NA 13
6: 13 13
7: NA NA
8: NA 12
9: NA 12
10: 12 12
11: 23 23
12: NA 12
13: 12 12

英文:

A quick and dirty data.table solution:

dt[, V1b := fcoalesce(c(list(V1), shift(V1, -(1:2))))]

# Or simply (as suggested by B. Christian Kamgang)
dt[, V1b := fcoalesce(shift(V1, -(0:2)))]


      V1   V1b
    &lt;num&gt; &lt;num&gt;
 1:    58    58
 2:    NA    NA
 3:    NA    NA
 4:    NA    13
 5:    NA    13
 6:    13    13
 7:    NA    NA
 8:    NA    12
 9:    NA    12
10:    12    12
11:    23    23
12:    NA    12
13:    12    12

答案2

得分: 2

你可以修改提供给你的函数这里,通过两次反转向量来实现,即:

na_locf_max_backwards <- function(x, nmax){
  x <- rev(x)
  s <- split(x, cumsum(!is.na(x)))
  l <- mapply(\(x, y) {
        x[1:nmax+1] <- x[1]
        length(x) <- y
        x
      }, s, lengths(s))
  x <- unlist(l, use.names = FALSE)
  x <- rev(x)
  x
}

na_locf_max_backwards(c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12), nmax = 2)
# [1] 58 NA NA 13 13 13 NA 12 12 12 23 12 12
英文:

You can modify the function given to you here by reversing the vector twice, i.e.

na_locf_max_backwards &lt;- function(x, nmax){
  x &lt;- rev(x)
  s &lt;- split(x, cumsum(!is.na(x)))
  l &lt;- mapply(\(x, y) {
        x[1:nmax+1] &lt;- x[1]
        length(x) &lt;- y
        x
      }, s, lengths(s))
  x &lt;- unlist(l, use.names = FALSE)
  x &lt;- rev(x)
  x
}

na_locf_max_backwards(c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12), nmax = 2)
# [1] 58 NA NA 13 13 13 NA 12 12 12 23 12 12

答案3

得分: 1

另一个 data.table 解决方案:

dt[, V2 := fifelse(rev(rowid(rev(rleid(V1)))) <= 2, nafill(V1, "nocb"), V1)]
	   V1    V2
 1:    58    58
 2:    NA    NA
 3:    NA    NA
 4:    NA    13
 5:    NA    13
 6:    13    13
 7:    NA    NA
 8:    NA    12
 9:    NA    12
10:    12    12
11:    23    23
12:    NA    12
13:    12    12
英文:

Another data.table solution:

dt[, V2 := fifelse(rev(rowid(rev(rleid(V1))))&lt;=2, nafill(V1, &quot;nocb&quot;), V1)]

	   V1    V2
 1:    58    58
 2:    NA    NA
 3:    NA    NA
 4:    NA    13
 5:    NA    13
 6:    13    13
 7:    NA    NA
 8:    NA    12
 9:    NA    12
10:    12    12
11:    23    23
12:    NA    12
13:    12    12

huangapple
  • 本文由 发表于 2023年6月19日 21:54:29
  • 转载请务必保留本文链接:https://go.coder-hub.com/76507315.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定