在R数据表中向后填充最后的NA值。

huangapple go评论94阅读模式
英文:

Backwards fill last instances of NAs in R Data Table

问题

  1. library(data.table)
  2. dt <- data.table(V1 = c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12))
  3. dt[, rleid := rleid(V1)]
  4. dt[, num := seq_len(.N), by = rleid]
  5. dt[, max_num := max(num, na.rm = TRUE), by = rleid]
  6. # Fill NAs based on condition
  7. dt[, V2 := ifelse(is.na(V1) & num <= 2, V1, NA), by = rleid]
  8. # Fill NAs forward
  9. dt[, V2 := nafill(V2, type = "locf"), by = rleid]
  10. # Remove the temporary columns
  11. dt[, c("rleid", "num", "max_num") := NULL]
  12. # Result
  13. result <- dt$V2
  14. result

这个代码会更快地填充NA值,得到你期望的结果。

英文:

I have a data.table with a column like:

c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12)

I would like to fill only the two NAs before each non-NA value in the column by carrying backwards the next non-NA value. The result should be :

c(58,NA,NA,13,13,13,NA,12,12,12,23,12,12)

I have managed to do it with :

  1. dt = data.table(V1 = c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12))
  2. dt[, rleid:=rleid(dt$V1)]
  3. dt[, num := seq(.N), rleid]
  4. u=1
  5. arr = c()
  6. for (i in 1:(nrow(dt)-1)){
  7. if(dt$rleid[i] == dt$rleid[i+1]){
  8. u=u+1
  9. next
  10. }
  11. else{
  12. arr = append(arr,u)}
  13. u=1
  14. }
  15. arr=append(arr,1)
  16. v=c()
  17. for (i in 1:(length(arr))){
  18. for (j in 1:arr[i]){
  19. v=append(v,arr[i])
  20. }
  21. }
  22. dt[, len:=v]
  23. dt[, val:=len-num]
  24. dt[, V2 := fifelse(is.na(V1) &amp; val&lt;=1, nafill(V1, &quot;nocb&quot;), V1)]

This solution takes too long for a big data table. Any suggestions that are faster ?

答案1

得分: 4

以下是翻译好的部分:

A quick and dirty data.table solution:

快速且不太规范的 data.table 解决方案:

dt[, V1b := fcoalesce(c(list(V1), shift(V1, -(1:2))))]

Or simply (as suggested by B. Christian Kamgang)

或者简单地(如B. Christian Kamgang建议的)

dt[, V1b := fcoalesce(shift(V1, -(0:2)))]

  1. V1 V1b
  2. &lt;num&gt; &lt;num&gt;

1: 58 58
2: NA NA
3: NA NA
4: NA 13
5: NA 13
6: 13 13
7: NA NA
8: NA 12
9: NA 12
10: 12 12
11: 23 23
12: NA 12
13: 12 12

英文:

A quick and dirty data.table solution:

  1. dt[, V1b := fcoalesce(c(list(V1), shift(V1, -(1:2))))]
  2. # Or simply (as suggested by B. Christian Kamgang)
  3. dt[, V1b := fcoalesce(shift(V1, -(0:2)))]
  4. V1 V1b
  5. &lt;num&gt; &lt;num&gt;
  6. 1: 58 58
  7. 2: NA NA
  8. 3: NA NA
  9. 4: NA 13
  10. 5: NA 13
  11. 6: 13 13
  12. 7: NA NA
  13. 8: NA 12
  14. 9: NA 12
  15. 10: 12 12
  16. 11: 23 23
  17. 12: NA 12
  18. 13: 12 12

答案2

得分: 2

你可以修改提供给你的函数这里,通过两次反转向量来实现,即:

  1. na_locf_max_backwards <- function(x, nmax){
  2. x <- rev(x)
  3. s <- split(x, cumsum(!is.na(x)))
  4. l <- mapply(\(x, y) {
  5. x[1:nmax+1] <- x[1]
  6. length(x) <- y
  7. x
  8. }, s, lengths(s))
  9. x <- unlist(l, use.names = FALSE)
  10. x <- rev(x)
  11. x
  12. }
  13. na_locf_max_backwards(c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12), nmax = 2)
  14. # [1] 58 NA NA 13 13 13 NA 12 12 12 23 12 12
英文:

You can modify the function given to you here by reversing the vector twice, i.e.

  1. na_locf_max_backwards &lt;- function(x, nmax){
  2. x &lt;- rev(x)
  3. s &lt;- split(x, cumsum(!is.na(x)))
  4. l &lt;- mapply(\(x, y) {
  5. x[1:nmax+1] &lt;- x[1]
  6. length(x) &lt;- y
  7. x
  8. }, s, lengths(s))
  9. x &lt;- unlist(l, use.names = FALSE)
  10. x &lt;- rev(x)
  11. x
  12. }
  13. na_locf_max_backwards(c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12), nmax = 2)
  14. # [1] 58 NA NA 13 13 13 NA 12 12 12 23 12 12

答案3

得分: 1

另一个 data.table 解决方案:

  1. dt[, V2 := fifelse(rev(rowid(rev(rleid(V1)))) <= 2, nafill(V1, "nocb"), V1)]
  1. V1 V2
  2. 1: 58 58
  3. 2: NA NA
  4. 3: NA NA
  5. 4: NA 13
  6. 5: NA 13
  7. 6: 13 13
  8. 7: NA NA
  9. 8: NA 12
  10. 9: NA 12
  11. 10: 12 12
  12. 11: 23 23
  13. 12: NA 12
  14. 13: 12 12
英文:

Another data.table solution:

  1. dt[, V2 := fifelse(rev(rowid(rev(rleid(V1))))&lt;=2, nafill(V1, &quot;nocb&quot;), V1)]
  2. V1 V2
  3. 1: 58 58
  4. 2: NA NA
  5. 3: NA NA
  6. 4: NA 13
  7. 5: NA 13
  8. 6: 13 13
  9. 7: NA NA
  10. 8: NA 12
  11. 9: NA 12
  12. 10: 12 12
  13. 11: 23 23
  14. 12: NA 12
  15. 13: 12 12

huangapple
  • 本文由 发表于 2023年6月19日 21:54:29
  • 转载请务必保留本文链接:https://go.coder-hub.com/76507315.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定