Forward fill first instances of NAs in R data.table
Question
I have a data.table with a column like:
c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12)
I would like to fill only the first two NAs following each non-NA value in the column by carrying the last non-NA value forward. The result should be:
c(58,58,58,NA,NA,13,13,13,NA,12,23,23,12)
Any suggestions?
Answer 1
Score: 1
One crude way in base R (the `\(...)` lambda shorthand needs R 4.1 or later):
```r
x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)

na_locf_max <- function(x, nmax) {
  # Split the vector into runs: each non-NA value plus the NAs that follow it
  s <- split(x, cumsum(!is.na(x)))
  # Copy each run's first value into its next nmax slots, then trim the run
  # back to its original length (the assignment may have extended it)
  l <- mapply(\(run, len) {
    run[seq_len(nmax) + 1] <- run[1]
    length(run) <- len
    run
  }, s, lengths(s), SIMPLIFY = FALSE)
  # Flatten the list of runs back into a single vector
  unlist(l, use.names = FALSE)
}

na_locf_max(x, nmax = 2)
# [1] 58 58 58 NA NA 13 13 13 NA 12 23 23 12
```
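For comparison, the same grouping idea can be written without `split` and `mapply`. This is only a sketch: `locf_limit_base` is a made-up name, not a function from any package, and it assumes `x` is a plain atomic vector. `cumsum(!is.na(x))` labels each non-NA value together with the NAs that follow it, and `ave` numbers the positions inside each of those groups.

```r
# Hypothetical helper (name chosen for illustration only)
locf_limit_base <- function(x, nmax) {
  # Group id: increases by one at every non-NA value
  grp <- cumsum(!is.na(x))
  # Position inside the group: 1 for the non-NA value, 2, 3, ... for its NAs
  pos <- ave(seq_along(x), grp, FUN = seq_along)
  # Start from a full forward fill: every element takes its group's first value
  out <- x[match(grp, grp)]
  # Blank out everything beyond the first nmax NAs of each group
  out[pos > nmax + 1] <- NA
  out
}

x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)
locf_limit_base(x, nmax = 2)
# [1] 58 58 58 NA NA 13 13 13 NA 12 23 23 12
```

Leading NAs fall into group 0, whose first value is itself NA, so they stay NA.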
Answer 2
Score: 0
A data.table solution using `rleid` for grouping and `shift` to access the numbers.
```r
library(data.table)

dt[, .(V1, grp = rleid(V1), V1shift = shift(V1, 1))][
  , .(V1, ifelse(is.na(V1shift), shift(V1shift, 1), V1shift)), by = grp][
  , .(V1, res = ifelse(is.na(V1), V2, V1))]
#     V1 res
#  1: 58  58
#  2: NA  58
#  3: NA  58
#  4: NA  NA
#  5: NA  NA
#  6: 13  13
#  7: NA  13
#  8: NA  13
#  9: NA  NA
# 10: 12  12
# 11: 23  23
# 12: NA  23
# 13: 12  12
```
Data
```r
dt <- structure(list(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12,
23, NA, 12)), row.names = c(NA, -13L), class = c("data.table",
"data.frame"))
```
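To see why this chain works, it may help to look at the two helper columns from its first step on their own. The sketch below rebuilds the same data with `data.table()` and stores the intermediate result in `step1`, a name introduced here for illustration. `rleid` gives every run of consecutive NAs a single group id, and `shift` places the previous value on the first NA of each run. Applying `shift` once more inside each group carries that value one row further, which is why the chain fills exactly two NAs.

```r
library(data.table)

dt <- data.table(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12))

# First step of the chain only: run ids and the lagged values
step1 <- dt[, .(V1, grp = rleid(V1), V1shift = shift(V1, 1))]

step1$grp
# [1] 1 2 2 2 2 3 4 4 4 5 6 7 8
step1$V1shift
# [1] NA 58 NA NA NA NA 13 NA NA NA 12 23 NA
```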
Answer 3
Score: 0
One possible way to solve your problem:
```r
library(data.table)

dt <- data.table(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12))
dt[, V2 := fifelse(is.na(V1) & rowid(rleid(V1)) <= 2, nafill(V1, "locf"), V1)]
dt
#     V1 V2
#  1: 58 58
#  2: NA 58
#  3: NA 58
#  4: NA NA
#  5: NA NA
#  6: 13 13
#  7: NA 13
#  8: NA 13
#  9: NA NA
# 10: 12 12
# 11: 23 23
# 12: NA 23
# 13: 12 12
```
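If the limit of two needs to vary, the one-liner above could be wrapped in a small function. This is a sketch under a few assumptions: `locf_limit_dt` and its argument `n` are names invented here, not part of data.table, and `x` is assumed to be a numeric vector, which is what `nafill` is documented to handle. `rowid(rleid(x))` gives each element its position inside its run, so the condition selects only the first `n` NAs of every NA run.

```r
library(data.table)

# Hypothetical wrapper (not part of data.table): forward fill at most n NAs
# after each non-NA value
locf_limit_dt <- function(x, n = 2L) {
  run_pos <- rowid(rleid(x))  # 1, 2, 3, ... within each run of equal values / NAs
  fifelse(is.na(x) & run_pos <= n, nafill(x, type = "locf"), x)
}

dt <- data.table(V1 = c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12))
dt[, V2 := locf_limit_dt(V1, n = 2L)]
dt[, V3 := locf_limit_dt(V1, n = 1L)]
dt$V3
# [1] 58 58 NA NA NA 13 13 NA NA 12 23 23 12
```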