R imputation of missing value using autocorrelation
Question
I'm trying to fill in two missing values. My lecturer suggests choosing the values that give the largest autocorrelation, using the following steps:
- Find the minimum and maximum of the dataset. Suppose I use the AirPassengers data with two observations omitted. The minimum is 104 and the maximum is 622.
- For each combination of the two missing values, compute the lag-1 autocorrelation (ACF). The two missing values are replaced by every pair of numbers in the range `104 <= x <= 622`.
- Select the imputed values that give the largest autocorrelation.
- The expected output is a matrix of the lag-1 autocorrelations of the imputed time series.
I'm trying to compute this in R, but my code throws an error and I'm not sure how to continue. Here is the code:
AirPassengers[43]<-NA
AirPassengers[100]<-NA
Fun_mv = function(g,h){
  g=104:622
  n=length(g)
  empty_matrix=matrix(nrow = n, ncol = n, dimnames = list(g,g))
  for (i in g){
    for (j in g){
      AirPassengers[43]=i
      AirPassengers[100]=j
      empty_matrix[i,j]=acf(AirPassengers)$acf[2]
    }
  }
}
h=outer(g,g,FUN = Fun_mv);h
Any help is greatly appreciated!
Answer 1
Score: 0
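There is no need for a call to `outer`; the function's double loop already covers every combination.
Note that the assignment to `AirPassengers[43]` is outside the inner loop, since it only depends on the outer index. Also, `acf(., plot = FALSE)` saves a lot of time by skipping the plot that `acf` would otherwise draw on every call.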
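As a side note on why the question's version errors: the loop there runs over the candidate values themselves (`for (i in g)`) and then uses them as matrix subscripts, so `empty_matrix[104, 104]` asks for row 104 of a 519-row matrix whose first row is merely *named* "104". The following is a minimal sketch of that behaviour on a toy matrix; `g_small`, `m` and `msg` are names made up for this illustration.

```r
# Toy 3 x 3 matrix whose dimnames are the candidate values, as in the question.
g_small <- 104:106
m <- matrix(NA_real_, nrow = 3, ncol = 3, dimnames = list(g_small, g_small))

# A numeric subscript is positional, so "row 104" does not exist here.
msg <- tryCatch(
  {
    m[104, 104] <- 1
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Indexing by position, or by the character name, works.
m[1, 1] <- 0.5
m["105", "106"] <- 0.7
print(m)
```

Two further points about the code as posted: a function whose last expression is a `for` loop returns `NULL`, and `g` is only defined inside `Fun_mv`, so the top-level `outer(g, g, ...)` call cannot find it. The function in this answer loops over positions with `seq_along()` and returns the matrix explicitly for these reasons.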
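Fun_mv <- function(g, h) {
  # Rows hold candidates for observation 43, columns for observation 100.
  empty_matrix <- matrix(nrow = length(g), ncol = length(h), dimnames = list(g, h))
  for (i in seq_along(g)) {
    # Depends only on i, so set it once per outer iteration.
    AirPassengers[43] <- g[i]
    for (j in seq_along(h)) {
      AirPassengers[100] <- h[j]
      # Element 2 of $acf is the lag-1 autocorrelation (element 1 is lag 0).
      empty_matrix[i, j] <- acf(AirPassengers, plot = FALSE)$acf[2]
    }
  }
  empty_matrix
}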
AirPassengers[43] <- NA
AirPassengers[100] <- NA
g <- 104:622
h <- Fun_mv(g, g)
str(h)
#> num [1:519, 1:519] 0.871 0.871 0.871 0.871 0.871 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:519] "104" "105" "106" "107" ...
#> ..$ : chr [1:519] "104" "105" "106" "107" ...
<sup>Created on 2023-05-13 with reprex v2.0.2</sup>
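The question's third step, picking the pair with the largest lag-1 autocorrelation, can be read off the matrix with `which(..., arr.ind = TRUE)`. The sketch below is self-contained and makes one assumption for speed: it uses a coarse candidate grid (`cand`, step 50) instead of every integer in `104:622`. The names `ap`, `cand`, `acf1`, `best` and `imputed` are illustrative only.

```r
# Work on a copy so the built-in dataset is left untouched.
ap <- AirPassengers
ap[c(43, 100)] <- NA

# Assumption: a coarse grid keeps the example fast;
# the question uses every integer in 104:622.
cand <- seq(104, 622, by = 50)

acf1 <- matrix(NA_real_, nrow = length(cand), ncol = length(cand),
               dimnames = list(cand, cand))
for (i in seq_along(cand)) {
  ap[43] <- cand[i]
  for (j in seq_along(cand)) {
    ap[100] <- cand[j]
    acf1[i, j] <- acf(ap, plot = FALSE)$acf[2]
  }
}

# Row/column position of the largest lag-1 autocorrelation
# (the first one is taken if there are ties).
best <- which(acf1 == max(acf1), arr.ind = TRUE)[1, ]
imputed <- c(t43 = cand[best[["row"]]], t100 = cand[best[["col"]]])
print(imputed)
print(max(acf1))
```

With the full matrix `h` from above, the same `which()` call applies, and the positions map back to values through `g`.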