Error in approxfun, need at least two non-NA values to interpolate
Question
I'm trying to use approxfun to fill in missing values by interpolation:
column_name <- colnames(vndusd_merged);
lapply(column_name, function(x){
  if(x != "Date"){
    interpl <- approxfun(vndusd_merged$Date[!is.na(vndusd_merged$x)], vndusd_merged$x[!is.na(vndusd_merged$x)]);
    vndusd_merged$x <- interpl(vndusd_merged$Date);
  }
})
I keep getting this error:
Error in approxfun(vndusd_merged$Date[!is.na(vndusd_merged$x)], vndusd_merged$x[!is.na(vndusd_merged$x)]) :
need at least two non-NA values to interpolate
4. stop("need at least two non-NA values to interpolate")
3. approxfun(vndusd_merged$Date[!is.na(vndusd_merged$x)], vndusd_merged$x[!is.na(vndusd_merged$x)])
2. FUN(X[[i]], ...)
1. lapply(column_name, function(x) {
       if (x != "Date") {
           interpl <- approxfun(vndusd_merged$Date[!is.na(vndusd_merged$x)],
               vndusd_merged$x[!is.na(vndusd_merged$x)]) ...
Here is a sample of the first 20 rows of <code>vndusd_merged</code>. The column "Date" does not contain any NA values.
Date Ask.Close Bid.Close
1 01/01/2014 21115 21075
2 02/01/2014 21160 21060
3 03/01/2014 21115 21075
4 04/01/2014 NA NA
5 05/01/2014 NA NA
6 06/01/2014 21120 21080
7 07/01/2014 21115 21075
8 08/01/2014 21120 21080
9 09/01/2014 21115 21075
10 10/01/2014 21110 21072
11 11/01/2014 NA NA
12 12/01/2014 NA NA
13 13/01/2014 21120 21060
14 14/01/2014 21110 21072
15 15/01/2014 21110 21070
16 16/01/2014 21120 21080
17 17/01/2014 21110 21070
18 18/01/2014 NA NA
19 19/01/2014 NA NA
20 20/01/2014 21110 21070
I tried running it with the column name inserted manually, but I still got the same error:
interpl <- approxfun(vndusd_merged$Date[!is.na(vndusd_merged$Ask.Close)], vndusd_merged$Ask.Close[!is.na(vndusd_merged$Ask.Close)]);
How can I solve this problem?
Answer 1
Score: 1
You could do the same a little more concisely using approx.
ip <- sapply(vndusd_merged[-1], function(x) with(vndusd_merged, approx(Date, x, xout=Date)$y))
cbind(vndusd_merged[1], ip)
# Date Ask.Close Bid.Close
# 1 01/01/2014 21115.00 21075.00
# 2 02/01/2014 21160.00 21060.00
# 3 03/01/2014 21115.00 21075.00
# 4 04/01/2014 21116.67 21076.67
# 5 05/01/2014 21118.33 21078.33
# 6 06/01/2014 21120.00 21080.00
# 7 07/01/2014 21115.00 21075.00
# 8 08/01/2014 21120.00 21080.00
# 9 09/01/2014 21115.00 21075.00
# 10 10/01/2014 21110.00 21072.00
# 11 11/01/2014 21113.33 21068.00
# 12 12/01/2014 21116.67 21064.00
# 13 13/01/2014 21120.00 21060.00
# 14 14/01/2014 21110.00 21072.00
# 15 15/01/2014 21110.00 21070.00
# 16 16/01/2014 21120.00 21080.00
# 17 17/01/2014 21110.00 21070.00
# 18 18/01/2014 21110.00 21070.00
# 19 19/01/2014 21110.00 21070.00
# 20 20/01/2014 21110.00 21070.00
Data:
vndusd_merged <- structure(list(Date = structure(1:20, .Label = c("01/01/2014",
"02/01/2014", "03/01/2014", "04/01/2014", "05/01/2014", "06/01/2014",
"07/01/2014", "08/01/2014", "09/01/2014", "10/01/2014", "11/01/2014",
"12/01/2014", "13/01/2014", "14/01/2014", "15/01/2014", "16/01/2014",
"17/01/2014", "18/01/2014", "19/01/2014", "20/01/2014"), class = "factor"),
Ask.Close = c(21115L, 21160L, 21115L, NA, NA, 21120L, 21115L,
21120L, 21115L, 21110L, NA, NA, 21120L, 21110L, 21110L, 21120L,
21110L, NA, NA, 21110L), Bid.Close = c(21075L, 21060L, 21075L,
NA, NA, 21080L, 21075L, 21080L, 21075L, 21072L, NA, NA, 21060L,
21072L, 21070L, 21080L, 21070L, NA, NA, 21070L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"))
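As a side note on the original error: the loop in the question most likely fails because `vndusd_merged$x` does not evaluate the variable `x`; it looks for a column literally named "x", finds none, and returns NULL, so `approxfun` is handed no points at all. The sketch below illustrates this on a small stand-in data frame (`df` and its values are made up for illustration, not the asker's data) and shows that `[[` does evaluate the variable. Assigning to `vndusd_merged$x` inside the anonymous function would also only change a local copy, which is why both answers assign the result of the apply call back to the data frame.

```r
# Small stand-in data frame (hypothetical, not the asker's full data)
df <- data.frame(Date = as.Date("2014-01-01") + 0:4,
                 Ask.Close = c(21115, NA, NA, 21120, 21115))

x <- "Ask.Close"

# `$` does not evaluate x; it looks for a column literally named "x"
is.null(df$x)         # TRUE: there is no such column
length(is.na(df$x))   # 0, so approxfun() would receive no points

# `[[` evaluates x and returns the Ask.Close column
identical(df[[x]], df$Ask.Close)  # TRUE

# Interpolate over the non-missing points only
ok <- !is.na(df[[x]])
f <- approxfun(df$Date[ok], df[[x]][ok])
filled <- f(df$Date)
filled
```

With `[[x]]` the two missing values are filled linearly between 21115 and 21120.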
Answer 2
Score: 0
The following code does what the question asks for.
vndusd_merged$Date <- as.Date(vndusd_merged$Date, "%d/%m/%Y")
vndusd_merged[-1] <- lapply(vndusd_merged[-1], function(x){
  i <- !is.na(x)
  f <- approxfun(vndusd_merged$Date[i], x[i])
  y <- f(vndusd_merged$Date)
  y
})
vndusd_merged
# Date Ask.Close Bid.Close
#1 2014-01-01 21115.00 21075.00
#2 2014-01-02 21160.00 21060.00
#3 2014-01-03 21115.00 21075.00
#4 2014-01-04 21116.67 21076.67
#5 2014-01-05 21118.33 21078.33
#6 2014-01-06 21120.00 21080.00
#7 2014-01-07 21115.00 21075.00
#8 2014-01-08 21120.00 21080.00
#9 2014-01-09 21115.00 21075.00
#10 2014-01-10 21110.00 21072.00
#11 2014-01-11 21113.33 21068.00
#12 2014-01-12 21116.67 21064.00
#13 2014-01-13 21120.00 21060.00
#14 2014-01-14 21110.00 21072.00
#15 2014-01-15 21110.00 21070.00
#16 2014-01-16 21120.00 21080.00
#17 2014-01-17 21110.00 21070.00
#18 2014-01-18 21110.00 21070.00
#19 2014-01-19 21110.00 21070.00
#20 2014-01-20 21110.00 21070.00
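One caveat that may matter for the full data set: with the default `rule = 1`, `approxfun` returns NA for dates outside the range of the observed points, so a column that starts or ends with NA would keep those gaps. The sketch below uses made-up values (`dates` and `close` are illustrative, not taken from the question) to contrast the default with `rule = 2`, which carries the nearest observed value outward. Whether that kind of fill is appropriate for exchange-rate data is a judgment call.

```r
# Illustrative series with NAs at both ends (values are made up)
dates <- as.Date("2014-01-01") + 0:5
close <- c(NA, 21115, NA, 21120, NA, NA)
i <- !is.na(close)

# Default rule = 1: points outside the observed range stay NA
f1 <- approxfun(dates[i], close[i])
f1(dates)

# rule = 2: points outside the range take the nearest observed value
f2 <- approxfun(dates[i], close[i], rule = 2)
f2(dates)
```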
If you want to use a vector of column names, in this case every column other than "Date", use the code above but apply it to a different sub-data frame.
column_name <- colnames(vndusd_merged)
column_name <- column_name[column_name != "Date"]
vndusd_merged[column_name] <- lapply(vndusd_merged[column_name], function(x){
  #same code as above
})
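For completeness, here is a minimal self-contained sketch of the column-name variant with the function body filled in. It rebuilds only the first six rows of the sample data so it can be run on its own; `setdiff` is used as a shorthand for dropping "Date" from the names and is a stylistic choice, not part of the original answer.

```r
# First six rows of the sample data, with Date as character strings
vndusd_merged <- data.frame(
  Date = c("01/01/2014", "02/01/2014", "03/01/2014",
           "04/01/2014", "05/01/2014", "06/01/2014"),
  Ask.Close = c(21115, 21160, 21115, NA, NA, 21120),
  Bid.Close = c(21075, 21060, 21075, NA, NA, 21080),
  stringsAsFactors = FALSE
)

# Convert to Date so interpolation uses real day spacing (day/month/year)
vndusd_merged$Date <- as.Date(vndusd_merged$Date, "%d/%m/%Y")

# Every column except "Date"
column_name <- setdiff(colnames(vndusd_merged), "Date")

# Interpolate each selected column over its non-missing points
vndusd_merged[column_name] <- lapply(vndusd_merged[column_name], function(x) {
  i <- !is.na(x)
  f <- approxfun(vndusd_merged$Date[i], x[i])
  f(vndusd_merged$Date)
})

vndusd_merged
```

Because the result of `lapply` is assigned back to `vndusd_merged[column_name]`, the data frame itself is updated, unlike an assignment made inside the anonymous function.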