Unable to use read.zoo due to the presence of NAs

Question

I have a large dataset of irregular multivariate time series that I want to convert with read.zoo.

Some of the last rows are populated with NAs. When I run read.zoo on the data including those rows, I get the following error message: "index has bad entries at data rows: 43 44 ...".

When I check with is.na(), the NA cells return TRUE. I also tried the na.fill solution from here, but it doesn't work.

Below is an extract of the dataset, with two variables Var1 and Var2 and their respective dates date1 and date2:

date1 Var1 date2 Var2
2023-01-13 100.325 2023-01-11 99.748
2023-01-16 100.378 2023-01-12 99.832
2023-01-17 100.826 2023-01-13 99.878
2023-01-18 100.933 2023-01-16 99.762
2023-01-19 100.641 2023-01-17 99.484
2023-01-20 100.148 2023-01-18 99.743
2023-01-23 99.972 2023-01-19 99.419
2023-01-24 100.256 2023-01-20 99.364
2023-01-25 100.348 2023-01-23 99.533
2023-01-26 100.146 2023-01-24 99.711
2023-01-27 100.063 2023-01-25 99.798
2023-01-30 99.649 2023-01-26 100.481
2023-01-31 99.822 2023-01-27 100.708
2023-02-01 99.885 2023-01-30 100.57
2023-02-02 101.121 2023-01-31 100.773
2023-02-03 100.854 2023-02-01 100.999
2023-02-06 100.5 2023-02-02 102.037
2023-02-07 100.272 2023-02-03 102.104
2023-02-08 100.372 2023-02-06 101.85
2023-02-09 100.659 2023-02-07 101.765
2023-02-10 100.421 2023-02-08 101.806
2023-02-13 100.418 2023-02-09 101.905
2023-02-14 100.202 2023-02-10 101.675
2023-02-15 99.913 2023-02-13 101.491
2023-02-16 99.832 2023-02-14 101.304
2023-02-17 99.911 2023-02-15 101.242
2023-02-20 99.791 2023-02-16 101.621
2023-02-21 99.451 2023-02-17 101.581
2023-02-22 99.467 2023-02-20 101.545
2023-02-23 99.642 2023-02-21 101.334
2023-02-24 99.278 2023-02-22 101.246
2023-02-27 99.114 2023-02-23 101.857
2023-02-28 98.784 2023-02-24 101.71
2023-03-01 98.486 2023-02-27 101.759
2023-03-02 98.396 2023-02-28 101.649
2023-03-03 98.467 2023-03-01 101.583
2023-03-06 98.276 2023-03-02 101.426
2023-03-07 98.495 2023-03-03 101.666
2023-03-08 98.572 2023-03-06 101.919
2023-03-09 98.747 2023-03-07 102.048
2023-03-10 99.489 2023-03-08 101.915
NA NA 2023-03-09 101.927
NA NA 2023-03-10 101.775
NA NA NA NA
NA NA NA NA
NA NA NA NA
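
For reference, the error can probably be reproduced on a much smaller input. The sketch below is an assumption-laden illustration, not the original data: the two-column layout and the variable names Lines and msg are made up for the example. It shows that read.zoo stops as soon as the index column (the first column by default) contains an NA.

```r
library(zoo)

# Cut-down, hypothetical input: two valid rows followed by two all-NA rows
Lines <- "date1 Var1
2023-03-09 98.747
2023-03-10 99.489
NA NA
NA NA"

# read.zoo uses the first column as the index; the NA dates cannot be
# converted to valid index values, so the call is expected to stop
msg <- tryCatch(
  {
    read.zoo(text = Lines, header = TRUE, FUN = as.Date)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The point of the sketch is that the NA values in the data columns are harmless; it is the NA in the date column that read.zoo rejects.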

Answer 1

Score: 1

The solution was provided by @G. Grothendieck in another post here:

Replace as.data.frame(x) with na.omit(as.data.frame(x))
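
A minimal sketch of that replacement, assuming x holds a single series (a date column followed by a value column). The toy data frame below is hypothetical and only stands in for the asker's object; passing FUN = as.Date is also an assumption about the desired index class.

```r
library(zoo)

# Hypothetical stand-in for the asker's object `x`: one series whose
# trailing rows are entirely NA
x <- data.frame(
  date1 = c("2023-03-08", "2023-03-09", "2023-03-10", NA, NA),
  Var1  = c(98.572, 98.747, 99.489, NA, NA)
)

# Drop incomplete rows first, so read.zoo never sees an NA in the index
z <- read.zoo(na.omit(as.data.frame(x)), FUN = as.Date)
print(z)
```

Note that na.omit removes every row containing any NA. On the wide layout from the question it would therefore also discard rows where only date1 and Var1 are missing, so it is probably safer to apply it to each date/value pair separately.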

Answer 2

Score: 0

First, let me create a data frame from your data:

lines <- "date1 Var1 date2 Var2
2023-01-13 100.325 2023-01-11 99.748
2023-01-16 100.378 2023-01-12 99.832
2023-01-17 100.826 2023-01-13 99.878
2023-01-18 100.933 2023-01-16 99.762
2023-01-19 100.641 2023-01-17 99.484
2023-01-20 100.148 2023-01-18 99.743
2023-01-23 99.972 2023-01-19 99.419
2023-01-24 100.256 2023-01-20 99.364
2023-01-25 100.348 2023-01-23 99.533
2023-01-26 100.146 2023-01-24 99.711
2023-01-27 100.063 2023-01-25 99.798
2023-01-30 99.649 2023-01-26 100.481
2023-01-31 99.822 2023-01-27 100.708
2023-02-01 99.885 2023-01-30 100.57
2023-02-02 101.121 2023-01-31 100.773
2023-02-03 100.854 2023-02-01 100.999
2023-02-06 100.5 2023-02-02 102.037
2023-02-07 100.272 2023-02-03 102.104
2023-02-08 100.372 2023-02-06 101.85
2023-02-09 100.659 2023-02-07 101.765
2023-02-10 100.421 2023-02-08 101.806
2023-02-13 100.418 2023-02-09 101.905
2023-02-14 100.202 2023-02-10 101.675
2023-02-15 99.913 2023-02-13 101.491
2023-02-16 99.832 2023-02-14 101.304
2023-02-17 99.911 2023-02-15 101.242
2023-02-20 99.791 2023-02-16 101.621
2023-02-21 99.451 2023-02-17 101.581
2023-02-22 99.467 2023-02-20 101.545
2023-02-23 99.642 2023-02-21 101.334
2023-02-24 99.278 2023-02-22 101.246
2023-02-27 99.114 2023-02-23 101.857
2023-02-28 98.784 2023-02-24 101.71
2023-03-01 98.486 2023-02-27 101.759
2023-03-02 98.396 2023-02-28 101.649
2023-03-03 98.467 2023-03-01 101.583
2023-03-06 98.276 2023-03-02 101.426
2023-03-07 98.495 2023-03-03 101.666
2023-03-08 98.572 2023-03-06 101.919
2023-03-09 98.747 2023-03-07 102.048
2023-03-10 99.489 2023-03-08 101.915
NA NA 2023-03-09 101.927
NA NA 2023-03-10 101.775
NA NA NA NA
NA NA NA NA
NA NA NA NA"

library(tidyverse)  # attaches dplyr (select, %>%) and purrr (set_names)

DF <- read.table(text = lines, header = TRUE)

Then, let me convert the dates to a proper date-time class:

library(zoo)

# convert the date columns to POSIXct
DF$date1 <- as.POSIXct(DF$date1)
DF$date2 <- as.POSIXct(DF$date2)

One way is to create two separate datasets (judging by your requirement):

df1 <- DF %>% select(date1, Var1) %>% na.omit() %>% set_names(c("Date", "Var"))
df2 <- DF %>% select(date2, Var2) %>% na.omit() %>% set_names(c("Date", "Var"))

Then create separate zoo objects from these:

zoo1 <- zoo(df1$Var, order.by = df1$Date)
zoo2 <- zoo(df2$Var, order.by = df2$Date)

Or, if you want to merge these variables, you could do:

# merge the two data frames created above (an inner join on Date by default)
mergedDf <- merge(df1, df2, by = "Date")

# create the zoo object, keeping both variables (Var.x is Var1, Var.y is Var2)
zooObject <- zoo(as.matrix(mergedDf[, c("Var.x", "Var.y")]), order.by = mergedDf$Date)
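
As a possible alternative to merging the data frames, the two zoo series can be combined directly with zoo's merge method. It keeps the union of the dates and fills the gaps with NA, whereas the data-frame merge above keeps only dates present in both. The sketch below uses base R only, a small hypothetical excerpt of the data, and Date rather than POSIXct indexes. The names s1, s2, z1, z2 and both are illustrative.

```r
library(zoo)

# Hypothetical three-row excerpt in the question's wide layout
DF <- data.frame(
  date1 = c("2023-03-09", "2023-03-10", NA),
  Var1  = c(98.747, 99.489, NA),
  date2 = c("2023-03-08", "2023-03-09", "2023-03-10"),
  Var2  = c(101.915, 101.927, 101.775)
)

# Clean each date/value pair separately so a gap in one series
# does not remove valid rows from the other
s1 <- na.omit(DF[, c("date1", "Var1")])
s2 <- na.omit(DF[, c("date2", "Var2")])

z1 <- zoo(s1$Var1, order.by = as.Date(s1$date1))
z2 <- zoo(s2$Var2, order.by = as.Date(s2$date2))

# merge on zoo objects aligns on the union of the two indexes
both <- merge(z1, z2)
colnames(both) <- c("Var1", "Var2")
print(both)
```

Whether the union or the intersection of dates is wanted depends on the analysis; merge on zoo objects also accepts all = FALSE for the intersection.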

Let me know if this helps.

Answer 3

Score: 0

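In the question, NA always appears at the beginning of a line, so, using Lines from the Note at the end, define N as the comment character. Everything from the first N on a line is then treated as a comment, and those rows are skipped on input.

library(zoo)
z <- read.zoo(text = Lines, header = TRUE, comment.char = "N")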
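
To illustrate the effect on a small scale, here is a sketch on a hypothetical two-column input. The shortened Lines string and the explicit FUN = as.Date are assumptions added for the example. The comment.char argument is passed through read.zoo to read.table.

```r
library(zoo)

# Hypothetical cut-down input with two trailing all-NA rows
Lines <- "date1 Var1
2023-03-09 98.747
2023-03-10 99.489
NA NA
NA NA"

# With "N" as the comment character, each line starting with NA
# becomes empty after comment removal and is skipped
z <- read.zoo(text = Lines, header = TRUE, comment.char = "N", FUN = as.Date)
print(z)
```

The trick relies on no header name or data field containing a capital N. It also drops a whole row as soon as its first field is NA, including rows that still carry valid date2 and Var2 values.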

Note

Lines <- "date1 Var1 date2 Var2
2023-01-13 100.325 2023-01-11 99.748
2023-01-16 100.378 2023-01-12 99.832
2023-01-17 100.826 2023-01-13 99.878
2023-01-18 100.933 2023-01-16 99.762
2023-01-19 100.641 2023-01-17 99.484
2023-01-20 100.148 2023-01-18 99.743
2023-01-23 99.972 2023-01-19 99.419
2023-01-24 100.256 2023-01-20 99.364
2023-01-25 100.348 2023-01-23 99.533
2023-01-26 100.146 2023-01-24 99.711
2023-01-27 100.063 2023-01-25 99.798
2023-01-30 99.649 2023-01-26 100.481
2023-01-31 99.822 2023-01-27 100.708
2023-02-01 99.885 2023-01-30 100.57
2023-02-02 101.121 2023-01-31 100.773
2023-02-03 100.854 2023-02-01 100.999
2023-02-06 100.5 2023-02-02 102.037
2023-02-07 100.272 2023-02-03 102.104
2023-02-08 100.372 2023-02-06 101.85
2023-02-09 100.659 2023-02-07 101.765
2023-02-10 100.421 2023-02-08 101.806
2023-02-13 100.418 2023-02-09 101.905
2023-02-14 100.202 2023-02-10 101.675
2023-02-15 99.913 2023-02-13 101.491
2023-02-16 99.832 2023-02-14 101.304
2023-02-17 99.911 2023-02-15 101.242
2023-02-20 99.791 2023-02-16 101.621
2023-02-21 99.451 2023-02-17 101.581
2023-02-22 99.467 2023-02-20 101.545
2023-02-23 99.642 2023-02-21 101.334
2023-02-24 99.278 2023-02-22 101.246
2023-02-27 99.114 2023-02-23 101.857
2023-02-28 98.784 2023-02-24 101.71
2023-03-01 98.486 2023-02-27 101.759
2023-03-02 98.396 2023-02-28 101.649
2023-03-03 98.467 2023-03-01 101.583
2023-03-06 98.276 2023-03-02 101.426
2023-03-07 98.495 2023-03-03 101.666
2023-03-08 98.572 2023-03-06 101.919
2023-03-09 98.747 2023-03-07 102.048
2023-03-10 99.489 2023-03-08 101.915
NA NA 2023-03-09 101.927
NA NA 2023-03-10 101.775
NA NA NA NA
NA NA NA NA
NA NA NA NA"

huangapple
  • Posted on April 13, 2023, 18:55:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76004585.html