Unable to use read.zoo due to the presence of NAs

Question

I have a large dataset of irregular multivariate time series that I want to convert with read.zoo.

Some of the last rows are populated with NAs. When I run read.zoo including the rows with the NAs, I get the following error message: "index has bad entries at data rows: 43 44 ...".

When I check with is.na(), the NA cells return TRUE. I also tried the na.fill solution from here, but it doesn't work.

Below is an extract of the dataset, with two variables Var1 and Var2 and their respective date columns date1 and date2:

    date1 Var1 date2 Var2
    2023-01-13 100.325 2023-01-11 99.748
    2023-01-16 100.378 2023-01-12 99.832
    2023-01-17 100.826 2023-01-13 99.878
    2023-01-18 100.933 2023-01-16 99.762
    2023-01-19 100.641 2023-01-17 99.484
    2023-01-20 100.148 2023-01-18 99.743
    2023-01-23 99.972 2023-01-19 99.419
    2023-01-24 100.256 2023-01-20 99.364
    2023-01-25 100.348 2023-01-23 99.533
    2023-01-26 100.146 2023-01-24 99.711
    2023-01-27 100.063 2023-01-25 99.798
    2023-01-30 99.649 2023-01-26 100.481
    2023-01-31 99.822 2023-01-27 100.708
    2023-02-01 99.885 2023-01-30 100.57
    2023-02-02 101.121 2023-01-31 100.773
    2023-02-03 100.854 2023-02-01 100.999
    2023-02-06 100.5 2023-02-02 102.037
    2023-02-07 100.272 2023-02-03 102.104
    2023-02-08 100.372 2023-02-06 101.85
    2023-02-09 100.659 2023-02-07 101.765
    2023-02-10 100.421 2023-02-08 101.806
    2023-02-13 100.418 2023-02-09 101.905
    2023-02-14 100.202 2023-02-10 101.675
    2023-02-15 99.913 2023-02-13 101.491
    2023-02-16 99.832 2023-02-14 101.304
    2023-02-17 99.911 2023-02-15 101.242
    2023-02-20 99.791 2023-02-16 101.621
    2023-02-21 99.451 2023-02-17 101.581
    2023-02-22 99.467 2023-02-20 101.545
    2023-02-23 99.642 2023-02-21 101.334
    2023-02-24 99.278 2023-02-22 101.246
    2023-02-27 99.114 2023-02-23 101.857
    2023-02-28 98.784 2023-02-24 101.71
    2023-03-01 98.486 2023-02-27 101.759
    2023-03-02 98.396 2023-02-28 101.649
    2023-03-03 98.467 2023-03-01 101.583
    2023-03-06 98.276 2023-03-02 101.426
    2023-03-07 98.495 2023-03-03 101.666
    2023-03-08 98.572 2023-03-06 101.919
    2023-03-09 98.747 2023-03-07 102.048
    2023-03-10 99.489 2023-03-08 101.915
    NA NA 2023-03-09 101.927
    NA NA 2023-03-10 101.775
    NA NA NA NA
    NA NA NA NA
    NA NA NA NA
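
For reference, the error can be reproduced on this extract along the following lines (a sketch, since the exact call is not shown; it assumes the extract has been read into a data frame DF, e.g. with read.table(header = TRUE)):

    library(zoo)

    # Sketch of the failing call: read.zoo() takes the first column (date1) as
    # the index, and the NA entries in date1 make the index invalid.
    z <- read.zoo(DF, format = "%Y-%m-%d")
    ## gives an error along the lines of:
    ## Error: index has bad entries at data rows: 43 44 ...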

Answer 1

Score: 1

The solution was provided by @G. Grothendieck in another post here:

Replace as.data.frame(x) with na.omit(as.data.frame(x))
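
The linked post is not shown here, but a minimal sketch of how that replacement applies to this dataset, assuming the extract has been read into a data frame DF with columns date1, Var1, date2, Var2, could look as follows: each date/value pair is converted separately, with na.omit(as.data.frame(x)) dropping its NA rows before read.zoo sees them, and the two series are merged afterwards.

    library(zoo)

    # A sketch, not the original poster's exact code: DF is assumed to hold the
    # question's extract. Build one zoo series per date/value pair, removing NA
    # rows before read.zoo so the NA dates never reach the index.
    z1 <- read.zoo(na.omit(as.data.frame(DF[c("date1", "Var1")])), format = "%Y-%m-%d")
    z2 <- read.zoo(na.omit(as.data.frame(DF[c("date2", "Var2")])), format = "%Y-%m-%d")
    z  <- merge(z1, z2)  # multivariate zoo with columns z1 and z2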

Answer 2

Score: 0

First, let me create a data frame from your data:

    lines <- "date1 Var1 date2 Var2
    2023-01-13 100.325 2023-01-11 99.748
    2023-01-16 100.378 2023-01-12 99.832
    2023-01-17 100.826 2023-01-13 99.878
    2023-01-18 100.933 2023-01-16 99.762
    2023-01-19 100.641 2023-01-17 99.484
    2023-01-20 100.148 2023-01-18 99.743
    2023-01-23 99.972 2023-01-19 99.419
    2023-01-24 100.256 2023-01-20 99.364
    2023-01-25 100.348 2023-01-23 99.533
    2023-01-26 100.146 2023-01-24 99.711
    2023-01-27 100.063 2023-01-25 99.798
    2023-01-30 99.649 2023-01-26 100.481
    2023-01-31 99.822 2023-01-27 100.708
    2023-02-01 99.885 2023-01-30 100.57
    2023-02-02 101.121 2023-01-31 100.773
    2023-02-03 100.854 2023-02-01 100.999
    2023-02-06 100.5 2023-02-02 102.037
    2023-02-07 100.272 2023-02-03 102.104
    2023-02-08 100.372 2023-02-06 101.85
    2023-02-09 100.659 2023-02-07 101.765
    2023-02-10 100.421 2023-02-08 101.806
    2023-02-13 100.418 2023-02-09 101.905
    2023-02-14 100.202 2023-02-10 101.675
    2023-02-15 99.913 2023-02-13 101.491
    2023-02-16 99.832 2023-02-14 101.304
    2023-02-17 99.911 2023-02-15 101.242
    2023-02-20 99.791 2023-02-16 101.621
    2023-02-21 99.451 2023-02-17 101.581
    2023-02-22 99.467 2023-02-20 101.545
    2023-02-23 99.642 2023-02-21 101.334
    2023-02-24 99.278 2023-02-22 101.246
    2023-02-27 99.114 2023-02-23 101.857
    2023-02-28 98.784 2023-02-24 101.71
    2023-03-01 98.486 2023-02-27 101.759
    2023-03-02 98.396 2023-02-28 101.649
    2023-03-03 98.467 2023-03-01 101.583
    2023-03-06 98.276 2023-03-02 101.426
    2023-03-07 98.495 2023-03-03 101.666
    2023-03-08 98.572 2023-03-06 101.919
    2023-03-09 98.747 2023-03-07 102.048
    2023-03-10 99.489 2023-03-08 101.915
    NA NA 2023-03-09 101.927
    NA NA 2023-03-10 101.775
    NA NA NA NA
    NA NA NA NA
    NA NA NA NA"
    library(tidyverse)
    library(dplyr)
    DF <- read.table(text = lines, header = TRUE)

Then, let me convert the date columns to a proper date-time class:

    library(zoo)
    # convert the date columns to POSIXct
    DF$date1 <- as.POSIXct(DF$date1)
    DF$date2 <- as.POSIXct(DF$date2)

One way, given your requirement, is to create two separate datasets:

    df1 <- DF %>% select(date1, Var1) %>% na.omit() %>% set_names(c("Date", "Var"))
    df2 <- DF %>% select(date2, Var2) %>% na.omit() %>% set_names(c("Date", "Var"))

Then create separate zoo objects from these:

    zoo1 <- zoo(df1$Var, order.by = df1$Date)
    zoo2 <- zoo(df2$Var, order.by = df2$Date)

Or if you want to merge these variables, you could do:

    # merge both data frames created above (inner join on Date)
    mergedDf <- merge(df1, df2, by = "Date")
    # create a multivariate zoo object from both value columns
    zooObject <- zoo(mergedDf[, c("Var.x", "Var.y")], order.by = mergedDf$Date)
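
Note that merge(df1, df2, by = "Date") keeps only the dates that appear in both series. If you want the union of dates instead, one option (a sketch reusing the zoo objects built above) is to merge at the zoo level:

    # Outer-join alternative: dates present in only one series get NA
    # in the other column.
    zooBoth <- merge(zoo1, zoo2, all = TRUE)
    colnames(zooBoth) <- c("Var1", "Var2")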

Let me know if this helps.

Answer 3

Score: 0

In the question, the rows containing NA always start with NA, so, using Lines from the Note at the end, define N as the comment character; those rows are then treated as comments and skipped.

    library(zoo)
    z <- read.zoo(text = Lines, header = TRUE, comment.char = "N")

Note

    Lines <- "date1 Var1 date2 Var2
    2023-01-13 100.325 2023-01-11 99.748
    2023-01-16 100.378 2023-01-12 99.832
    2023-01-17 100.826 2023-01-13 99.878
    2023-01-18 100.933 2023-01-16 99.762
    2023-01-19 100.641 2023-01-17 99.484
    2023-01-20 100.148 2023-01-18 99.743
    2023-01-23 99.972 2023-01-19 99.419
    2023-01-24 100.256 2023-01-20 99.364
    2023-01-25 100.348 2023-01-23 99.533
    2023-01-26 100.146 2023-01-24 99.711
    2023-01-27 100.063 2023-01-25 99.798
    2023-01-30 99.649 2023-01-26 100.481
    2023-01-31 99.822 2023-01-27 100.708
    2023-02-01 99.885 2023-01-30 100.57
    2023-02-02 101.121 2023-01-31 100.773
    2023-02-03 100.854 2023-02-01 100.999
    2023-02-06 100.5 2023-02-02 102.037
    2023-02-07 100.272 2023-02-03 102.104
    2023-02-08 100.372 2023-02-06 101.85
    2023-02-09 100.659 2023-02-07 101.765
    2023-02-10 100.421 2023-02-08 101.806
    2023-02-13 100.418 2023-02-09 101.905
    2023-02-14 100.202 2023-02-10 101.675
    2023-02-15 99.913 2023-02-13 101.491
    2023-02-16 99.832 2023-02-14 101.304
    2023-02-17 99.911 2023-02-15 101.242
    2023-02-20 99.791 2023-02-16 101.621
    2023-02-21 99.451 2023-02-17 101.581
    2023-02-22 99.467 2023-02-20 101.545
    2023-02-23 99.642 2023-02-21 101.334
    2023-02-24 99.278 2023-02-22 101.246
    2023-02-27 99.114 2023-02-23 101.857
    2023-02-28 98.784 2023-02-24 101.71
    2023-03-01 98.486 2023-02-27 101.759
    2023-03-02 98.396 2023-02-28 101.649
    2023-03-03 98.467 2023-03-01 101.583
    2023-03-06 98.276 2023-03-02 101.426
    2023-03-07 98.495 2023-03-03 101.666
    2023-03-08 98.572 2023-03-06 101.919
    2023-03-09 98.747 2023-03-07 102.048
    2023-03-10 99.489 2023-03-08 101.915
    NA NA 2023-03-09 101.927
    NA NA 2023-03-10 101.775
    NA NA NA NA
    NA NA NA NA
    NA NA NA NA"
