Matching R time series class between input and output

Question


I have an R function that analyzes regular time series. It was originally written assuming the data were just a vector, say of length N. It outputs a list of vectors that are either of length N or of length N+1, the latter adding a t=0 result. Here is a conceptual example:

tsf <- function(ts){
  extend <- c(0.5, ts)     # A time step before the first in ts
  twotimes <- 2 * ts
  return(list(extend = extend, twotimes = twotimes))
}

The function is intended for univariate geophysical series data that have a (possibly gappy) regular time index based on datetimes and a time step that is some number of minutes. I'm more fluent in Pandas/Python, so to me the playing field for time series classes in R seems wide, possibly including zoo and xts as well as the plain ts class or even a data frame with a POSIXct column.

My question is: what is the simplest (small code, low dependency) way to address the matching problem so that extend and twotimes are of the same class as ts? For instance, given this data:

tseq <- seq(from = as.POSIXct('2005-01-01 00:00', tz = ''), length.out = 5, by = "15 min")
x <- rnorm(5)
xdata <- xts(data=x,order.by=tseq)
zdata <- zooreg(data=x,order.by=tseq)

Could the following calls (which are not really functional as written):

tsf(x)
tsf(xdata)
tsf(zdata)

each be made to produce output of the same class as its argument? Is the R way to create tsf.zoo, tsf.xts, etc. and have these focus on the incoming and outgoing coercion and the indexing? Will these type-specific functions cause issues if the user elects not to install one of the libraries, such as xts?

Answer 1

Score: 1

R supports object-oriented programming in which a generic function acts on different classes through methods defined for each class. For * there are already methods defined for the numeric, zooreg and xts classes. For extend, we can define a single default method that covers numeric and zooreg (and the ts class) and define separate methods for xts and data.frame.
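As a minimal base-R sketch of that dispatch idea, the following uses a hypothetical generic named prepend (it is not part of the answer's code) with a default method for plain vectors and a method for the built-in ts class. It assumes a regular ts series and simply rebuilds it so that it starts one time step earlier.

```r
# Hypothetical generic (not from the answer): put `value` one step before x.
prepend <- function(x, value, ...) UseMethod("prepend")

# Plain vectors carry no time index, so just concatenate.
prepend.default <- function(x, value, ...) c(value, x)

# For "ts" objects, rebuild the series so that it starts one step earlier.
prepend.ts <- function(x, value, ...) {
  p <- tsp(x)  # c(start, end, frequency)
  ts(c(value, as.numeric(x)), start = p[1] - 1 / p[3], frequency = p[3])
}

v <- c(1, 2, 3, 4)
y <- ts(v, start = 2005, frequency = 4)  # quarterly series starting in 2005

class(prepend(v, 0.5))  # numeric in, numeric out
class(prepend(y, 0.5))  # ts in, ts out
class(2 * y)            # arithmetic on a ts object already keeps the class
```

The caller writes the same call for every input type, and R picks the method from the class of the first argument.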

We have corrected the errors in the question's definitions of xdata and zdata: data is not the name of the first argument, zooreg does not take an order.by argument, and set.seed is needed to make tests that use random numbers reproducible. See the Note at the end and the relevant help files.

library(xts) # also pulls in zoo

tsf <- function(x) list(extend = extend(x), twotimes = 2 * x)

extend <- function(x, ...) UseMethod("extend")
extend.default <- function(x, ...) replace(c(head(stats::lag(x), 1), x), 1, 0.5)
extend.xts <- function(x, ...) as.xts(extend(as.zooreg(x)))
extend.data.frame <- function(x, ...) {
  setNames(fortify.zoo(extend(as.zooreg(read.zoo(x)))), names(x))
}

# test
tsf(x)
tsf(xdata)
tsf(zdata)
tsf(ts(x))
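On the question of a user who has not installed xts: in a plain script, defining a function called extend.xts does not by itself need the package, because R only looks for that method when an object of class xts is actually passed in. The sketch below illustrates this under some assumptions that are not in the answer: the method body is a simplified stand-in that assumes at least two observations and a constant time step, and the requireNamespace guard and error message are illustrative choices.

```r
extend <- function(x, ...) UseMethod("extend")

# Base-R fallback: no packages needed for plain vectors.
extend.default <- function(x, ...) c(0.5, x)

# Only reached when x has class "xts"; nothing here runs at definition time.
extend.xts <- function(x, ...) {
  if (!requireNamespace("xts", quietly = TRUE)) {
    stop("package 'xts' is needed to handle xts input")
  }
  idx <- zoo::index(x)
  step <- idx[2] - idx[1]  # assumption: regular spacing, length >= 2
  xts::xts(c(0.5, as.numeric(x)), order.by = c(idx[1] - step, idx))
}

# Works whether or not xts is installed, since dispatch goes to the default.
extend(c(1, 2))
```

Calling xts functions through the xts:: prefix, rather than attaching the package at the top of the script, keeps the numeric path free of that dependency.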

Another approach is to convert the input to zooreg, operate on it and convert back. In that case use this single method for numeric, zoo and xts and a separate method for data.frame.

extend <- function(x, ...) UseMethod("extend")
extend.default <- function(x, ...) {
  z <- as.zooreg(x)
  extend <- c(head(stats::lag(z), 1), x)
  extend[1] <- 0.5
  do.call(paste("as", data.class(x), sep = "."), list(extend))
}
extend.data.frame <- function(x, ...) {
  setNames(fortify.zoo(extend(as.zooreg(read.zoo(x)))), names(x))
}
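The convert-back step relies on building the name of a coercion function from the input's class and calling it with do.call. A base-R sketch of just that mechanism follows, using a hypothetical helper called round_trip that is not part of the answer. Unlike the answer's version, it passes through bare numeric values, so the class is restored but the original time index is not; that is presumably why the answer does its work on zooreg objects, which carry the index along.

```r
# Hypothetical helper (not from the answer): apply f to the bare values of x,
# then coerce the result back with the as.<class> function matching x.
round_trip <- function(x, f) {
  coercer <- paste("as", data.class(x), sep = ".")  # "as.numeric", "as.ts", ...
  do.call(coercer, list(f(as.numeric(x))))
}

double_it <- function(v) 2 * v

a <- round_trip(c(1, 2, 3), double_it)
b <- round_trip(ts(c(1, 2, 3), start = 2005), double_it)

class(a)  # same class as the numeric input
class(b)  # same class as the ts input
tsp(b)    # but the index restarts at 1 instead of 2005
```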

Note

library(xts)
tseq <- seq(from = as.POSIXct('2005-01-01 00:00:00', tz = ''), length.out = 5,
  by = "15 min")
set.seed(123)
x <- rnorm(5)
xdata <- xts(x, tseq)
zdata <- as.zooreg(zoo(x, tseq))

huangapple
  • Posted on June 19, 2023, 02:21:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76501984.html