February 14, 2023, 06:28:23

Applying user defined function to rolling window with zoo

Question

I am working on a research project and want to recreate a market efficiency measure I read about in an article. Since I am working with a large data set, I decided to automate the process in R. First, I defined a function that returns the standardized beta coefficients used in the measure, shown here with a reproducible example:

beta_hats = function(j) {
  step1 = ar(j, aic = TRUE)$asy.var.coef
  step2 = ar(j, aic = TRUE)$ar
  step3 = chol(step1)
  step4 = t(step3)
  step5 = solve(step4)
  step6 = step5 %*% step2
  step7 = abs(step6)
  step8 = sum(step7)
    return(step8)
}
repro = data.frame(rnorm(3000, 0.0003563425, 0.0216025))
beta_hats(repro)
[1] 1.587869

This generates the desired outcome for the entire data set. However, I want my measure to be time-varying, so I attempted to apply the function over rolling windows.

y = repro
t = 250
library(zoo)
z = rollapplyr(y, t, function(y) beta_hats(y))
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
'data' must be of a vector type, was 'NULL'

At this point the function no longer works. Can anyone help me solve this issue?

Additional information:

Not adding the data.frame() specification to the reproducible example produces the same error already when the function is called on the entire data set.
Since the reproducible example is completely random, the function might produce a much higher value if you use real market returns to reproduce the error.
class() of the data set returns "tbl_df", "tbl", "data.frame".

Answer 1

Score: 1

This error is produced when you attempt a Cholesky decomposition of a NULL object:

chol(NULL)
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),  : 
  'data' must be of a vector type, was 'NULL'

This shows that the problem lies in your data rather than in the rollapply function. The asymptotic-theory variance matrix of the coefficient estimates (asy.var.coef) appears to be NULL; note that ar() only provides it when the selected order is greater than 0. Try regenerating the data and calling your function on it again.

For example:

set.seed(1)
repro = data.frame(a=rnorm(3000, 0.0003563425, 0.0216025))
ar(repro$a, aic = TRUE)$order
[1] 0

Since the order is 0, the asymptotic-theory variance computed in step1 will be NULL for this data set:

ar(repro$a, aic = TRUE)$asy.var.coef
NULL

Hence step3 of your function throws the error you are seeing. You need to run your function on a data set for which ar() selects an order greater than 0.
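
One possible way to guard against this is sketched below, assuming it is acceptable for the measure to be missing (NA) whenever ar() selects order 0. The name beta_hats_safe is hypothetical, and the function is assumed to receive a plain numeric vector rather than a data frame:

```r
# Hypothetical guarded variant of beta_hats(); x is assumed to be a numeric vector.
beta_hats_safe <- function(x) {
  fit <- ar(x, aic = TRUE)              # fit once and reuse the result
  if (fit$order == 0) return(NA_real_)  # no AR coefficients, so asy.var.coef is NULL
  L <- t(chol(fit$asy.var.coef))        # lower-triangular factor: L %*% t(L) = asy.var.coef
  sum(abs(solve(L, fit$ar)))            # same quantity as step5 %*% step2, then abs and sum
}

set.seed(1)
x <- rnorm(3000, 0.0003563425, 0.0216025)

# Rolling application with zoo, skipped if the package is not installed.
if (requireNamespace("zoo", quietly = TRUE)) {
  z <- zoo::rollapplyr(x, width = 250, FUN = beta_hats_safe)
  print(summary(z))  # an NA's entry, if present, counts windows where the order was 0
}
```

Whether NA is the right value for an order-0 window depends on how the measure is defined in the article; returning 0 instead would be another option.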

Also note that even if the function does not throw an error on the full data set, it may still throw one on a subset, such as a rolling window, for the reason stated above.
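
To see how often this happens on subsets, one could count the rolling windows for which ar() selects order 0. The following is a rough sketch using base R only; the window width of 250 is taken from the question, and the variable names are illustrative:

```r
set.seed(1)
x <- rnorm(3000, 0.0003563425, 0.0216025)
width <- 250

# Start index of every complete window of the given width
starts <- seq_len(length(x) - width + 1)

# AIC-selected AR order for each window
orders <- vapply(
  starts,
  function(i) as.numeric(ar(x[i:(i + width - 1)], aic = TRUE)$order),
  numeric(1)
)

# TRUE counts the windows in which chol() would receive NULL
print(table(order_is_zero = orders == 0))
```

Any window counted under TRUE would trigger the error in the original beta_hats().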
