Add results of Box.test within a loop as a new column of a data frame

Question

I have a data set with many time series. I would like to check each series for stationarity using the Box test. My loop runs the test fine, but how can I export the results (X-squared and p-value) as new columns to an existing data frame in which each time series is a row?

Here is my example data frame:

    a <- c(0.2569, 0.0145896, 0.0369, 0.025986, 0.12569, 0.3695)
    b <- c(0.125, 0.04582, 0.2569, 0.256369, 0.25698, 0.1456)
    c <- c(0.2584, 0.05698, 0.1258, 0.2569, 0.098563, 0.1569)
    df <- data.frame(a, b, c)

Here is the loop. It works fine and gives me X-squared and the p-value for every time series:

    for(i in 1:ncol(df)) {
      box <- Box.test(df[ , i] <- df[ , i], type = "Ljung-Box")
      print(box)
    }

Now the results should be transferred to the empty columns of a data frame like this:

    d <- c("series1", "series2", "series3")
    e <- c("green", "black", "red")
    f <- c(18, 24, 12)
    p_value <- NA # to create an empty column
    x <- NA
    df2 <- data.frame(d, e, f, p_value, x)

My first idea was this:

    df2$p_value <- box$p.value

But this gives me the same p-value in every row.

I think I have to do it with a new loop, but I don't know how to implement "df2[i , ] <- df2[i ,]" in it:

    for(i in 1:nrow(df2)) {
      df2$p_value <- box$p.value(df2[i , ] <- df2[i ,])
    }

This doesn't work. Can somebody help me? Maybe with another function?

Answer 1

Score: 1

You can use sapply over the columns of df and extract the p-value from each test result.

    # \(x) is shorthand for function(x) (R >= 4.1)
    bres <- sapply(df, \(x) Box.test(x, type = "Ljung-Box")[['p.value']])
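
The question also asks for the X-squared statistic, which the line above does not return. A minimal sketch of one way to collect both values per series is shown here; it swaps sapply for vapply, and the names box_stats, x_squared and p_value are only illustrative choices, not part of the original answer.

```r
# Example data from the question
df <- data.frame(
  a = c(0.2569, 0.0145896, 0.0369, 0.025986, 0.12569, 0.3695),
  b = c(0.125, 0.04582, 0.2569, 0.256369, 0.25698, 0.1456),
  c = c(0.2584, 0.05698, 0.1258, 0.2569, 0.098563, 0.1569)
)

# Run the Ljung-Box test once per column and keep both numbers.
# vapply() insists that every result is a numeric vector of length 2.
box_stats <- vapply(
  df,
  function(x) {
    bt <- Box.test(x, type = "Ljung-Box")
    c(unname(bt$statistic), bt$p.value)
  },
  FUN.VALUE = c(x_squared = 0, p_value = 0)
)

# vapply() returns a 2 x 3 matrix (one column per series);
# transpose it so that each series becomes a row.
box_stats <- as.data.frame(t(box_stats))
box_stats
```

The result has one row per column of df (row names a, b, c), so it can be matched to the rows of df2 in the same way as bres.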

I strongly recommend explicitly defining a dictionary a that maps the column names of df to the series labels in df2, so that results cannot end up in the wrong row.

    a <- setNames(c("series1", "series2", "series3"), c('a', 'b', 'c'))
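
To see why the dictionary helps, here is a small sketch of how match works with it when the labels are not in the same order as the columns of df. The vectors res and labels are hypothetical and the numbers in res are placeholders, not real test output.

```r
# Dictionary from the answer: names are the columns of df,
# values are the labels used in df2$d
a <- setNames(c("series1", "series2", "series3"), c("a", "b", "c"))

# Hypothetical per-column results, named like the columns of df
# (placeholder numbers for illustration only)
res <- c(a = 0.1, b = 0.2, c = 0.3)

# Hypothetical labels in a different order than the columns of df
labels <- c("series3", "series1", "series2")

# match() returns, for each label, its position in the dictionary
idx <- match(labels, a)
idx
# [1] 3 1 2

# Indexing with those positions reorders the results to fit the labels
res[idx]
#   c   a   b
# 0.3 0.1 0.2
```

Because the lookup goes through the labels rather than through row positions, the approach keeps working if the rows of df2 are sorted differently from the columns of df.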

Then use cbind to add the matched p-values as a new column.

    cbind(df2, p_value=bres[match(df2$d, a)])
    #         d     e  f   p_value
    # a series1 green 18 0.8206314
    # b series2 black 24 0.6379121
    # c series3   red 12 0.1574567
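
For comparison, the loop from the question can also be made to work by assigning to one element per iteration instead of to the whole column. This is only a sketch, and it rests on the assumption that row i of df2 describes column i of df; if that ordering is not guaranteed, the dictionary approach above is safer.

```r
# Example data from the question
df <- data.frame(
  a = c(0.2569, 0.0145896, 0.0369, 0.025986, 0.12569, 0.3695),
  b = c(0.125, 0.04582, 0.2569, 0.256369, 0.25698, 0.1456),
  c = c(0.2584, 0.05698, 0.1258, 0.2569, 0.098563, 0.1569)
)
df2 <- data.frame(
  d = c("series1", "series2", "series3"),
  e = c("green", "black", "red"),
  f = c(18, 24, 12),
  p_value = NA_real_,  # empty numeric columns to be filled in the loop
  x = NA_real_
)

# Assumption: row i of df2 belongs to column i of df
for (i in seq_len(ncol(df))) {
  box <- Box.test(df[, i], type = "Ljung-Box")
  df2$p_value[i] <- box$p.value            # fill only row i
  df2$x[i] <- unname(box$statistic)        # X-squared statistic
}
df2
```

The original attempt assigned box$p.value to df2$p_value as a whole, which recycles the single value of the last test into every row; indexing with [i] on the left-hand side avoids that.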

Data:

    df <- structure(list(a = c(0.2569, 0.0145896, 0.0369, 0.025986, 0.12569,
    0.3695), b = c(0.125, 0.04582, 0.2569, 0.256369, 0.25698, 0.1456
    ), c = c(0.2584, 0.05698, 0.1258, 0.2569, 0.098563, 0.1569)), row.names = c(NA,
    -6L), class = "data.frame")
    df2 <- data.frame(d=c("series1", "series2", "series3"),
                      e=c("green", "black", "red"),
                      f=c(18, 24, 12))

huangapple
  • Posted on March 7, 2023, 21:58:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75662939.html