Calculate linear regression slope with moving window

Question

I have a data frame and need to automate a slope calculation with lm over a moving window: fit the first 5 rows, move down 1 row, fit the next 5 rows, and so on. Where fewer than 5 rows remain at the end of the data frame, the slope should be NA or not calculated.

    library(dplyr)
    # slope of mpg on disp for rows 1 to 5
    mtcars[1:5, c(1, 3)] %>%
      mutate(slope = lm(mpg ~ disp)$coefficients[[2]])

I then need the slope for the next 5 rows, shifted down by 1 row:

    # slope of mpg on disp for rows 2 to 6
    mtcars[2:6, c(1, 3)] %>%
      mutate(slope = lm(mpg ~ disp)$coefficients[[2]])

Answer 1

Score: 0

Use map_dbl to iterate over the starting row number of each window:

    library(dplyr)
    library(purrr)

    # toy dataset (y is random, so the slopes differ between runs)
    data <- tibble(
      x = 1:50,
      y = rnorm(50))

    data %>%
      # for each window start, fit lm on rows .x to .x + 4 and keep the
      # coefficient of the independent variable (i.e. the coefficient of x);
      # the last 4 rows cannot start a full 5-row window, so they get NA
      mutate(model = c(
        map_dbl(seq_len(nrow(data) - 4),
                ~ coef(lm(y ~ x, data = data[.x:(.x + 4), ]))[["x"]]),
        rep(NA_real_, 4)))

    # A tibble: 50 × 3
           x       y   model
       <int>   <dbl>   <dbl>
     1     1 -2.52    0.775
     2     2 -0.455   0.370
     3     3 -0.914   0.333
     4     4 -1.14    0.221
     5     5  1.70   -0.214
     6     6  0.0893  0.0863
     7     7  0.138  -0.305
     8     8  0.748  -0.329
     9     9  0.298  -0.239
    10    10  0.441   0.274
    # ℹ 40 more rows
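As a rough sketch, the same windowing idea can be applied in base R to the mtcars columns from the question. The helper name rolling_slope and its arguments are hypothetical, introduced here only for illustration. It assumes each window starts at the current row and that rows without a full window should get NA.

```r
# Hypothetical helper (not part of the answer above): slope of y on x over a
# window of `width` rows starting at each row; NA where fewer than `width`
# rows remain.
rolling_slope <- function(x, y, width = 5) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n < width) return(out)
  for (i in seq_len(n - width + 1)) {
    idx <- i:(i + width - 1)
    # second coefficient of the fit is the slope
    out[i] <- coef(lm(y[idx] ~ x[idx]))[[2]]
  }
  out
}

slopes <- rolling_slope(mtcars$disp, mtcars$mpg)
result <- cbind(mtcars[, c("mpg", "disp")], slope = slopes)

# the last 4 rows have no full 5-row window, so their slope is NA
tail(result, 6)
```

A plain loop keeps the window bounds explicit, which may be easier to adapt if the window width or the step between windows changes.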

Answer 2

Score: 0

rollapply computes an arbitrary function over a moving window of the indicated size (5). We define slope as shown; the commented-out definition is equivalent but slower.

The code below uses a center-aligned window, so there are 2 NA rows at each end. rollapplyr (with an r on the end) specifies a right-aligned window, in which case all 4 NA rows are at the beginning. The align = "left" argument of rollapply specifies left alignment, which puts the NA rows at the end, if that is wanted.

    library(dplyr)
    library(zoo)

    # x is a two-column matrix: column 1 is the predictor (disp),
    # column 2 is the response (mpg)
    # slope <- function(x) coef(lm(x[, 2] ~ x[, 1]))[[2]]
    slope <- function(x) cov(x[, 1], x[, 2]) / var(x[, 1])

    mtcars %>%
      mutate(slope = rollapply(cbind(disp, mpg), 5, slope, by.column = FALSE, fill = NA))
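Since the question asks for the NA values at the end of the data, a left-aligned variant may be closer to what is wanted. The sketch below assumes that reading. It also compares the cov/var shortcut against the lm-based definition: for a simple regression the two should agree, because the 1/(n - 1) factors in cov and var cancel. The names slope_lm, xy, left and left_lm are introduced here for illustration only.

```r
library(zoo)

# x is a two-column matrix: column 1 is the predictor, column 2 the response
slope    <- function(x) cov(x[, 1], x[, 2]) / var(x[, 1])
slope_lm <- function(x) coef(lm(x[, 2] ~ x[, 1]))[[2]]

# predictor (disp) first, response (mpg) second
xy <- with(mtcars, cbind(disp, mpg))

# align = "left": each window starts at the current row, so the rows
# without a full 5-row window are the last 4
left    <- rollapply(xy, 5, slope,    by.column = FALSE, fill = NA, align = "left")
left_lm <- rollapply(xy, 5, slope_lm, by.column = FALSE, fill = NA, align = "left")

all.equal(left, left_lm)  # compares the two definitions up to numerical tolerance
tail(left, 4)             # trailing values of the left-aligned result
```

With this alignment, the first value corresponds to rows 1 to 5 and the second to rows 2 to 6, matching the two fits written out in the question.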

huangapple
  • Posted on July 3, 2023 at 16:58:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76603287.html