Calculate linear regression slope with a moving window

Question

I have a data frame and need to automate a rolling slope calculation with lm: fit the slope on 5 rows, shift the window forward by 1 row, fit the slope on the next 5 rows, and so on. At the end of the data frame, where fewer than 5 rows remain, the slope should be NA or not calculated at all.

For rows 1 to 5:

mtcars[1:5, c(1, 3)] %>%
  mutate(slope = lm(mpg ~ disp)$coefficients[[2]])

Then I need the slope for the next 5 rows, with the window shifted by 1:

mtcars[2:6, c(1, 3)] %>%
  mutate(slope = lm(mpg ~ disp)$coefficients[[2]])

Answer 1

Score: 0

Use map_dbl to iterate over the starting row of each window:

library(dplyr)
library(purrr)

# toy dataset
data <- tibble(
  x = 1:50,
  y = rnorm(50))

data %>%
  # for each starting row i, fit lm on rows i..(i + 4) and keep the coefficient of x;
  # the last 4 rows have no complete 5-row window, so they get NA
  mutate(model = c(
    map_dbl(1:(nrow(data) - 4), ~ coef(lm(y ~ x, data = data[.x:(.x + 4), ]))[[2]]),
    rep(NA_real_, 4)))

Output (the values differ between runs because y is random):

# A tibble: 50 × 3
       x       y   model
   <int>   <dbl>   <dbl>
 1     1 -2.52    0.775 
 2     2 -0.455   0.370 
 3     3 -0.914   0.333 
 4     4 -1.14    0.221 
 5     5  1.70   -0.214 
 6     6  0.0893  0.0863
 7     7  0.138  -0.305 
 8     8  0.748  -0.329 
 9     9  0.298  -0.239 
10    10  0.441   0.274 
# ℹ 40 more rows
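
The same idea can be applied directly to the mpg and disp columns from the question. The sketch below is only one possible way to package it: rolling_slope() is a hypothetical helper (not from any package), its width argument is an assumption added for illustration, and it swaps purrr's map for base R's vapply so that it runs without extra packages.

```r
# rolling_slope() is a hypothetical helper, not part of any package.
# It returns one slope per row: the slope of the window that starts at
# that row, or NA when fewer than `width` rows remain.
rolling_slope <- function(df, formula, width = 5) {
  n_complete <- nrow(df) - width + 1            # number of full windows
  if (n_complete < 1) return(rep(NA_real_, nrow(df)))
  slopes <- vapply(seq_len(n_complete), function(i) {
    window <- df[i:(i + width - 1), ]           # rows i..(i + width - 1)
    coef(lm(formula, data = window))[[2]]       # coefficient of the predictor
  }, numeric(1))
  c(slopes, rep(NA_real_, width - 1))           # pad the incomplete windows
}

cars <- mtcars[, c("mpg", "disp")]
cars$slope <- rolling_slope(cars, mpg ~ disp)
head(cars)
```

With the 32 rows of mtcars, rows 1 to 28 get a slope and rows 29 to 32 get NA, which is the behaviour the question describes.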

Answer 2

Score: 0

rollapply computes an arbitrary function over a moving window of the indicated size (5). We define slope as shown, or optionally use the commented-out definition, which is equivalent but slower.

The code below gives a center-aligned window, so there are 2 NA rows at each end. rollapplyr, with an r on the end, specifies a right-aligned window, in which case the 4 NA rows are all at the beginning. The align="left" argument of rollapply specifies left alignment if that is wanted, which puts the NA rows at the end as the question asks.

library(dplyr)
library(zoo)

# x is a 5-row matrix with disp in column 1 and mpg in column 2
# slope <- function(x) coef(lm(x[, 2] ~ x[, 1]))[[2]]
slope <- function(x) cov(x[, 1], x[, 2]) / var(x[, 1])

mtcars %>%
  mutate(slope = rollapply(cbind(disp, mpg), 5, slope, by.column=FALSE, fill=NA))
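
As a rough illustration of the left-aligned variant, the sketch below uses align = "left" so that the value at row i summarises rows i to i + 4, and then cross-checks the first window against lm. The names slope_cov, xy, left_slopes and lm_first are introduced here for illustration only.

```r
library(zoo)

# Same covariance-based slope as above: column 1 is the predictor (disp),
# column 2 is the response (mpg).
slope_cov <- function(x) cov(x[, 1], x[, 2]) / var(x[, 1])

xy <- cbind(disp = mtcars$disp, mpg = mtcars$mpg)

# align = "left": the NA padding falls on the last 4 rows,
# where no complete 5-row window exists.
left_slopes <- as.numeric(
  rollapply(xy, width = 5, FUN = slope_cov,
            by.column = FALSE, fill = NA, align = "left")
)

# Cross-check the first window against the lm() slope for rows 1 to 5.
lm_first <- coef(lm(mpg ~ disp, data = mtcars[1:5, ]))[[2]]
all.equal(left_slopes[1], lm_first)
```

The covariance form agrees with lm because the least-squares slope of a simple regression is cov(x, y) / var(x); it avoids fitting a full model for every window.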

huangapple
  • Posted on July 3, 2023, 16:58:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76603287.html