How to name a term created in the formula when calling `lm()`?
Question
Is it possible to name a term created in a formula? This is the scenario:
Create a toy dataset:
set.seed(67253)
n <- 100
x <- sample(c("A", "B", "C"), size = n, replace = TRUE)
y <- sapply(x, switch, A = 0, B = 2, C = 1) + rnorm(n, 2)
dat <- data.frame(x, y)
head(dat)
#> x y
#> 1 B 4.5014474
#> 2 C 4.0252796
#> 3 C 2.4958761
#> 4 C 0.6725571
#> 5 B 4.3364206
#> 6 C 3.9798909
Fit a regression model:
out <- lm(y ~ x, dat)
summary(out)
#>
#> Call:
#> lm(formula = y ~ x, data = dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.07296 -0.52161 -0.03713 0.53898 2.12497
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 2.1138 0.1726 12.244 < 2e-16 ***
#> xB 1.6772 0.2306 7.274 9.04e-11 ***
#> xC 0.5413 0.2350 2.303 0.0234 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.9297 on 97 degrees of freedom
#> Multiple R-squared: 0.3703, Adjusted R-squared: 0.3573
#> F-statistic: 28.52 on 2 and 97 DF, p-value: 1.808e-10
Fit the model again, but use `"C"` as the reference group:
out2 <- lm(y ~ relevel(factor(x), ref = "C"), dat)
summary(out2)
#>
#> Call:
#> lm(formula = y ~ relevel(factor(x), ref = "C"), data = dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.07296 -0.52161 -0.03713 0.53898 2.12497
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 2.6551 0.1594 16.653 < 2e-16 ***
#> relevel(factor(x), ref = "C")A -0.5413 0.2350 -2.303 0.0234 *
#> relevel(factor(x), ref = "C")B 1.1359 0.2209 5.143 1.41e-06 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.9297 on 97 degrees of freedom
#> Multiple R-squared: 0.3703, Adjusted R-squared: 0.3573
#> F-statistic: 28.52 on 2 and 97 DF, p-value: 1.808e-10
The variable `x` was re-leveled in the second call to `lm()`. Because this is done in the formula, the name of the term is `relevel(factor(x), ref = "C")`.
Of course, we can create the term before calling `lm()`, e.g.:
dat$x2 <- relevel(factor(x), ref = "C")
out3 <- lm(y ~ x2, dat)
summary(out3)
#>
#> Call:
#> lm(formula = y ~ x2, data = dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.07296 -0.52161 -0.03713 0.53898 2.12497
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 2.6551 0.1594 16.653 < 2e-16 ***
#> x2A -0.5413 0.2350 -2.303 0.0234 *
#> x2B 1.1359 0.2209 5.143 1.41e-06 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.9297 on 97 degrees of freedom
#> Multiple R-squared: 0.3703, Adjusted R-squared: 0.3573
#> F-statistic: 28.52 on 2 and 97 DF, p-value: 1.808e-10
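To make the difference between the two fits concrete, here is a minimal sketch that compares only the coefficient labels. The object names `fit_inline` and `fit_named` are introduced here for illustration and are not part of the code above.

```r
# Recreate the toy data from above
set.seed(67253)
n <- 100
x <- sample(c("A", "B", "C"), size = n, replace = TRUE)
y <- sapply(x, switch, A = 0, B = 2, C = 1) + rnorm(n, 2)
dat <- data.frame(x, y)

# Term built inside the formula: the label is the deparsed expression
fit_inline <- lm(y ~ relevel(factor(x), ref = "C"), dat)

# Term built beforehand: the label is the column name
dat$x2 <- relevel(factor(dat$x), ref = "C")
fit_named <- lm(y ~ x2, dat)

names(coef(fit_inline))
names(coef(fit_named))  # "(Intercept)" "x2A" "x2B"

# The estimates agree; only the labels differ
all.equal(unname(coef(fit_inline)), unname(coef(fit_named)))  # TRUE
```

The two models are the same fit, so the question is purely about how the term gets its label.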
However, can I create a term and name it in the formula? If yes, how?
Answer 1
Score: 1
Adapted from the information in this comment: https://stackoverflow.com/questions/26870664/rename-model-terms-in-lm-object-for-forecasting#comment42302348_26870664
set.seed(67253)
n <- 100
x <- sample(c("A", "B", "C"), size = n, replace = TRUE)
y <- sapply(x, switch, A = 0, B = 2, C = 1) + rnorm(n, 2)
dat <- data.frame(x, y)
# Default fit: "A" is the reference level
out <- lm(y ~ x, dat)
summary(out)
# Create and name the re-leveled term in the data argument via transform(),
# so the formula can refer to it simply as x2
out2 <- lm(y ~ x2, data = transform(dat, x2 = relevel(factor(x), ref = "C")))
summary(out2)
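As a quick check of what this approach does, the sketch below confirms that the term is labeled `x2` in the fitted model while `dat` itself is left unchanged. The object name `fit` is an assumption made for this illustration.

```r
# Recreate the toy data
set.seed(67253)
n <- 100
x <- sample(c("A", "B", "C"), size = n, replace = TRUE)
y <- sapply(x, switch, A = 0, B = 2, C = 1) + rnorm(n, 2)
dat <- data.frame(x, y)

# transform() returns a modified copy of dat, so the new column x2
# exists only in the data passed to lm()
fit <- lm(y ~ x2, data = transform(dat, x2 = relevel(factor(x), ref = "C")))

names(coef(fit))       # "(Intercept)" "x2A" "x2B"
"x2" %in% names(dat)   # FALSE: the original data frame has no x2 column
```

Note that the naming happens in the `data` argument rather than in the formula itself. Because the model's term is `x2`, any `newdata` later passed to `predict()` would also need a column named `x2`.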