“variable lengths differ” error while running regressions in a loop








I am trying to run a regression loop based on code from a previous answer (https://stackoverflow.com/q/27952653/21208453), but I keep getting an error. My outcomes (dependent variables) are 940 metabolites, and my exposures (independent variables) are "bmi", "Age", "sex", "lpa2c", and "smoking". BMI and Age are continuous. BMI is the main exposure; the others are covariates I am controlling for.
So I'm testing the effect of BMI on 940 metabolites.
I would also like to know how to extract the coefficient, p-value, standard error, and confidence interval for BMI only, and only when it is significant.

This is the code I have used:

y <- c(1653:2592) # response
x1 <- c("bmi", "Age", "sex", "lpa2c", "smoking") # predictor

for (i in x1) {
  model <- lm(paste("y ~", i[[1]]), data = QBB_clean)
}

And this is the error:

> Error in model.frame.default(formula = paste("y ~", i[[1]]), data = QBB_clean, :
variable lengths differ (found for 'bmi').

              y1         y2          y3          y4 bmi age sex       lpa2c smoking
1   0.2875775201 0.59998896 0.238726027 0.784575267  24  18   1 0.470681834       1
2   0.7883051354 0.33282354 0.962358936 0.009429905  12  20   0 0.365845473       1
3   0.4089769218 0.48861303 0.601365726 0.779065883  18  15   0 0.121272054       0
4   0.8830174040 0.95447383 0.515029727 0.729390652  16  21   0 0.046993681       0
5   0.9404672843 0.48290240 0.402573342 0.630131853  18  28   1 0.262796304       1
6   0.0455564994 0.89035022 0.880246541 0.480910830  13  13   0 0.968641168       1
7   0.5281054880 0.91443819 0.364091865 0.156636851  11  12   0 0.488495482       1
8   0.8924190444 0.60873498 0.288239281 0.008215520  21  23   0 0.477822030       0
9   0.5514350145 0.41068978 0.170645235 0.452458394  18  17   1 0.748792881       0
10  0.4566147353 0.14709469 0.172171746 0.492293329  20  15   1 0.667640231       1


Score: 1



If you want to loop over responses you will want something like this:

respvars <- names(QBB_clean[1653:2592])
predvars <- c("bmi", "Age", "sex", "lpa2c", "smoking")
results <- list()
for (v in respvars) {
  form <- reformulate(predvars, response = v)
  results[[v]] <- lm(form, data = QBB_clean)
}

You can then print the results with something like lapply(results, summary), extract coefficients, etc. (I have a little trouble seeing how it's going to be useful to just print the results of 940 regressions ... are you really going to inspect them all?)
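
For context on the original error: `y <- c(1653:2592)` creates a plain integer vector of 940 column positions, not the columns themselves. Since `QBB_clean` has no column named `y`, `lm()` picks up that vector from the calling environment, and its length (940) does not match the number of rows in the data, hence "variable lengths differ". Below is a minimal sketch of this, assuming a made-up data frame `toy` that stands in for `QBB_clean` (its values and column names are illustrative only):

```r
# Toy stand-in for QBB_clean (made-up values; column names are assumptions)
set.seed(1)
toy <- data.frame(
  y1  = rnorm(20),
  y2  = rnorm(20),
  bmi = rnorm(20, mean = 25, sd = 4),
  Age = sample(18:60, 20, replace = TRUE)
)

# As in the question: `y` holds column positions, not data
y <- 1:2   # length 2, while toy has 20 rows

# `toy` has no column `y`, so the vector above is used as the response
err <- tryCatch(
  lm(as.formula(paste("y ~", "bmi")), data = toy),
  error = function(e) conditionMessage(e)
)
print(err)

# Building the formula from column *names* avoids the problem
respvars <- names(toy)[1:2]
form <- reformulate(c("bmi", "Age"), response = respvars[1])
print(form)
fit <- lm(form, data = toy)
```

The key difference is `names(QBB_clean[1653:2592])`, which yields column names that `reformulate()` can turn into a proper formula, one response at a time.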

If you want coefficients etc. for BMI, I think this should work (not tested):

t(sapply(results, function(m) coef(summary(m))["bmi", ]))

Or for confidence intervals:

t(sapply(results, function(m) confint(m)["bmi", ]))
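
The two calls above can be combined into one table with a row per outcome, which also makes it easy to keep only the significant BMI effects. This is a rough sketch rather than a tested recipe for the real data: the data frame `toy`, the outcome names `m1` and `m2`, the helper `bmi_row`, the 0.05 cutoff, and the Benjamini-Hochberg adjustment are all assumptions added for illustration.

```r
# Toy stand-in for QBB_clean (made-up data; all names are assumptions)
set.seed(42)
n <- 50
toy <- data.frame(bmi = rnorm(n, 25, 4), Age = rnorm(n, 40, 10))
toy$m1 <- 0.5 * toy$bmi + rnorm(n)   # outcome built to depend on bmi
toy$m2 <- rnorm(n)                   # outcome generated independently of bmi

respvars <- c("m1", "m2")
predvars <- c("bmi", "Age")

# Same loop as in the answer: one model per outcome
results <- list()
for (v in respvars) {
  results[[v]] <- lm(reformulate(predvars, response = v), data = toy)
}

# Hypothetical helper: estimate, SE, t, p-value and 95% CI for bmi
bmi_row <- function(m) {
  c(coef(summary(m))["bmi", ], confint(m)["bmi", ])
}
bmi_tab <- as.data.frame(t(sapply(results, bmi_row)))
names(bmi_tab) <- c("estimate", "std_error", "t_value", "p_value",
                    "ci_lower", "ci_upper")

# Adjust for testing many outcomes, then keep the significant rows
bmi_tab$p_adj <- p.adjust(bmi_tab$p_value, method = "BH")
significant <- bmi_tab[bmi_tab$p_adj < 0.05, ]
print(significant)
```

With 940 outcomes, some raw p-values will fall below 0.05 by chance alone, which is why the sketch filters on an adjusted p-value; filtering on `p_value` instead is a one-word change if that is what the analysis calls for.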

  • Published on February 14, 2023 at 21:20:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75448449.html



