May 15, 2023, 14:41:00

Running a post-hoc test on a random effect model with two dummy variables in R

Question

I have a data frame in R - my code is as follows:

library(lme4)
library(lmerTest)
library(multcomp)
#Create DF
df <- data.frame(
  col1 = rep(1:3, each = 3),
  col2 = rep(c("A", "B", "C"), times = 3),
  col3 = rnorm(9)
)
df$A <- ifelse(df$col2 == "A", 1, 0)
df$B <- ifelse(df$col2 == "B", 1, 0)
#Make "col1" into factor, as it's the random effect
df$col1 <- as.factor(df$col1)

I've run this model on it:

model <- lmer(col3 ~ A + B + (1 | col1),
              data = df)
summary(model)

Which gives me this output:

Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: col3 ~ A + B + (1 | col1)
   Data: df
REML criterion at convergence: 23.3
Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.3966 -0.3950  0.2740  0.4242  1.1226 
Random effects:
 Groups   Name        Variance Std.Dev.
 col1     (Intercept) 0.000    0.000   
 Residual             1.636    1.279   
Number of obs: 9, groups:  col1, 3
Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)
(Intercept)   1.0363     0.7386  6.0000   1.403    0.210
A            -1.3912     1.0445  6.0000  -1.332    0.231
B            -1.2642     1.0445  6.0000  -1.210    0.272
Correlation of Fixed Effects:
  (Intr) A     
A -0.707       
B -0.707  0.500
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

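Now I'd like to run a post-hoc test. I tried using the code given here: https://stats.stackexchange.com/questions/237512/how-to-perform-post-hoc-test-on-lmer-model/237513#237513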

It looks like this:

summary(glht(model, linfct = mcp(col1 = "Tukey")), test = adjusted("holm"))

The error I get is this:

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'object' in selecting a method for function 'summary': Variable(s) ‘col1’ have been specified in ‘linfct’ but cannot be found in ‘model’!

Would love help.

TIA!

*In response to a comment by Roland - I'd like to compare between A, B, C - while taking the random effect into consideration.

The mixed effect model compares A vs. C and B vs. C, but I also want an A vs. B comparison while the random effect is still taken into consideration.

I tried running a permutation test with random effect but I can't seem to find code that works for that, so I'm left with post-hoc. Hope this clears it up!

Answer 1

Score: 0

I don't think glht can do post-hoc tests on a random effect.
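
The error in the question appears consistent with this: as far as I can tell, mcp() only looks among the model's fixed-effect terms, and col1 enters the model only through (1 | col1). Here is a minimal sketch that reuses the question's data frame, checks this, and then tests A vs. B directly through a symbolic contrast on the two dummy coefficients. The set.seed() call and the names fixed_terms and contrast_ab are my additions, not part of the original code.

```r
library(lme4)
library(lmerTest)
library(multcomp)

set.seed(123)  # assumption: added only so the sketch is reproducible
df <- data.frame(
  col1 = factor(rep(1:3, each = 3)),
  col2 = rep(c("A", "B", "C"), times = 3),
  col3 = rnorm(9)
)
df$A <- ifelse(df$col2 == "A", 1, 0)
df$B <- ifelse(df$col2 == "B", 1, 0)

model <- lmer(col3 ~ A + B + (1 | col1), data = df)

# Fixed-effect terms of the fit: col1 is not among them
fixed_terms <- attr(terms(model), "term.labels")
print(fixed_terms)
print("col1" %in% fixed_terms)

# A vs. B as a linear contrast of the fixed-effect coefficients
contrast_ab <- glht(model, linfct = "A - B = 0")
summary(contrast_ab)
```

The contrast estimate is simply the difference between the A and B coefficients, with a standard error taken from the mixed model's covariance matrix, so the random intercept is still part of the fit.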

I'm also under the impression that this is not an oversight by the developers. Statistically, random effects are used to control for a categorical variable that we know has an important effect but whose categories we could not all measure. For example, imagine that we are measuring the effect of a drug on an animal's insulin production, and we took multiple samples from the same animal. Naturally, samples from the same animal are likely to be more similar, so "animal identity" is an important factor to consider. We want to make inferences about the entire population of animals, but we cannot sample every individual to account for its effect properly. So we use random effects to try to glean the variance between individuals and use that as a proxy for the different animals' identities.
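
To make the animal example concrete, here is a small simulated sketch. Everything in it is hypothetical: the number of animals, the effect sizes, and the names sim, animal, drug and insulin are invented for illustration. The point is only that the model returns one variance for the animal grouping rather than a separate coefficient per animal.

```r
library(lme4)

set.seed(42)  # assumption: arbitrary seed for reproducibility
n_animals <- 20
n_samples <- 5

animal <- factor(rep(seq_len(n_animals), each = n_samples))
drug   <- rep(c(0, 1), length.out = n_animals * n_samples)

# Each animal gets its own baseline shift, drawn from a common distribution
animal_effect <- rnorm(n_animals, sd = 2)
insulin <- 10 + 1.5 * drug + animal_effect[as.integer(animal)] +
  rnorm(n_animals * n_samples, sd = 1)

sim <- data.frame(animal, drug, insulin)

fit <- lmer(insulin ~ drug + (1 | animal), data = sim)

# Between-animal variance and residual variance
vc <- as.data.frame(VarCorr(fit))
print(vc)

# The drug effect is the fixed effect we actually want to interpret
print(fixef(fit))
```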

With that reasoning, you can see that post-hoc on random effects makes little statistical sense. If we are doing a post-hoc, it implies that we care about the difference between the specific treatments we measure. They are not mere samples of a bigger set of treatments that we aim to statistically control and remove from our inference. If we care about the difference between treatments A and B then they are, by design, fixed factors.
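
Following that reasoning for the question's setup, one option is to let col2 enter the model as a single fixed factor instead of hand-made dummies, keep col1 as the random intercept, and ask glht for all pairwise comparisons of col2. This is a sketch under those assumptions; set.seed() and the names model_f and posthoc are mine.

```r
library(lme4)
library(lmerTest)
library(multcomp)

set.seed(1)  # assumption: added only so the sketch is reproducible
df <- data.frame(
  col1 = factor(rep(1:3, each = 3)),
  col2 = factor(rep(c("A", "B", "C"), times = 3)),
  col3 = rnorm(9)
)

# col2 is one fixed factor; col1 stays as the random intercept
model_f <- lmer(col3 ~ col2 + (1 | col1), data = df)

# All pairwise comparisons among the levels of the fixed factor col2
posthoc <- glht(model_f, linfct = mcp(col2 = "Tukey"))
summary(posthoc, test = adjusted("holm"))
```

With three levels this yields the comparisons B - A, C - A and C - B, each estimated with the random intercept for col1 still in the model. With only nine observations a singular-fit message, like the one in the question's output, would not be surprising.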

DISCLAIMER: this interpretation is, of course, not the only one. People often use AIC to test whether a random effect is worth including, especially in personality studies. So random effects can be more than nuisance covariates in some cases. But, AFAIK, even then it is assumed that the differences between the specific sampled levels are not of interest in themselves and simply represent variation in the population.
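
As a rough illustration of that AIC use, the sketch below compares a model with and without a random intercept on simulated data. The data and the names d, g, x, y, fit_fixed and fit_mixed are hypothetical. Fitting the mixed model with REML = FALSE is my assumption, made so that both likelihoods are maximum-likelihood fits and the AIC values are comparable.

```r
library(lme4)

set.seed(7)  # assumption: arbitrary seed for reproducibility
n_groups  <- 20
per_group <- 5

g <- factor(rep(seq_len(n_groups), each = per_group))
x <- rnorm(n_groups * per_group)
group_shift <- rnorm(n_groups, sd = 2)
y <- 1 + 0.5 * x + group_shift[as.integer(g)] +
  rnorm(n_groups * per_group, sd = 1)

d <- data.frame(g, x, y)

# Same fixed part, with and without the random intercept
fit_fixed <- lm(y ~ x, data = d)
fit_mixed <- lmer(y ~ x + (1 | g), data = d, REML = FALSE)

# A lower AIC suggests the grouping structure is worth keeping
aic_tab <- AIC(fit_fixed, fit_mixed)
print(aic_tab)
```

Note that this comparison says something about the variance among groups as a whole, not about which particular groups differ from each other.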
