How to get exact extremely low P-value in R (on log scale or otherwise)
Question
I am running a linear regression model in R, and the association is highly significant: the P-value is far below the 2.2e-16 threshold that R prints by default.
The t-statistic from the regression is -44.85 and the degrees of freedom are 33689.
Is there a way to get the exact P-value, or at least -log10(P)? I tried a few things. First, the Rmpfr package does not have good support for Student's t distribution.
Second, there is this post: https://stackoverflow.com/questions/11328784/decimal-points-probability-value-of-0-in-language-r, but the top-voted answer there is not quite right. Take the example given in that post:
d <- data.frame(x=rep(1:5,each=10))
set.seed(101)
d$y <- rnorm(50,mean=d$x,sd=0.0001)
lm1 <- lm(y~x,data=d)
coef(summary(lm1)) shows that P = 9.690173e-203.
The answer says we can get log10(P), which is what I want, from:
tval <- coef(summary(lm1))["x","t value"]
2*pt(abs(tval),df=48,lower.tail=FALSE,log.p=TRUE)/log(10)
This gives -404.6294, which is not log10(P). Computing -log10(9.690173e-203) gives 202.0137, which equals neither 404.6294 nor 404.6294/2 = 202.3147.
Is there a workaround for this? Greatly appreciate your help. Thanks!
Answer 1
Score: 1
You could use:
log(2) + pt(abs(coef(summary(lm1))["x","t value"]), 48, lower.tail = FALSE, log.p = TRUE)
[1] -465.1537
This matches the natural log of the p-value reported by summary():
log(coef(summary(lm1))[2,4])
[1] -465.1537
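To get -log10(P) directly, as the question asks, the same idea can be wrapped in a small helper. This is only a minimal sketch: neg_log10_p is a made-up name (not a base R function), and it assumes a two-sided test. It also shows why the 2*pt(...) attempt in the question is off: multiplying a log-probability by 2 squares the probability, whereas doubling P itself means adding log(2) on the log scale.

```r
# Hypothetical helper (not part of base R): -log10 of a two-sided
# t-test p-value, computed entirely on the log scale to avoid underflow.
neg_log10_p <- function(tval, df) {
  # Natural log of the one-sided upper-tail probability.
  # abs() makes this work for negative t statistics as well.
  log_p_one <- pt(abs(tval), df = df, lower.tail = FALSE, log.p = TRUE)
  # Two-sided p = 2 * p_one, so log(p) = log(2) + log(p_one).
  log_p_two <- log(2) + log_p_one
  # Convert from natural log to log10 and flip the sign.
  -log_p_two / log(10)
}

# Values taken from the question: t = -44.85, df = 33689
neg_log10_p(-44.85, 33689)
```

Applied to the lm1 example, neg_log10_p(coef(summary(lm1))["x","t value"], 48) should reproduce the 202.0137 quoted in the question, since -465.1537 / log(10) is about -202.0137.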