How to get an exact, extremely low P-value in R (on the log scale or otherwise)

Question

I am running a linear regression model in R, and the association is highly significant: the P-value is far below the 2.2e-16 limit that R's printed output reports.

The t-statistic from the regression is -44.85, with 33689 degrees of freedom.

Is there a way to get the exact P-value, or at least -log10(P)? I tried a few different things. First, the Rmpfr package does not have good support for Student's t distribution.

There is also this post: https://stackoverflow.com/questions/11328784/decimal-points-probability-value-of-0-in-language-r, but the top-voted answer there is not quite right. Take the example given in that post:

d <- data.frame(x=rep(1:5,each=10))
set.seed(101)
d$y <- rnorm(50,mean=d$x,sd=0.0001)
lm1 <- lm(y~x,data=d)

From coef(summary(lm1)) we can see that P = 9.690173e-203.

The answer says we can get log10(P) (which is what I want) from:

tval <- coef(summary(lm1))["x","t value"]
2*pt(abs(tval),df=48,lower.tail=FALSE,log.p=TRUE)/log(10)

This gives -404.6294, which is not log10(P). Computing -log10(9.690173e-203) gives 202.0137, which equals neither 404.6294 nor 404.6294/2 = 202.3147.

Is there a workaround for this? Greatly appreciate your help. Thanks!

Answer 1

Score: 1

You could use:

log(2) + pt(abs(coef(summary(lm1))["x","t value"]), 48, lower.tail = FALSE, log.p = TRUE)
[1] -465.1537

This matches the natural log of the p-value reported by summary(lm1):

log(coef(summary(lm1))[2,4])
[1] -465.1537
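
The approach works because the two-sided p-value is 2 * P(T > |t|), so its log is log(2) plus the log of the tail probability; multiplying the log-probability by 2, as in the code quoted in the question, squares the tail probability instead of doubling it. As a rough sketch, the same idea can be wrapped in a small helper and applied to the numbers from the question. The function name neg_log10_p is made up for illustration and is not part of any package; the division by log(10) converts from natural log to base 10.

```r
# Hypothetical helper (name is an assumption, not from the original answer):
# -log10 of the two-sided p-value for a t statistic, computed on the log scale
# so that it does not underflow to zero.
neg_log10_p <- function(tval, df) {
  # log(2 * P(T > |t|)) = log(2) + log P(T > |t|)
  log_p <- log(2) + pt(abs(tval), df = df, lower.tail = FALSE, log.p = TRUE)
  -log_p / log(10)  # natural log -> base 10, sign flipped
}

# Values quoted in the question
question_value <- neg_log10_p(-44.85, 33689)
question_value

# Sanity check against the example model, whose p-value is still representable
d <- data.frame(x = rep(1:5, each = 10))
set.seed(101)
d$y <- rnorm(50, mean = d$x, sd = 0.0001)
lm1 <- lm(y ~ x, data = d)
cf <- coef(summary(lm1))
check <- neg_log10_p(cf["x", "t value"], df.residual(lm1))
all.equal(check, -log10(cf["x", "Pr(>|t|)"]))
```

Using abs() matters for the question's case, since the t-statistic there is negative and the upper tail of a negative value is close to 1.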

huangapple
  • Posted on 2023-04-01 01:07:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75901059.html