May 25, 2023 01:57:48

Why does a small rounding error for np in pbinom() generate a relatively large error in the calculated p?

Question

I have a function to calculate probabilities for a binomial distribution. The user can enter a sample probability. Let's say they had 25 successes out of 35 trials. They calculate a probability of 0.7143 and enter that into the function.

Since I am using pbinom() I'll multiply that probability by the sample size to get the quantile. Look at the differences when I reproduce the math here, first with a calculated quantile and second with the actual count.

> 2*pbinom(.7143*35,35,.5,lower.tail = FALSE)+dbinom(.7143*35,35,.5)
[1] 0.005988121
Warning message:
In dbinom(0.7143 * 35, 35, 0.5) : non-integer x = 25.000500
> 2*pbinom(25,35,.5,lower.tail = FALSE)+dbinom(25,35,.5)
[1] 0.01133098

A difference of that amount in the tail could lead to a different conclusion.

Here is an example away from the tails with 18 out of 35:

> 2*pbinom(.5143*35,35,.5,lower.tail = FALSE)+dbinom(.5143*35,35,.5)
[1] 0.7358788
Warning message:
In dbinom(0.5143 * 35, 35, 0.5) : non-integer x = 18.000500
> 2*pbinom(18,35,.5,lower.tail = FALSE)+dbinom(18,35,.5)
[1] 0.8679394

That is a difference of about 0.13 (13 percentage points) in the p-value.

I understand why the warning message appears and I can easily fix it by rounding, but why does a tiny error of 0.0005 in the quantile affect the calculated p-value so much?

Answer 1

Score: 3

There is a comment in the documentation that directly addresses this:

If an element of x is not integer, the result of dbinom is zero, with a warning.

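So it's not numerical error per se: if you give dbinom a non-integer, it always returns 0, with a warning. pbinom is a step function, so it gives the same value for 25.0005 as for 25, and the entire difference in your results is the dropped dbinom term. So yes, the solution is to always check your inputs to ensure they are integers, for example by rounding the product back to a count.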
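
To see where the gap comes from, it may help to split the expression from the question into its two terms. The sketch below uses the numbers from the question; `count_from_proportion` and its `tol` argument are hypothetical names introduced here for illustration, not part of base R or of the asker's function.

```r
n <- 35
p_hat <- 0.7143            # rounded sample proportion entered by the user
x_raw <- p_hat * n         # 25.0005, not an integer
x <- round(x_raw)          # 25, the actual count

# pbinom() is a step function, so the upper tail is the same for both inputs.
upper_raw <- pbinom(x_raw, n, 0.5, lower.tail = FALSE)
upper_int <- pbinom(x, n, 0.5, lower.tail = FALSE)
all.equal(upper_raw, upper_int)

# dbinom() returns 0 (with a warning) for the non-integer input.
point_raw <- suppressWarnings(dbinom(x_raw, n, 0.5))
point_int <- dbinom(x, n, 0.5)

# The two results from the question differ only by the dbinom() term.
2 * upper_raw + point_raw  # 0.005988121 in the question
2 * upper_int + point_int  # 0.01133098 in the question

# Hypothetical helper: recover the integer count from a rounded proportion,
# warning when the product is not close to a whole number.
# The default tolerance is an assumption; tune it to how p_hat was rounded.
count_from_proportion <- function(p_hat, n, tol = 0.01) {
  x <- round(p_hat * n)
  if (abs(x - p_hat * n) > tol) {
    warning("p_hat * n is not close to an integer; check the inputs")
  }
  x
}

count_from_proportion(0.5143, 35)
```

Rounding once at the input boundary keeps the pbinom and dbinom calls consistent with each other. If the function can be changed, accepting the count directly instead of the proportion would avoid the round trip altogether.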
