Question

I'm running a chi-squared goodness-of-fit test. There are four vegetation types (A–D), each occupying a given percentage of the total study area, and the number of specimens was counted in each type. The question is whether the distribution of this plant species is proportional to the areas of the vegetation types. I ran the test in R and with an online calculator, but the results are very different, and only the online calculator returns the correct values (I know the answer).

A <- c(45, 4, 10, 59) # number of specimens in each vegetation type, 118 observations in total
B <- c(24, 17, 5, 54) # area of each vegetation type as % of the total study area
C <- c(28.32, 20.06, 5.9, 63.72) # expected counts (area % * 118)

chisq.test(A, C)

The output

Pearson's Chi-squared test

data:  A and C
X-squared = 12, df = 9, p-value = 0.2133

Next, I reran the test with an online calculator (https://www.statology.org/chi-square-goodness-of-fit-test-calculator/) using my observed (A) and expected (C) data, and the result is:

X2 Test Statistic: 25.880627
p-value: 0.000010

This is the correct answer. What am I doing wrong that makes the two tests give such different results?

Answer 1

Score: 1

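chisq.test() does not read its arguments the way you might expect: its second positional argument is y, so chisq.test(A, C) treats A and C as two categorical variables rather than as observed and expected counts. For a goodness-of-fit test, pass the observed counts as x and the expected probabilities as p, and set rescale.p = TRUE so that p is rescaled to sum to 1. Check the "expected" component of the result to confirm that the calculation makes sense.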

A <- c(45, 4, 10, 59) # number of specimens in each vegetation type, 118 observations in total
B <- c(24, 17, 5, 54) # area of each vegetation type as % of the total study area
C <- c(28.32, 20.06, 5.9, 63.72) # expected counts (area % * 118)

chi <- chisq.test(A, p = C, rescale.p = TRUE)
print(chi)
# 	Chi-squared test for given probabilities
# 
# data:  A
# X-squared = 25.881, df = 3, p-value = 1.01e-05
chi$expected
# [1] 28.32 20.06  5.90 63.72
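
As a cross-check, the statistic reported above can be reproduced by hand from the Pearson formula, the sum over categories of (O - E)^2 / E. The following is only a sketch: the names observed, area_pct, expected, statistic and fit are introduced here for illustration and do not appear in the question. It also assumes the area percentages sum to 100, so that area_pct / 100 can be passed as p without rescale.p.

```r
# Observed counts and area percentages, copied from the question
observed <- c(45, 4, 10, 59)
area_pct <- c(24, 17, 5, 54)

# Expected counts if specimens were distributed in proportion to area
expected <- area_pct / 100 * sum(observed)

# Pearson statistic: sum over categories of (O - E)^2 / E
statistic <- sum((observed - expected)^2 / expected)

# Goodness-of-fit degrees of freedom: number of categories minus one
df <- length(observed) - 1

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

# Same test via chisq.test(), passing proportions that already sum to 1
fit <- chisq.test(observed, p = area_pct / 100)

print(c(statistic = statistic, df = df, p_value = p_value))
print(fit)
```

The hand-computed statistic and the one stored in fit$statistic should agree, and both should match the value returned by the online calculator.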

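Calling chisq.test(A, C) cross-tabulates the two vectors into a 4 × 4 contingency table and runs a test of independence on it, which is not what you want.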

chi_wrong <- chisq.test(A, C)
chi_wrong$expected
#     C
# A     5.9 20.06 28.32 63.72
#   4  0.25  0.25  0.25  0.25
#   10 0.25  0.25  0.25  0.25
#   45 0.25  0.25  0.25  0.25
#   59 0.25  0.25  0.25  0.25
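
To see where X-squared = 12 and df = 9 come from, it may help to build the cross-tabulation explicitly. This is a sketch: tab, res_tab and res_vec are names introduced here for illustration. Warnings are suppressed because every expected cell count is 0.25, far below the usual threshold for the chi-squared approximation.

```r
A <- c(45, 4, 10, 59)
C <- c(28.32, 20.06, 5.9, 63.72)

# Given two vectors, chisq.test() treats each one as a factor and
# cross-tabulates them, which is what table() does here
tab <- table(A, C)
print(tab)  # 4 x 4 table with a single 1 in each row and column

# Testing the table directly should match chisq.test(A, C)
res_tab <- suppressWarnings(chisq.test(tab))
res_vec <- suppressWarnings(chisq.test(A, C))

print(res_tab$statistic)   # Pearson statistic for the 4 x 4 table
print(res_tab$parameter)   # degrees of freedom: (4 - 1) * (4 - 1)
```

Each of the four distinct values in A pairs with exactly one distinct value in C, so the table holds four 1s and twelve 0s. The result describes that table and says nothing about how well the counts fit the area proportions.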
