February 6, 2023, 20:20:02

Correlation Matrix Between Variables in R

Question

I have been trying to compute the correlations between variables in panel data. My data look like this (the real data have more dates, and some PM10 values are NA):

data <- structure(list(NetC = c("Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Reti Private", "Reti Private", "Reti Private", 
"Reti Private", "Reti Private", "Reti Private"), ID = c("IT1938A", 
"IT1938A", "IT1938A", "IT2086A", "IT2086A", "IT2086A", "IT2110A", 
"IT2110A", "IT2110A", "IT1766A", "IT1766A", "IT1766A", "IT2090A", 
"IT2090A", "IT2090A"), Stat = c("Citta dei Ragazzi", "Citta dei Ragazzi", 
"Citta dei Ragazzi", "Rende", "Rende", "Rende", "Acri", "Acri", 
"Acri", "Firmo", "Firmo", "Firmo", "Schiavonea", "Schiavonea", 
"Schiavonea"), Data = c("1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022", 
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022", 
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022"), 
    PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4, 
    15.7, 10.82, 12.29, 9.54, 24.54, 22.88, 27.33)), class = "data.frame", row.names = c(NA, 
-15L))

I have tried using plm::cortab, but it doesn't calculate the correlation.

library(plm)
cortab(data$PM10, grouping = Stat, groupnames = c("Citta dei Ragazzi", "Rende", 
                                                  "Acri", "Firmo", "Schiavonea"))

The output should look like:

	Citta dei Ragazzi	Rende	Acri
Citta dei Ragazzi	1
Rende	x	1
Acri	x	x	1

Answer 1

Score: 0

This has largely been asked before (https://stackoverflow.com/questions/62473889/how-can-i-complete-a-correlation-in-r-of-one-variable-across-its-factor-levels), but for convenience I have adapted that answer to your data:

library(dplyr)  # %>%, select(), bind_rows()
library(tidyr)  # pivot_wider()
library(broom)  # tidy()

# simple correlation matrix:
data.wider <- data %>%
  select(-ID, -NetC) %>% # remove unnecessary vars
  pivot_wider(names_from = 'Stat', values_from = 'PM10') # one column per station
# drop the Data column; use = 'p' is an abbreviation of this option
cor(data.wider[,-1], use = 'pairwise.complete.obs')
# more lines required to set up correlation testing:
pw <- combn(unique(data$Stat),2) # make pairwise sets
pw
pairwise_c <- apply(pw,2,function(i){
  tidy(cor.test(data.wider[[i[1]]],data.wider[[i[2]]]))
})
results <- cbind(data.frame(t(pw)),bind_rows(pairwise_c))
results
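
If you would rather avoid the extra packages, the same matrix can probably be built in base R. The following is only a sketch under a few assumptions: it uses a three-station subset of the values from the question, it assumes there is at most one PM10 reading per station and date (so `mean` inside `tapply` merely picks that single value), and the names `wide` and `cm` are made up for illustration. It also blanks the upper triangle so the printout resembles the layout requested in the question.

```r
# Toy panel in the same long layout as the question (subset of its values).
data <- data.frame(
  Stat = rep(c("Citta dei Ragazzi", "Rende", "Acri"), each = 3),
  Data = rep(c("1/1/2022", "1/2/2022", "1/3/2022"), times = 3),
  PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4, 15.7)
)

# Keep stations in order of first appearance rather than alphabetical order.
stat <- factor(data$Stat, levels = unique(data$Stat))

# Date-by-station matrix; station/date combinations with no row become NA.
wide <- tapply(data$PM10, list(data$Data, stat), mean)

# Pearson correlations, using for each pair all dates where both are observed.
cm <- cor(wide, use = "pairwise.complete.obs")

# Hide the redundant upper triangle and print the lower-triangular matrix.
cm[upper.tri(cm)] <- NA
print(round(cm, 3), na.print = "")
```

With real data containing NA values, each pairwise correlation is computed on a different subset of dates, so it may be worth checking how many complete pairs each station combination actually has before interpreting the result.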
