Correlation Matrix Between Variables in R

Question

I have been trying to determine the correlation between variables in panel data. My data looks like this (the full data set has more dates, and some PM10 values are NA):

structure(list(NetC = c("Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Reti Private", "Reti Private", "Reti Private", 
"Reti Private", "Reti Private", "Reti Private"), ID = c("IT1938A", 
"IT1938A", "IT1938A", "IT2086A", "IT2086A", "IT2086A", "IT2110A", 
"IT2110A", "IT2110A", "IT1766A", "IT1766A", "IT1766A", "IT2090A", 
"IT2090A", "IT2090A"), Stat = c("Citta dei Ragazzi", "Citta dei Ragazzi", 
"Citta dei Ragazzi", "Rende", "Rende", "Rende", "Acri", "Acri", 
"Acri", "Firmo", "Firmo", "Firmo", "Schiavonea", "Schiavonea", 
"Schiavonea"), Data = c("1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022", 
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022", 
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022"), 
    PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4, 
    15.7, 10.82, 12.29, 9.54, 24.54, 22.88, 27.33)), class = "data.frame", row.names = c(NA, 
-15L))

I have tried using plm::cortab, but it doesn't calculate the correlation.

library(plm)
cortab(data$PM10, grouping = Stat, groupnames = c("Citta dei Ragazzi", "Rende", 
                                                  "Acri", "Firmo", "Schiavonea"))

The output should look like:

                  Citta dei Ragazzi Rende Acri
Citta dei Ragazzi                 1
Rende                             x     1
Acri                              x     x    1

Answer 1

Score: 0

This has pretty much already been asked (https://stackoverflow.com/questions/62473889/how-can-i-complete-a-correlation-in-r-of-one-variable-across-its-factor-levels), but for convenience I have adapted that answer to your data:

# assumes the data frame from the question is stored in `data`
library(dplyr)  # select(), bind_rows(), %>%
library(tidyr)  # pivot_wider()
library(broom)  # tidy()

# simple correlation matrix:
data.wider <- data %>%
  select(-ID, -NetC) %>% # remove unnecessary vars
  pivot_wider(names_from = 'Stat', values_from = 'PM10')

# drop the date column; pairwise deletion copes with the NA values
cor(data.wider[, -1], use = 'pairwise.complete.obs')

# more lines required to set up correlation testing:
pw <- combn(unique(data$Stat), 2) # make pairwise sets
pw

pairwise_c <- apply(pw, 2, function(i) {
  tidy(cor.test(data.wider[[i[1]]], data.wider[[i[2]]]))
})

results <- cbind(data.frame(t(pw)), bind_rows(pairwise_c))

results
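
If you would rather avoid the tidyverse dependency, here is a minimal base R sketch of the same idea. It assumes there is one PM10 reading per station per date. The `data.frame()` call only rebuilds the sample data from the question so that the snippet runs on its own, and the names `stations`, `wide`, `cm` and `lower` are mine, not part of the original answer.

```r
# Rebuild the sample data from the question (values copied from the dput output).
data <- data.frame(
  Stat = rep(c("Citta dei Ragazzi", "Rende", "Acri", "Firmo", "Schiavonea"),
             each = 3),
  Data = rep(c("1/1/2022", "1/2/2022", "1/3/2022"), times = 5),
  PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4, 15.7,
           10.82, 12.29, 9.54, 24.54, 22.88, 27.33)
)

# Keep the stations in their order of appearance instead of alphabetical order.
stations <- factor(data$Stat, levels = unique(data$Stat))

# Date x station matrix; a missing date/station combination becomes NA.
wide <- tapply(data$PM10, list(data$Data, stations), mean)

# Pairwise deletion: an NA reading only drops the station pairs it affects.
cm <- cor(wide, use = "pairwise.complete.obs")

# Blank out the upper triangle to mimic the layout requested in the question.
lower <- round(cm, 2)
lower[upper.tri(lower)] <- NA
print(lower, na.print = "")
```

With only three dates per station, as in the sample, the coefficients are not very informative; the layout is the point here. On the full data set the same lines should work unchanged as long as the column names match.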

huangapple
  • Published on February 6, 2023, 20:20:02