How to overlap R histograms

Question

Reproduced from the code at https://stackoverflow.com/questions/64474714/run-svymean-on-all-variables:

library(haven)
library(survey)
library(dplyr)

nhanesDemo <- read_xpt(url("https://wwwn.cdc.gov/Nchs/Nhanes/2015-2016/DEMO_I.XPT"))

# Rename variables into something more readable
nhanesDemo$fpl <- nhanesDemo$INDFMPIR
nhanesDemo$age <- nhanesDemo$RIDAGEYR
nhanesDemo$gender <- nhanesDemo$RIAGENDR
nhanesDemo$persWeight <- nhanesDemo$WTINT2YR
nhanesDemo$psu <- nhanesDemo$SDMVPSU
nhanesDemo$strata <- nhanesDemo$SDMVSTRA

nhanesAnalysis <- nhanesDemo %>%
  mutate(LowIncome = case_when(
    INDFMIN2 < 40 ~ TRUE,
    TRUE ~ FALSE
  )) %>%
  # Select the necessary columns
  select(INDFMIN2, LowIncome, persWeight, psu, strata)

# Set up the design
nhanesDesign <- svydesign(id      = ~psu,
                          strata  = ~strata,
                          weights = ~persWeight,
                          nest    = TRUE,
                          data    = nhanesAnalysis)

svyhist(~log10(INDFMIN2), design = nhanesDesign, main = '')

How do I color the histogram by independent variable, say, LowIncome? I want to have two separate histograms, one for each value of LowIncome. Unfortunately I picked a bad example, but I want them to be see-through in case their values overlap.

Answer 1

Score: 3

If you want to plot a histogram from your survey design, you can get its data from model.frame (this is what svyhist does under the hood). To get the histogram filled by group, you could use this data frame inside ggplot:

library(ggplot2)

ggplot(model.frame(nhanesDesign), aes(log10(INDFMIN2), fill = LowIncome)) +
  geom_histogram(alpha = 0.5, color = "gray60", breaks = 0:20 / 10) +
  theme_classic()

Edit

As Thomas Lumley points out, this does not incorporate sampling weights, so if you wanted this you could do:

ggplot(model.frame(nhanesDesign), aes(log10(INDFMIN2), fill = LowIncome)) +
  geom_histogram(aes(weight = persWeight), alpha = 0.5,
                 color = "gray60", breaks = 0:20 / 10) +
  theme_classic()
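One thing to keep in mind: geom_histogram stacks groups by default (position = "stack"), so if the goal is for the two groups to genuinely overlap and show through each other, position = "identity" is one option. The following is only a sketch under assumptions: it uses the apistrat example data bundled with the survey package rather than the NHANES download, and the names plot_data and is_elementary are made up for illustration.

```r
library(survey)
library(ggplot2)

# apistrat ships with the survey package (assumption: used here only so the
# sketch runs without downloading the NHANES file)
data(api)
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# Pull the variables out of the design and add a logical grouping column
# (is_elementary is a hypothetical stand-in for LowIncome)
plot_data <- model.frame(dstrat)
plot_data$is_elementary <- plot_data$stype == "E"

# position = "identity" draws both groups from the baseline so they overlap;
# alpha makes the overlap visible; weight = pw applies the sampling weights
p <- ggplot(plot_data, aes(enroll, fill = is_elementary)) +
  geom_histogram(aes(weight = pw), position = "identity", alpha = 0.5,
                 color = "gray60", breaks = 0:35 * 100) +
  theme_classic()
print(p)
```

With the NHANES design from the question, the same idea would amount to adding position = "identity" to the geom_histogram call above.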

To demonstrate this approach works, we can replicate Thomas's approach in ggplot using the data example from svyhist. To get the uneven bin sizes (if this is desired), we need two histogram layers, though I'm guessing this would not be required for most use-cases.

# dstrat is the stratified design from the ?svyhist example
library(survey)
data(api)
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw, data = apistrat, fpc = ~fpc)

ggplot(model.frame(dstrat), aes(enroll)) +
  geom_histogram(aes(fill = "E", weight = pw, y = after_stat(density)),
                 data = subset(model.frame(dstrat), stype == "E"),
                 breaks = 0:35 * 100,
                 position = "identity", col = "gray50") +
  geom_histogram(aes(fill = "Not E", weight = pw, y = after_stat(density)),
                 data = subset(model.frame(dstrat), stype != "E"),
                 position = "identity", col = "gray50",
                 breaks = 0:7 * 500) +
  scale_fill_manual(NULL, values = c("#00880020", "#88000020")) +
  theme_classic()

Answer 2

Score: 1


You can't just extract the data and use ggplot, because that won't use the weights and so misses the whole point of svyhist. You can use the add=TRUE argument, though. You do need to set the x and y axis ranges correctly to make sure the whole plot is visible.

Using the data example from ?svyhist:

svyhist(~enroll, subset(dstrat, stype == "E"), col = "#00880020", ylim = c(0, 0.003), xlim = c(0, 3500))
svyhist(~enroll, subset(dstrat, stype != "E"), col = "#88000020", add = TRUE)
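The two calls above need dstrat to exist first. Here is a minimal sketch, assuming the stratified design built from the apistrat data that ships with the survey package (the one used in the ?svyhist examples). It also compares the unweighted and design-weighted mean of enroll, as a rough illustration of why ignoring the weights can change the picture; the variable names unweighted_mean and weighted_mean are only for illustration.

```r
library(survey)

# Stratified design on the bundled api example data
data(api)
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# Unweighted mean treats every sampled school equally
unweighted_mean <- mean(apistrat$enroll)

# Design-based mean uses the sampling weights
weighted_mean <- unname(coef(svymean(~enroll, dstrat)))

print(c(unweighted = unweighted_mean, weighted = weighted_mean))

# Overlaid, semi-transparent weighted histograms: the first call sets the
# axis ranges, the second draws on top of it with add = TRUE
svyhist(~enroll, subset(dstrat, stype == "E"), col = "#00880020",
        ylim = c(0, 0.003), xlim = c(0, 3500))
svyhist(~enroll, subset(dstrat, stype != "E"), col = "#88000020", add = TRUE)
```

The eight-digit hex colours carry an alpha channel in the last two digits, which is what makes the bars see-through.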

Posted by huangapple on 2023-02-14 03:52:30
Source: https://go.coder-hub.com/75440588.html