Create a table that returns counts of each value for multiple variables

Question


This is my data frame:

# Load libraries
library(data.table)
library(expss)
library(sjlabelled) # to call function as_label()

# Create dataframe
a <- data.table("b1" = c(1, 2, 2, 2),
                "b2" = c(1, 2, 1, 1),
                "b3" = c(1, 1, 1, 1))

# Set value label
val_lab(a) = num_lab("
            1 Yes
            2 No
")
a = as_label(a)

and it looks like this:

> a
    b1  b2  b3
1: Yes Yes Yes
2:  No  No Yes
3:  No Yes Yes
4:  No Yes Yes

I want to create a dataset that returns the total occurrence count of each value. It should look like the following:

  Category  b1  b2  b3
1:  Yes     1   3   4
2:  No      3   1   0

This might work similarly to the tabout command in Stata. It would also be great to return percentages, like this:

  Category  b1   b2   b3
1:  Yes    25   75   100
2:  No     75   25   0
3:  sum    100  100  100

Answer 1

Score: 3


One possibility is to use the janitor package after doing some transformations with tidyverse packages:

library(janitor)
library(dplyr)
library(tidyr)

counts <- a %>% 
  pivot_longer(everything(), values_to = "Category") %>% 
  mutate(Category = c("Yes", "No")[Category]) %>% 
  tabyl(Category, name)

Output

 Category b1 b2 b3
       No  3  1  0
      Yes  1  3  4
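
As a cross-check on the counts above, here is a minimal base R sketch that needs no extra packages. It is not the janitor approach from this answer: it swaps in table() applied per column. The object a_df is a hypothetical stand-in for a, rebuilt as a plain data frame of factors rather than the labelled data.table from the question.

```r
# Hypothetical stand-in for `a`: a plain data frame with the same values
a_df <- data.frame(
  b1 = c(1, 2, 2, 2),
  b2 = c(1, 2, 1, 1),
  b3 = c(1, 1, 1, 1)
)

# Turn every column into a factor with both levels declared up front,
# so a level that never occurs in a column is still counted (as 0)
a_df[] <- lapply(a_df, factor, levels = c(1, 2), labels = c("Yes", "No"))

# One frequency table per column, simplified into a matrix
# (rows are the categories, columns are the variables)
count_mat <- sapply(a_df, function(x) table(x))
count_mat

# Column percentages with a closing "sum" row
pct_mat <- 100 * prop.table(count_mat, margin = 2)
pct_mat <- rbind(pct_mat, sum = colSums(pct_mat))
pct_mat
```

Declaring the factor levels explicitly is what keeps the "No" row for b3, where that value never appears.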

For the percentages, you can use janitor's adorn_* functions:

counts %>% 
  adorn_percentages(denominator = "col") %>% 
  adorn_totals("row") %>% 
  adorn_pct_formatting()

Note: see ?adorn_pct_formatting for additional formatting options.

Output

 Category     b1     b2     b3
       No  75.0%  25.0%   0.0%
      Yes  25.0%  75.0% 100.0%
    Total 100.0% 100.0% 100.0%
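
To get closer to the whole-number layout asked for in the question, one option is to pass formatting arguments to adorn_pct_formatting(). The sketch below is self-contained and assumes a hypothetical stand-in a_chr (a plain data frame of "Yes"/"No" strings) instead of the labelled data.table, so the recoding step is not needed. The digits and affix_sign arguments are documented in ?adorn_pct_formatting.

```r
library(janitor)
library(dplyr)
library(tidyr)

# Hypothetical stand-in for `a`: character columns instead of labelled values
a_chr <- data.frame(
  b1 = c("Yes", "No", "No", "No"),
  b2 = c("Yes", "No", "Yes", "Yes"),
  b3 = c("Yes", "Yes", "Yes", "Yes")
)

# Two-way frequency table: one row per category, one column per variable
counts <- a_chr %>%
  pivot_longer(everything(), values_to = "Category") %>%
  tabyl(Category, name)

# Column percentages, a total row, then formatting with
# no decimals (digits = 0) and no "%" sign (affix_sign = FALSE)
pct <- counts %>%
  adorn_percentages(denominator = "col") %>%
  adorn_totals("row") %>%
  adorn_pct_formatting(digits = 0, affix_sign = FALSE)

pct
```

Note that the formatted columns are character strings, so convert them back with as.numeric() if further arithmetic is needed.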

Answer 2

Score: 2


You can use pivots and simple aggregations with tidyverse, although I would also agree with @LMc that janitor is a great package for tabular summaries.

library(tidyverse)

a |> 
  pivot_longer(everything()) |> 
  group_by(name, value) |> 
  summarise(n = n()) |> 
  mutate(p = n / sum(n)) |> 
  pivot_wider(id_cols = value, names_from = name, values_from = n, values_fill = 0)

# A tibble: 2 × 4
  value    b1    b2    b3
  <chr> <int> <int> <int>
1 No        3     1     0
2 Yes       1     3     4

Or with values_from = p instead:

# A tibble: 2 × 4
  value    b1    b2    b3
  <chr> <dbl> <dbl> <dbl>
1 No     0.75  0.25     0
2 Yes    0.25  0.75     1
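
For reference, the proportion variant can be written out in full. This is a sketch under a few assumptions: a_chr is a hypothetical stand-in for a with plain character columns, count() replaces the group_by() plus summarise() pair, and the grouping by name is made explicit before computing p rather than relying on summarise() dropping the last grouping level. The native pipe |> needs R 4.1 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for `a`: character columns instead of labelled values
a_chr <- data.frame(
  b1 = c("Yes", "No", "No", "No"),
  b2 = c("Yes", "No", "Yes", "Yes"),
  b3 = c("Yes", "Yes", "Yes", "Yes")
)

props <- a_chr |>
  pivot_longer(everything()) |>      # long format: columns `name` and `value`
  count(name, value) |>              # n = occurrences per variable and category
  group_by(name) |>                  # proportions are computed within each variable
  mutate(p = n / sum(n)) |>
  ungroup() |>
  pivot_wider(id_cols = value, names_from = name,
              values_from = p, values_fill = 0)

props
```

Here values_fill = 0 fills the cell for a category that never occurs in a variable (such as "No" in b3), which would otherwise come out as NA.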

huangapple
  • Published on April 7, 2023, 01:19:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75952173.html