r table for many columns


Question


My dataset looks like this:

    ID Col_01 Col_02 Col_03 Col_04 Col_05 Col_06
     1      1      2      1      3      4     -9
     2      1      1      2      1      2      2
     3      2      4      1      1      1      1
     4      3      1      3      2     -9      4
     5      2      3      4      4      3      2

I would like to create a summarized dataset in which the number of 1s, 2s, 3s, 4s, and -9s in each column (Col_01 to Col_06) is counted, like this:

    Values Col_01 Col_02 Col_03 Col_04 Col_05 Col_06
         1      2      2      2      2      1      1
         2      2      1      1      1      1      2
         3      1      1      1      1      1      0
         4      0      1      1      1      1      1
        -9      0      0      0      0      1      1

So far I have tried

    df %>%
      select(matches("^Col_\\d+$")) %>%
      summarise_all(funs(table))

but I get the error "Col_05 must be of size 4 or 1, not 5", since an earlier column had size 4, along with a bunch of other warnings. Any suggestions on how to create a table summary for all columns starting with Col_ in my dataset would be appreciated. Thanks.
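The size error most likely comes from table() returning one count per distinct value that actually occurs, so columns with different sets of values yield results of different lengths, while a summary step expects every column to return the same size (or size 1). The following base R sketch illustrates this on the sample data from this thread; the names col_data and n_values are introduced here for illustration only.

```r
# Sample data copied from this thread.
df1 <- data.frame(
  ID     = 1:5,
  Col_01 = c(1L, 1L, 2L, 3L, 2L),
  Col_02 = c(2L, 1L, 4L, 1L, 3L),
  Col_03 = c(1L, 2L, 1L, 3L, 4L),
  Col_04 = c(3L, 1L, 1L, 2L, 4L),
  Col_05 = c(4L, 2L, 1L, -9L, 3L),
  Col_06 = c(-9L, 2L, 1L, 4L, 2L)
)

# Keep only the Col_* columns (col_data is a hypothetical helper name).
col_data <- df1[grepl("^Col_\\d+$", names(df1))]

# Length of table() per column = number of distinct values in that column.
n_values <- sapply(col_data, function(x) length(table(x)))
print(n_values)
```

Because the lengths differ from column to column, the per-column tables cannot be bound into a single rectangular summary without first aligning them on a common set of values.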

Answer 1

Score: 2


In base R you could do

    table(stack(df1, -1))

If you need a data frame:

    as.data.frame.matrix(table(stack(df1, -1)))
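Note that table() sorts its rows by value and only lists values that occur in the data. If a fixed row order such as 1, 2, 3, 4, -9 is wanted, one possible variant is to convert each column to a factor with explicit levels before counting. This is a sketch rather than part of the answer above; vals, counts, and counts_df are names assumed for illustration.

```r
# Sample data copied from this thread.
df1 <- data.frame(
  ID     = 1:5,
  Col_01 = c(1L, 1L, 2L, 3L, 2L),
  Col_02 = c(2L, 1L, 4L, 1L, 3L),
  Col_03 = c(1L, 2L, 1L, 3L, 4L),
  Col_04 = c(3L, 1L, 1L, 2L, 4L),
  Col_05 = c(4L, 2L, 1L, -9L, 3L),
  Col_06 = c(-9L, 2L, 1L, 4L, 2L)
)

# Values to count, in the row order wanted in the output (an assumption).
vals <- c(1, 2, 3, 4, -9)

# Fixing the factor levels makes every column return a table of the same
# length, with zeros for values that do not occur in that column.
counts <- sapply(df1[-1], function(x) table(factor(x, levels = vals)))

# Turn the count matrix into a data frame with an explicit Values column.
counts_df <- data.frame(Values = vals, counts)
rownames(counts_df) <- NULL
print(counts_df)
```

Since every column is counted against the same levels, the result is rectangular regardless of which values each column happens to contain.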

Answer 2

Score: 1


Pivoting longer, counting, then pivoting wider is one option.

    library(dplyr)
    library(tidyr)

    df1 %>%
      pivot_longer(starts_with("Col_")) %>%
      count(name, value) %>%
      pivot_wider(names_from = name,
                  values_from = n,
                  values_fill = 0)

Result:

    # A tibble: 5 × 7
      value Col_01 Col_02 Col_03 Col_04 Col_05 Col_06
      <int>  <int>  <int>  <int>  <int>  <int>  <int>
    1     1      2      2      2      2      1      1
    2     2      2      1      1      1      1      2
    3     3      1      1      1      1      1      0
    4     4      0      1      1      1      1      1
    5    -9      0      0      0      0      1      1

Data:

    df1 <- structure(list(ID = 1:5, Col_01 = c(1L, 1L, 2L, 3L, 2L), Col_02 = c(2L,
    1L, 4L, 1L, 3L), Col_03 = c(1L, 2L, 1L, 3L, 4L), Col_04 = c(3L,
    1L, 1L, 2L, 4L), Col_05 = c(4L, 2L, 1L, -9L, 3L), Col_06 = c(-9L,
    2L, 1L, 4L, 2L)), class = "data.frame", row.names = c(NA, -5L
    ))

Posted by huangapple on 2023-02-24 12:19:58
Original link: https://go.coder-hub.com/75552606.html