2023年2月10日 05:36:17go评论101阅读模式

英文:

Expanding Data Frame of Response Counts into Data Frame of Responses

问题

我试图将一个包含特定响应计数的数据框扩展为一个包含实际响应的数据框（而不是按计数聚合的数据框）
示例数据：

set.seed(44)
data <- data.frame(ID = LETTERS[1:5]
, Count_1 = sample(c(3:8), 5)
, Count_2 = sample(c(3:8), 5)
, Count_3 = sample(c(3:8), 5))


它看起来像这样：

ID Count_1 Count_2 Count_3
1 A 3 8 3
2 B 8 7 4
3 C 7 4 8
4 D 4 3 5
5 E 5 5 6


我想将其扩展为如下所示：

desired <- rbind(data.frame(ID = c(rep("A", 3), rep("B", 8), rep("C", 7), rep("D", 4), rep("E", 5))
, Response = c(rep("1", 3), rep("1", 8) , rep("1", 7), rep("1", 4), rep("1", 5))
)
, data.frame(ID = c(rep("A", 8), rep("B", 7), rep("C", 4), rep("D", 3), rep("E", 5))
, Response = c(rep("2", 8), rep("2", 7) , rep("2", 4), rep("2", 3), rep("2", 5))
)
, data.frame(ID = c(rep("A", 3), rep("B", 4), rep("C", 8), rep("D", 5), rep("E", 6))
, Response = c(rep("2", 3), rep("2", 4) , rep("2", 8), rep("2", 5), rep("2", 6))
)
)


它看起来像：

ID Response
1 A 1
2 A 1
3 A 1
4 B 1
5 B 1
6 B 1
7 B 1
8 B 1
9 B 1
10 B 1
11 B 1
12 C 1
13 C 1
14 C 1
15 C 1
16 C 1
17 C 1
18 C 1
19 D 1
20 D 1
21 D 1
22 D 1
23 E 1
24 E 1
25 E 1
26 E 1
27 E 1
28 A 2
29 A 2
30 A 2
31 A 2
32 A 2
33 A 2
34 A 2
35 A 2
36 B 2
37 B 2
38 B 2
39 B 2
40 B 2
41 B 2
42 B 2
43 C 2
44 C 2
45 C 2
46 C 2
47 D 2
48 D 2
49 D 2
50 E 2
51 E 2
52 E 2
53 E 2
54 E 2
55 A 2
56 A 2
57 A 2
58 B 2
59 B 2
60 B 2
61 B 2
62 C 2
63 C 2
64 C 2
65 C 2
66 C 2
67 C 2
68 C 2
69 C 2
70 D 2
71 D 2
72 D 2
73 D 2
74 D 2
75 E 2
76 E 2
77 E 2
78 E 2
79 E 2
80 E 2


<details>
<summary>英文:</summary>
I am trying to expand a data frame of counts of specific responses into a data frame of actual responses (rather than aggregated by counts)
Example data:

set.seed(44)
data <- data.frame(ID = LETTERS[1:5]
, Count_1 = sample(c(3:8), 5)
, Count_2 = sample(c(3:8), 5)
, Count_3 = sample(c(3:8), 5))


Which looks like this:

ID Count_1 Count_2 Count_3
1 A 3 8 3
2 B 8 7 4
3 C 7 4 8
4 D 4 3 5
5 E 5 5 6


I want to expand this out to look like this:

             , Response = c(rep(&quot;2&quot;, 8), rep(&quot;2&quot;, 7) , rep(&quot;2&quot;, 4), rep(&quot;2&quot;, 3), rep(&quot;2&quot;, 5)
             )
             )
             
             , data.frame(ID = c(rep(&quot;A&quot;, 3), rep(&quot;B&quot;, 4), rep(&quot;C&quot;, 8), rep(&quot;D&quot;, 5), rep(&quot;E&quot;, 6)
             )
             , Response = c(rep(&quot;2&quot;, 3), rep(&quot;2&quot;, 4) , rep(&quot;2&quot;, 8), rep(&quot;2&quot;, 5), rep(&quot;2&quot;, 6)
             )
             )
             
             )


Which looks like:


I was thinking maybe a for loop to create vectors for each cell in each column, but I&#39;m wondering if that&#39;s necessary or if there might be a more efficient way?
</details>
# 答案1
**得分**: 1
We could reshape to 'long' format with `pivot_longer` and then use `uncount`
```R
library(dplyr)
library(tidyr)
data %>%
   pivot_longer(cols = starts_with("Count"), names_prefix = 'Count_', 
     names_to = 'Response') %>%
   arrange(Response) %>%
   uncount(value)

-output

# A tibble: 80 x 2
   ID    Response
   <chr> <chr>   
 1 A     1       
 2 A     1       
 3 A     1       
 4 B     1       
 5 B     1       
 6 B     1       
 7 B     1       
 8 B     1       
 9 B     1       
10 B     1       
# … with 70 more rows

In base R, this can be done with rep/unlist/col/row

vals <- unlist(data[-1])
out <-  data.frame(ID = rep(data$ID[row(data[-1])], vals), 
      Response = rep(col(data[-1]), vals))

data

data <- structure(list(ID = c("A", "B", "C", "D", "E"), Count_1 = c(3L, 
8L, 7L, 4L, 5L), Count_2 = c(8L, 7L, 4L, 3L, 5L), Count_3 = c(3L, 
4L, 8L, 5L, 6L)), class = "data.frame", row names = c("1", "2", 
"3", "4", "5"))

英文:

We could reshape to 'long' format with pivot_longer and then use uncount

library(dplyr)
library(tidyr)
data %&gt;%
   pivot_longer(cols = starts_with(&quot;Count&quot;), names_prefix = &#39;Count_&#39;, 
     names_to = &#39;Response&#39;) %&gt;% 
   arrange(Response) %&gt;%
   uncount(value)

-output

# A tibble: 80 &#215; 2
   ID    Response
   &lt;chr&gt; &lt;chr&gt;   
 1 A     1       
 2 A     1       
 3 A     1       
 4 B     1       
 5 B     1       
 6 B     1       
 7 B     1       
 8 B     1       
 9 B     1       
10 B     1       
# … with 70 more rows

In base R, this can be done with rep/unlist/col/row

vals &lt;- unlist(data[-1])
out &lt;-  data.frame(ID = rep(data$ID[row(data[-1])], vals), 
      Response = rep(col(data[-1]), vals))

data

data &lt;- structure(list(ID = c(&quot;A&quot;, &quot;B&quot;, &quot;C&quot;, &quot;D&quot;, &quot;E&quot;), Count_1 = c(3L, 
8L, 7L, 4L, 5L), Count_2 = c(8L, 7L, 4L, 3L, 5L), Count_3 = c(3L, 
4L, 8L, 5L, 6L)), class = &quot;data.frame&quot;, row.names = c(&quot;1&quot;, &quot;2&quot;, 
&quot;3&quot;, &quot;4&quot;, &quot;5&quot;))

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

将响应计数的数据框扩展为响应的数据框

问题

data

data

如何在ggplotly中设置字体大小？

如何在R中使用scale_fill_discrete时更改颜色？

使用dplyr根据另一个数据集的范围值筛选数据集中的分组变量。

将森林图中的小数改为中线小数。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

发表评论