March 7, 2023, 18:43:12

Transform column data to row data in R

Question

I have data in the following format:

# Reproducible example
order <- c(4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 4, 5, 6, 7, 8, 9)
values <- c(100, 74, 70, 88, 104, 177, 88, 189, 75, 58, 105, 171, 29, 60, 71, 37, 93, 99, 206, 74, 82, 69, 67, 102, 161, 60, 92, 62, 104, 34, 108, 53, 50, 80, 70, 77, 76, 105, 115, 78)
journey_id <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3)
df <- data.frame(order, values, journey_id)

Here order refers to a stop along a route, and values holds the value observed at that stop. I would like to transform this into journey-based data, where each row is a single journey, the columns are the distinct order values, and the cells are taken from values. Not every journey has an observed value for every stop.

The output should look like this:

## OUTPUT ##
#          4,    5,   6,   7,    8,    9,    10,  11,   12,  13,  ..., 20
#journey1  100,  74,  70,  88,   104,  177,  88,  189,  75,  58,  ..., 93
#journey2  99,   206, 74,  82,   69,   67,   102, 161,  60,  92,  ..., 80
#journey3  70,   77,  76,  105,  115,  78,   NA,  NA,   NA,  NA,  ..., NA

My data is quite large, so if possible I'd prefer not to loop over the rows of the data frame and would rather use a vectorized solution.

The value associated with a stop is not always unique within a single journey.

Answer 1

Score: 2

With pivot_wider, which reshapes the data from long to wide format:

library(tidyr)
library(dplyr)
df %>%
  pivot_wider(names_from = "order", values_from = "values")
#   journey_id   4   5  6   7   8   9  10  11 12 13  14  15 16  17 18 19 20
# 1          1 100  74 70  88 104 177  88 189 75 58 105 171 29  60 71 37 93
# 2          2  99 206 74  82  69  67 102 161 60 92  62 104 34 108 53 50 80
# 3          3  70  77 76 105 115  78  NA  NA NA NA  NA  NA NA  NA NA NA NA
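
If a journey can hold more than one observation for the same stop (one possible reading of the asker's closing remark, and an assumption here), pivot_wider cannot place a single number in that cell and falls back to list-columns with a warning. A minimal sketch of one way around that, using the values_fn argument to collapse duplicates. The dup data frame and the choice of mean as the summary are hypothetical and not part of the original answer:

```r
library(tidyr)

# Hypothetical data: journey 1 has two observations for stop 5
dup <- data.frame(
  order      = c(4, 5, 5, 4, 6),
  values     = c(100, 74, 80, 99, 70),
  journey_id = c(1, 1, 1, 2, 2)
)

wide <- pivot_wider(
  dup,
  names_from  = order,
  values_from = values,
  values_fn   = mean,  # assumption: averaging duplicates is acceptable
  names_sort  = TRUE   # order the new columns by stop number
)

# Convert to a plain data frame for base-style indexing
wide <- as.data.frame(wide)
wide
```

Whether mean, max, or the first observation is the right summary depends on what the values represent. Stops that a journey never visits are still filled with NA.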

Answer 2

Score: 0

A base R option using xtabs (note that stops without an observation show up as 0 rather than NA):

xtabs(values ~ journey_id + order, df)
#          order
#journey_id   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
#         1 100  74  70  88 104 177  88 189  75  58 105 171  29  60  71  37  93
#         2  99 206  74  82  69  67 102 161  60  92  62 104  34 108  53  50  80
#         3  70  77  76 105 115  78   0   0   0   0   0   0   0   0   0   0   0
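
Because xtabs fills empty cells with 0, a missing stop cannot be told apart from a stop whose observed value is 0. One base R alternative that keeps NA for missing stops is tapply. This is a sketch on a small hypothetical data frame (df_small is not from the original answer), and like xtabs it sums values if a journey has several rows for the same stop:

```r
# Hypothetical small example: journey 2 has no observation for stop 6
df_small <- data.frame(
  order      = c(4, 5, 6, 4, 5),
  values     = c(100, 74, 70, 99, 206),
  journey_id = c(1, 1, 1, 2, 2)
)

# Rows are journeys, columns are stops; empty cells stay NA
wide <- with(
  df_small,
  tapply(values, list(journey_id = journey_id, order = order), sum)
)

wide["1", "4"]  # value for journey 1 at stop 4
wide["2", "6"]  # NA, because journey 2 never reports stop 6
```

The result is a numeric matrix whose row and column names are the journey ids and stop numbers, so it is indexed with character names as shown.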

Another option using reshape:

reshape(df, direction = "wide", idvar = "journey_id", timevar = "order")
#   journey_id values.4 values.5 values.6 values.7 values.8 values.9 values.10
#1           1      100       74       70       88      104      177        88
#18          2       99      206       74       82       69       67       102
#35          3       70       77       76      105      115       78        NA
#   values.11 values.12 values.13 values.14 values.15 values.16 values.17
#1        189        75        58       105       171        29        60
#18       161        60        92        62       104        34       108
#35        NA        NA        NA        NA        NA        NA        NA
#   values.18 values.19 values.20
#1         71        37        93
#18        53        50        80
#35        NA        NA        NA
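
The reshape result carries a values. prefix on every column and keeps the original row numbers (1, 18, 35). If the layout from the question is wanted, both can be tidied afterwards. A minimal sketch on a small hypothetical data frame (df_small is an illustration, not part of the original answer):

```r
# Hypothetical small example: journey 2 has no observation for stop 6
df_small <- data.frame(
  order      = c(4, 5, 6, 4, 5),
  values     = c(100, 74, 70, 99, 206),
  journey_id = c(1, 1, 1, 2, 2)
)

wide <- reshape(df_small, direction = "wide",
                idvar = "journey_id", timevar = "order")

# Strip the "values." prefix so the columns are named after the stops
names(wide) <- sub("^values\\.", "", names(wide))

# Replace the leftover row numbers from the long data with 1..n
rownames(wide) <- NULL

wide
```

The stop columns now have non-syntactic names such as 4, so they are accessed as wide[["4"]] or with backticks.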
