How to get the highest, lowest, and average values for each year in one table
Question
This is the dataset that I'm using:
full_trains <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-02-26/full_trains.csv")
The dataset covers 2015 to 2018. I am trying to get a shorter table that has only the highest, lowest, and mean of the average journey time for each year.
The relevant variables are year and journey_time_avg.
I know how to use the min, max, and mean functions separately, but I don't know how to bring them together in one table.
Answer 1
Score: 0
One way to do this is with the dplyr package. The workflow is:
- Group your data by year
- Use summarise() to return the min, max, and mean values for each group
library(dplyr)

# Group by year, then collapse each group to one row of summary statistics
summary_data <- full_trains %>%
  group_by(year) %>%
  summarise(min_ave_time = min(journey_time_avg),
            max_ave_time = max(journey_time_avg),
            mean_ave_time = mean(journey_time_avg)) %>%
  ungroup()

summary_data
# A tibble: 4 × 4
   year min_ave_time max_ave_time mean_ave_time
  <dbl>        <dbl>        <dbl>         <dbl>
1  2015         46.8         341.          163.
2  2016         46.5         344.          163.
3  2017         46.2         454.          165.
4  2018         46.0         481           170.
Note that the printed tibble rounds values for display (three significant digits by default). The summary_data object itself stores the values at full precision.
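As a variation on the same group-and-summarise idea, the three statistics can also be written once with across() and a named list of functions. This is only a sketch: it assumes dplyr 1.0.0 or later, and toy_trains is a small made-up data frame standing in for full_trains so the example runs without downloading anything. The na.rm = TRUE arguments are an added precaution against missing values, not something the original answer uses.

```r
library(dplyr)

# Hypothetical stand-in for full_trains; these values are invented for illustration
toy_trains <- data.frame(
  year = c(2015, 2015, 2016, 2016, 2016),
  journey_time_avg = c(50, 150, 40, 100, NA)
)

toy_summary <- toy_trains %>%
  group_by(year) %>%
  summarise(
    across(
      journey_time_avg,
      # Named list: each name becomes the {.fn} part of the output column name
      list(
        min  = ~ min(.x, na.rm = TRUE),
        max  = ~ max(.x, na.rm = TRUE),
        mean = ~ mean(.x, na.rm = TRUE)
      ),
      .names = "{.fn}_ave_time"
    )
  )

toy_summary
```

The result has one row per year and the columns min_ave_time, max_ave_time, and mean_ave_time, matching the names used in the answer. With a single grouping variable, summarise() already returns an ungrouped tibble, so no ungroup() call is needed here.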