Is there a way to summarize values grouped by years while keeping the index?

Question

I am trying to sum values by year, where each year is assigned to a specific ID (the index column).

I used dplyr to summarize the data, but could not find a way to keep the index in the result.

My data looks something like this:

year <- c(2015, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2018, 2018, 2019, 2019)
index <- c(1,1,1,1,1,1,1,2,2,2,2,2,2)
value <- c(5,7,3, NA,9,14, 15, 8, NA, 9, 10, 6, 4)
df1 <- data.frame(year, index, value)

This is how I summarized the data:

library(dplyr)

sum1 <-
  df1 %>%
  group_by(year) %>%
  summarise(value = sum(value, na.rm = TRUE))

I'd like to get an outcome like:

year1 <- c(2015, 2016, 2017, 2018, 2019)
index1 <- c(1, 1, 1, 2, 2)
value1 <- c(15, 9, 29, 27, 10)
df2 <- data.frame(year1, index1, value1)

Thanks, I really appreciate your help!

Answer 1

Score: 3

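You can use aggregate. Its formula interface drops rows containing NA by default (na.action = na.omit), so sum needs no na.rm here: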

aggregate(value ~ ., df1, sum)
#  year index value
#1 2015     1    15
#2 2016     1     9
#3 2017     1    29
#4 2018     2    27
#5 2019     2    10
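
One caveat with the aggregate call above, shown as a hedged sketch: because incomplete rows are dropped before grouping, a year whose values are all NA would disappear from the result. Assuming you would rather keep such years, you can pass na.action = na.pass and forward na.rm = TRUE to sum. The 2020 row below is made up purely for illustration and is not part of the original data; the formula is also written out explicitly instead of using the dot.

```r
# Sample data from the question
df1 <- data.frame(
  year  = c(2015, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2018, 2018, 2019, 2019),
  index = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  value = c(5, 7, 3, NA, 9, 14, 15, 8, NA, 9, 10, 6, 4)
)

# Hypothetical extra year whose only value is missing (illustration only)
df_extra <- rbind(df1, data.frame(year = 2020, index = 2, value = NA))

# Default behaviour: rows with NA are removed first, so 2020 is lost
res_default <- aggregate(value ~ year + index, df_extra, sum)

# Keep NA rows and let sum() skip them instead, so 2020 stays in the result
res_keep <- aggregate(value ~ year + index, df_extra, sum,
                      na.action = na.pass, na.rm = TRUE)
```

With this variant an all-NA year is reported with a sum of 0, which may or may not be what you want; filtering or flagging such years afterwards is a separate design decision.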

Or, using your code, add index to the group_by call.

library(dplyr)

df1 %>%
  group_by(year, index) %>%
  summarise(value = sum(value, na.rm = TRUE))
## A tibble: 5 × 3
## Groups:   year [5]
#   year index value
#  <dbl> <dbl> <dbl>
#1  2015     1    15
#2  2016     1     9
#3  2017     1    29
#4  2018     2    27
#5  2019     2    10
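
If the index is guaranteed to be constant within each year, as it is in the sample data, another possible approach is to keep grouping by year only and carry the index along with first(). This is a minimal sketch under that assumption, not a general solution: if a year ever had two different index values, only the first would be kept. The .groups = "drop" argument (available in dplyr 1.0.0 and later) returns an ungrouped tibble and avoids the grouping message; the names sum_by_year and sum_by_both are illustrative.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  year  = c(2015, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2018, 2018, 2019, 2019),
  index = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  value = c(5, 7, 3, NA, 9, 14, 15, 8, NA, 9, 10, 6, 4)
)

# Group by year only; assumes every year maps to exactly one index
sum_by_year <- df1 %>%
  group_by(year) %>%
  summarise(
    index = first(index),
    value = sum(value, na.rm = TRUE),
    .groups = "drop"
  )

# Same result by grouping on both columns, returned without leftover grouping
sum_by_both <- df1 %>%
  group_by(year, index) %>%
  summarise(value = sum(value, na.rm = TRUE), .groups = "drop")
```

Grouping on both columns is the safer choice when you are not sure the year-to-index mapping is one-to-one, since it produces one row per distinct combination rather than silently picking one index.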

huangapple
  • Posted on April 4, 2023, 15:59:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75926866.html