Find the average value of a column based on a date column and more than one other column

Question

I have the data frame below, and for every PLAYER_NAME I want to calculate the average FGM per 5 MIN (minutes played) over the last 2 game_date values.

df <- structure(list(game_date = structure(c(19153, 19153, 19153, 19153,
19153, 19153, 19153, 19153, 19153, 19153, 19153, 19153, 19153, 
19153, 19153, 19153, 19153, 19156, 19156, 19156, 19156, 19156, 
19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 
19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 
19156, 19156, 19156, 19156, 19159, 19159, 19159, 19159, 19159, 
19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159, 
19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159), class = "Date"), 
    MIN = c(28.083, 40.683, 9, 40.823, 32.96, 14.5, 39.95, 43.427, 
    28.16, 39.7, 9.9, 42.667, 35.55, 31.45, 20.547, 12.533, 10.067, 
    3.933, 32.847, 37.13, 4.72, 39.625, 34.983, 14.617, 1.317, 
    39.703, 42.533, 16.75, 44.05, 26.155, 1.317, 44.417, 21.413, 
    1.317, 30.237, 1.317, 14.287, 16.067, 1.317, 1.317, 4.683, 
    1.317, 1.317, 1.317, 1.017, 39.215, 39.918, 41.302, 41.817, 
    13.05, 1.05, 38.483, 43.682, 21.667, 44, 19.767, 40.21, 16.452, 
    1.05, 32.623, 17.782, 15.85, 1.017, 1.05, 7.95, 1.05), FGM = c(2, 
    14, 0, 7, 1, 0, 7, 7, 3, 9, 2, 8, 4, 3, 6, 1, 0, 0, 3, 7, 
    0, 7, 3, 1, 0, 7, 12, 1, 5, 6, 0, 10, 0, 1, 4, 0, 4, 1, 0, 
    0, 0, 0, 0, 0, 0, 6, 12, 5, 5, 2, 0, 4, 7, 0, 12, 2, 6, 1, 
    0, 4, 5, 1, 0, 0, 0, 0), PLAYER_NAME = c("Al Horford", "Stephen Curry", 
    "Nemanja Bjelica", "Klay Thompson", "Draymond Green", "Otto Porter Jr.", 
    "Marcus Smart", "Andrew Wiggins", "Kevon Looney", "Jaylen Brown", 
    "Gary Payton II", "Jayson Tatum", "Derrick White", "Robert Williams III", 
    "Jordan Poole", "Grant Williams", "Payton Pritchard", "Andre Iguodala", 
    "Al Horford", "Stephen Curry", "Nemanja Bjelica", "Klay Thompson", 
    "Draymond Green", "Otto Porter Jr.", "Nik Stauskas", "Marcus Smart", 
    "Andrew Wiggins", "Kevon Looney", "Jaylen Brown", "Gary Payton II", 
    "Damion Lee", "Jayson Tatum", "Derrick White", "Luke Kornet", 
    "Robert Williams III", "Juan Toscano-Anderson", "Jordan Poole", 
    "Grant Williams", "Juwan Morgan", "Aaron Nesmith", "Payton Pritchard", 
    "Jonathan Kuminga", "Moses Moody", "Sam Hauser", "Andre Iguodala", 
    "Al Horford", "Stephen Curry", "Klay Thompson", "Draymond Green", 
    "Otto Porter Jr.", "Nik Stauskas", "Marcus Smart", "Andrew Wiggins", 
    "Kevon Looney", "Jaylen Brown", "Gary Payton II", "Jayson Tatum", 
    "Derrick White", "Luke Kornet", "Robert Williams III", "Jordan Poole", 
    "Grant Williams", "Juwan Morgan", "Aaron Nesmith", "Payton Pritchard", 
    "Sam Hauser")), row.names = c(NA, -66L), class = c("tbl_df", 
"tbl", "data.frame"))

Answer 1

Score: 1

You can use dplyr's group_by and summarise functions to calculate a value per group. The code below bins MIN into 5-minute slots with cut and averages FGM for each player and slot.
If you need only the last two days, run filter first and restrict game_date to the range you want.

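library(dplyr)

df %>%
  # Bin minutes played into 5-minute slots: (0,5], (5,10], ..., (55,60]
  mutate(time_slot_5_min = cut(MIN, seq(0, 60, 5))) %>%
  group_by(PLAYER_NAME, time_slot_5_min) %>%
  # Mean FGM for each player within each slot
  summarise(FGM = mean(FGM), .groups = "drop")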
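
The filter step mentioned above is not spelled out in this answer. A minimal sketch of one way to do it follows, assuming that "last 2 game_date" means the two most recent calendar dates in the whole table, and that the target is FGM scaled to 5 minutes rather than binned minutes. The names toy, last_two, and res_filter are illustrative only; the rows are a small subset of the question's data.

```r
library(dplyr)

# Toy subset of the question's data: two players, three game dates
toy <- tibble(
  game_date   = as.Date(rep(c("2022-06-10", "2022-06-13", "2022-06-16"), each = 2)),
  PLAYER_NAME = rep(c("Al Horford", "Stephen Curry"), times = 3),
  MIN         = c(28.083, 40.683, 32.847, 37.13, 39.215, 39.918),
  FGM         = c(2, 14, 3, 7, 6, 12)
)

# Assumption: the two most recent distinct dates across all players
last_two <- tail(sort(unique(toy$game_date)), 2)

res_filter <- toy %>%
  filter(game_date %in% last_two) %>%          # keep only those two dates
  group_by(PLAYER_NAME) %>%
  summarise(avg_fgm_per_5 = mean(FGM / MIN * 5), .groups = "drop")

res_filter
```

With this subset, Al Horford comes out at about 0.611 and Stephen Curry at about 1.22. Filtering on global dates differs from taking each player's own last two games when a player missed one of those dates.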

Answer 2

Score: 1

You can find the average FGM for each player's two most recent games, normalized to 5 minutes, using dplyr's arrange, slice, and summarize (the .by argument requires dplyr 1.1.0 or later):

df %>%
  arrange(game_date) %>%
  slice(tail(row_number(), 2), .by = PLAYER_NAME) %>%
  summarize(avg_fgm = mean(FGM/MIN*5), .by = PLAYER_NAME)

Output:

   PLAYER_NAME   avg_fgm
   <chr>           <dbl>
 1 Al Horford      0.611
 2 Stephen Curry   1.22 
 3 Nemanja Bjelica 0    
 4 Klay Thompson   0.744
 5 Draymond Green  0.513
 6 Otto Porter Jr. 0.554
 7 Marcus Smart    0.701
 8 Andrew Wiggins  1.11 
 9 Kevon Looney    0.149
10 Jaylen Brown    0.966
# ℹ 17 more rows
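
The answer above averages the per-game rates, so a short appearance counts as much as a long one. Whether that is the intended reading of "average FGM per 5 MIN" is not stated in the question. The sketch below, on a small subset of the question's data, puts that figure next to a pooled rate (total FGM over total minutes); the names toy, rates, mean_of_rates, and pooled_rate are illustrative only.

```r
library(dplyr)

# Toy subset of the question's data: two players, three game dates
toy <- tibble(
  game_date   = as.Date(rep(c("2022-06-10", "2022-06-13", "2022-06-16"), each = 2)),
  PLAYER_NAME = rep(c("Al Horford", "Derrick White"), times = 3),
  MIN         = c(28.083, 35.55, 32.847, 21.413, 39.215, 16.452),
  FGM         = c(2, 4, 3, 0, 6, 1)
)

rates <- toy %>%
  group_by(PLAYER_NAME) %>%
  slice_max(game_date, n = 2) %>%              # each player's two latest games
  summarise(
    mean_of_rates = mean(FGM / MIN * 5),       # what the answer computes
    pooled_rate   = sum(FGM) / sum(MIN) * 5,   # weights games by minutes played
    .groups = "drop"
  )

rates
```

For Derrick White the two figures are about 0.152 and 0.132, so the choice matters when minutes vary a lot between games.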

Answer 3

Score: 1

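library(dplyr)

# Assign to a new name so the original df is not overwritten
avg_fgm <- df %>%
  group_by(PLAYER_NAME) %>%
  arrange(desc(game_date)) %>%    # most recent games first
  filter(row_number() <= 2) %>%   # keep each player's two latest games
  summarise(avg_FGM_5MINS = mean(FGM / (MIN / 5)))

avg_fgm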

Output

   PLAYER_NAME    avg_FGM_5MINS
   <chr>                  <dbl>
 1 Aaron Nesmith          0    
 2 Al Horford             0.611
 3 Andre Iguodala         0    
 4 Andrew Wiggins         1.11 
 5 Damion Lee             0    
 6 Derrick White          0.152
 7 Draymond Green         0.513
 8 Gary Payton II         0.826
 9 Grant Williams         0.313
10 Jaylen Brown           0.966
# ℹ 17 more rows
# ℹ Use `print(n = ...)` to see more rows
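
Some players in the question's data appear in only one game (Damion Lee, for example), so their "average over the last two games" rests on a single row. A minimal sketch of how that could be made visible is to carry a game count alongside the average. The names toy, with_counts, and n_games are illustrative only; the rows are a subset of the question's data.

```r
library(dplyr)

# Toy subset: one player with three games, one player with a single game
toy <- tibble(
  game_date   = as.Date(c("2022-06-10", "2022-06-13", "2022-06-16", "2022-06-13")),
  PLAYER_NAME = c("Al Horford", "Al Horford", "Al Horford", "Damion Lee"),
  MIN         = c(28.083, 32.847, 39.215, 1.317),
  FGM         = c(2, 3, 6, 0)
)

with_counts <- toy %>%
  group_by(PLAYER_NAME) %>%
  slice_max(game_date, n = 2) %>%      # up to two latest games per player
  summarise(
    n_games       = n(),               # how many games the average is based on
    avg_FGM_5MINS = mean(FGM / (MIN / 5)),
    .groups = "drop"
  )

with_counts
```

Rows with n_games below 2 can then be dropped with filter(n_games == 2) or kept and flagged, depending on what the analysis needs.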

huangapple
  • Posted on 2023-05-17 20:33:20
  • Please keep this link when reposting: https://go.coder-hub.com/76272149.html