Run a calculation for a range of consecutive columns in dataframe

Question

In the dataframe below, I compute each player's average FGM over their two most recent games, normalized to 5 minutes, using:

df %>% 
  arrange(game_date) %>%
  slice(tail(row_number(), 2), .by = PLAYER_NAME) %>%
  summarize(FGM = mean(FGM/MIN*5), .by = PLAYER_NAME)

What if I want to apply the same calculation to a range of columns in my dataframe rather than only FGM? Here, for example, I want to automate the process to compute the same value for FGA. My real dataset has more columns, but I think that looping over a column range, for example 4:5, would work. Each result column should keep the name of its original column.

df<-structure(list(game_date = structure(c(19153, 19153, 19156, 19156, 
19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 
19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 
19156, 19156, 19156, 19156, 19156, 19156, 19156, 19159, 19159, 
19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159, 
19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159, 
19159, 19159), class = "Date"), MIN = c(12.533, 10.067, 3.933, 
32.847, 37.13, 4.72, 39.625, 34.983, 14.617, 1.317, 39.703, 42.533, 
16.75, 44.05, 26.155, 1.317, 44.417, 21.413, 1.317, 30.237, 1.317, 
14.287, 16.067, 1.317, 1.317, 4.683, 1.317, 1.317, 1.317, 1.017, 
39.215, 39.918, 41.302, 41.817, 13.05, 1.05, 38.483, 43.682, 
21.667, 44, 19.767, 40.21, 16.452, 1.05, 32.623, 17.782, 15.85, 
1.017, 1.05, 7.95, 1.05), FGM = c(1, 0, 0, 3, 7, 0, 7, 3, 1, 
0, 7, 12, 1, 5, 6, 0, 10, 0, 1, 4, 0, 4, 1, 0, 0, 0, 0, 0, 0, 
0, 6, 12, 5, 5, 2, 0, 4, 7, 0, 12, 2, 6, 1, 0, 4, 5, 1, 0, 0, 
0, 0), FGA = c(2, 2, 0, 6, 22, 0, 14, 6, 3, 0, 15, 23, 2, 18, 
8, 1, 20, 4, 1, 5, 0, 8, 2, 0, 1, 3, 1, 0, 0, 0, 8, 21, 20, 10, 
3, 1, 12, 18, 2, 23, 6, 18, 6, 0, 8, 12, 2, 0, 0, 2, 0), PLAYER_NAME = c("Grant Williams", 
"Payton Pritchard", "Andre Iguodala", "Al Horford", "Stephen Curry", 
"Nemanja Bjelica", "Klay Thompson", "Draymond Green", "Otto Porter Jr.", 
"Nik Stauskas", "Marcus Smart", "Andrew Wiggins", "Kevon Looney", 
"Jaylen Brown", "Gary Payton II", "Damion Lee", "Jayson Tatum", 
"Derrick White", "Luke Kornet", "Robert Williams III", "Juan Toscano-Anderson", 
"Jordan Poole", "Grant Williams", "Juwan Morgan", "Aaron Nesmith", 
"Payton Pritchard", "Jonathan Kuminga", "Moses Moody", "Sam Hauser", 
"Andre Iguodala", "Al Horford", "Stephen Curry", "Klay Thompson", 
"Draymond Green", "Otto Porter Jr.", "Nik Stauskas", "Marcus Smart", 
"Andrew Wiggins", "Kevon Looney", "Jaylen Brown", "Gary Payton II", 
"Jayson Tatum", "Derrick White", "Luke Kornet", "Robert Williams III", 
"Jordan Poole", "Grant Williams", "Juwan Morgan", "Aaron Nesmith", 
"Payton Pritchard", "Sam Hauser")), row.names = c(NA, -51L), class = c("tbl_df", 
"tbl", "data.frame"))

Answer 1

Score: 1

Could you please try the code below:

df %>%
  arrange(game_date) %>%
  group_by(PLAYER_NAME) %>%
  slice(tail(row_number(), 2)) %>%
  mutate(across(c(FGM, FGA), ~ mean(.x / MIN * 5))) %>%
  slice_tail(n = 1)
# Note: the printed output below has 49 rows (up to two per player), so it
# corresponds to the pipeline before the final slice_tail(n = 1); with that
# step included, one row per player remains.
# A tibble: 49 × 5
# Groups:   PLAYER_NAME [27]
   game_date    MIN   FGM   FGA PLAYER_NAME   
   <date>     <dbl> <dbl> <dbl> <chr>         
 1 2022-06-13  1.32 0     1.90  Aaron Nesmith 
 2 2022-06-16  1.05 0     1.90  Aaron Nesmith 
 3 2022-06-13 32.8  0.611 0.967 Al Horford    
 4 2022-06-16 39.2  0.611 0.967 Al Horford    
 5 2022-06-13  3.93 0     0     Andre Iguodala
 6 2022-06-16  1.02 0     0     Andre Iguodala
 7 2022-06-13 42.5  1.11  2.38  Andrew Wiggins
 8 2022-06-16 43.7  1.11  2.38  Andrew Wiggins
 9 2022-06-13  1.32 0     3.80  Damion Lee    
10 2022-06-13 21.4  0.152 1.38  Derrick White 
# … with 39 more rows
# ℹ Use `print(n = ...)` to see more rows
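
Because FGM and FGA sit next to each other, the columns can also be picked as a consecutive range by name inside across(), and summarise() can be used instead of mutate() to get one row per player directly. The following is only a minimal sketch of that idea, not part of the answer above: the toy data frame and its values are made up for illustration, and only the column layout follows the question.

```r
library(dplyr)

# Toy data (hypothetical values); same column layout as the question's df
toy <- tibble(
  game_date = as.Date(c("2022-06-10", "2022-06-13", "2022-06-16",
                        "2022-06-13", "2022-06-16")),
  MIN = c(10, 20, 40, 5, 10),
  FGM = c(1, 2, 8, 0, 1),
  FGA = c(2, 4, 16, 1, 3),
  PLAYER_NAME = c("A", "A", "A", "B", "B")
)

res <- toy %>%
  arrange(game_date) %>%             # oldest game first
  group_by(PLAYER_NAME) %>%
  slice_tail(n = 2) %>%              # keep each player's two latest games
  # FGM:FGA selects every column from FGM through FGA; names are preserved
  summarise(across(FGM:FGA, ~ mean(.x / MIN * 5)))

print(res)
```

With more statistic columns, the range would simply be widened (first_column:last_column), as long as the columns are adjacent and MIN is not inside the range.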

Answer 2

Score: 1

One option could be:

df %>%
  arrange(game_date) %>%
  group_by(PLAYER_NAME) %>%
  summarise(across(c(FGM, FGA), ~ mean((. / MIN * 5)[row_number() >= (n() - 1)])))

   PLAYER_NAME      FGM   FGA
   <chr>          <dbl> <dbl>
 1 Aaron Nesmith  0     1.90 
 2 Al Horford     0.611 0.967
 3 Andre Iguodala 0     0    
 4 Andrew Wiggins 1.11  2.38 
 5 Damion Lee     0     3.80 
 6 Derrick White  0.152 1.38 
 7 Draymond Green 0.513 1.03 
 8 Gary Payton II 0.826 1.52 
 9 Grant Williams 0.313 0.627
10 Jaylen Brown   0.966 2.33
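
If the columns need to be addressed by position, as the question suggests, one cautious way is to turn the positions into names first and pass them to across() through all_of(). This is a sketch under assumptions: the helper last_games_rate() and its parameters n_games and per_min are hypothetical names introduced here, and the toy data is invented; only the column layout mirrors the question (where FGM and FGA are columns 3 and 4).

```r
library(dplyr)

# Toy data (hypothetical values); same column layout as the question's df
toy <- tibble(
  game_date = as.Date(c("2022-06-10", "2022-06-13", "2022-06-16",
                        "2022-06-13", "2022-06-16")),
  MIN = c(10, 20, 40, 5, 10),
  FGM = c(1, 2, 8, 0, 1),
  FGA = c(2, 4, 16, 1, 3),
  PLAYER_NAME = c("A", "A", "A", "B", "B")
)

# Hypothetical helper: per-player mean of the per-`per_min`-minute rate over
# the last `n_games` games, for the columns named in `cols`
last_games_rate <- function(data, cols, n_games = 2, per_min = 5) {
  data %>%
    arrange(game_date) %>%
    group_by(PLAYER_NAME) %>%
    slice_tail(n = n_games) %>%
    summarise(across(all_of(cols), ~ mean(.x / MIN * per_min)),
              .groups = "drop")
}

# Resolve positions to names on the ungrouped data, so the positions mean
# exactly what they mean in names(toy)
stat_cols <- names(toy)[3:4]   # "FGM" "FGA"
res <- last_games_rate(toy, stat_cols)

print(res)
```

Resolving positions to names up front avoids any doubt about how positions are counted once the data is grouped, and the result columns keep their original names.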

huangapple
  • Posted on May 17, 2023 at 21:18:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76272556.html