Run a calculation for a range of consecutive columns in dataframe
Question
In the data frame below, I compute each player's average FGM over their two most recent games, normalized to 5 minutes of play, using:
library(dplyr)

df %>%
  arrange(game_date) %>%
  slice(tail(row_number(), 2), .by = PLAYER_NAME) %>%
  summarize(FGM = mean(FGM / MIN * 5), .by = PLAYER_NAME)
How can I apply the same calculation to a range of columns rather than only FGM? Here, for example, I want to automate the process so that it also computes the same value for FGA. My real dataset has more columns, but I think a loop over a column range such as 4:5 would work. The column names should stay the same as the originals.
df<-structure(list(game_date = structure(c(19153, 19153, 19156, 19156,
19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156,
19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156, 19156,
19156, 19156, 19156, 19156, 19156, 19156, 19156, 19159, 19159,
19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159,
19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159, 19159,
19159, 19159), class = "Date"), MIN = c(12.533, 10.067, 3.933,
32.847, 37.13, 4.72, 39.625, 34.983, 14.617, 1.317, 39.703, 42.533,
16.75, 44.05, 26.155, 1.317, 44.417, 21.413, 1.317, 30.237, 1.317,
14.287, 16.067, 1.317, 1.317, 4.683, 1.317, 1.317, 1.317, 1.017,
39.215, 39.918, 41.302, 41.817, 13.05, 1.05, 38.483, 43.682,
21.667, 44, 19.767, 40.21, 16.452, 1.05, 32.623, 17.782, 15.85,
1.017, 1.05, 7.95, 1.05), FGM = c(1, 0, 0, 3, 7, 0, 7, 3, 1,
0, 7, 12, 1, 5, 6, 0, 10, 0, 1, 4, 0, 4, 1, 0, 0, 0, 0, 0, 0,
0, 6, 12, 5, 5, 2, 0, 4, 7, 0, 12, 2, 6, 1, 0, 4, 5, 1, 0, 0,
0, 0), FGA = c(2, 2, 0, 6, 22, 0, 14, 6, 3, 0, 15, 23, 2, 18,
8, 1, 20, 4, 1, 5, 0, 8, 2, 0, 1, 3, 1, 0, 0, 0, 8, 21, 20, 10,
3, 1, 12, 18, 2, 23, 6, 18, 6, 0, 8, 12, 2, 0, 0, 2, 0), PLAYER_NAME = c("Grant Williams",
"Payton Pritchard", "Andre Iguodala", "Al Horford", "Stephen Curry",
"Nemanja Bjelica", "Klay Thompson", "Draymond Green", "Otto Porter Jr.",
"Nik Stauskas", "Marcus Smart", "Andrew Wiggins", "Kevon Looney",
"Jaylen Brown", "Gary Payton II", "Damion Lee", "Jayson Tatum",
"Derrick White", "Luke Kornet", "Robert Williams III", "Juan Toscano-Anderson",
"Jordan Poole", "Grant Williams", "Juwan Morgan", "Aaron Nesmith",
"Payton Pritchard", "Jonathan Kuminga", "Moses Moody", "Sam Hauser",
"Andre Iguodala", "Al Horford", "Stephen Curry", "Klay Thompson",
"Draymond Green", "Otto Porter Jr.", "Nik Stauskas", "Marcus Smart",
"Andrew Wiggins", "Kevon Looney", "Jaylen Brown", "Gary Payton II",
"Jayson Tatum", "Derrick White", "Luke Kornet", "Robert Williams III",
"Jordan Poole", "Grant Williams", "Juwan Morgan", "Aaron Nesmith",
"Payton Pritchard", "Sam Hauser")), row.names = c(NA, -51L), class = c("tbl_df",
"tbl", "data.frame"))
Answer 1
Score: 1
You could try the code below:
library(dplyr)

df %>%
  arrange(game_date) %>%
  group_by(PLAYER_NAME) %>%
  slice(tail(row_number(), 2)) %>%                        # keep each player's two most recent games
  mutate(across(c(FGM, FGA), ~ mean(.x / MIN * 5))) %>%   # per-5-minute average, same column names
  slice_tail(n = 1)                                       # keep one row per player
The printed output corresponds to the pipeline without the final slice_tail(n = 1) step (49 rows, up to two per player); with that step, only one row per player is kept.
# A tibble: 49 × 5
# Groups:   PLAYER_NAME [27]
   game_date    MIN   FGM   FGA PLAYER_NAME
   <date>     <dbl> <dbl> <dbl> <chr>
 1 2022-06-13  1.32 0     1.90  Aaron Nesmith
 2 2022-06-16  1.05 0     1.90  Aaron Nesmith
 3 2022-06-13 32.8  0.611 0.967 Al Horford
 4 2022-06-16 39.2  0.611 0.967 Al Horford
 5 2022-06-13  3.93 0     0     Andre Iguodala
 6 2022-06-16  1.02 0     0     Andre Iguodala
 7 2022-06-13 42.5  1.11  2.38  Andrew Wiggins
 8 2022-06-16 43.7  1.11  2.38  Andrew Wiggins
 9 2022-06-13  1.32 0     3.80  Damion Lee
10 2022-06-13 21.4  0.152 1.38  Derrick White
# … with 39 more rows
# ℹ Use `print(n = ...)` to see more rows
Answer 2
Score: 1
One option could be:
df %>%
  arrange(game_date) %>%
  group_by(PLAYER_NAME) %>%
  summarise(across(c(FGM, FGA), ~ mean((. / MIN * 5)[row_number() >= (n() - 1)])))
   PLAYER_NAME      FGM   FGA
   <chr>          <dbl> <dbl>
 1 Aaron Nesmith  0     1.90
 2 Al Horford     0.611 0.967
 3 Andre Iguodala 0     0
 4 Andrew Wiggins 1.11  2.38
 5 Damion Lee     0     3.80
 6 Derrick White  0.152 1.38
 7 Draymond Green 0.513 1.03
 8 Gary Payton II 0.826 1.52
 9 Grant Williams 0.313 0.627
10 Jaylen Brown   0.966 2.33
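Since the question asks about a range of consecutive columns, the same idea can also be written with a name range such as FGM:FGA inside across(), which avoids listing every column. The sketch below is only an illustration under a few assumptions: `toy` and `per5` are hypothetical names, the toy values are made up so the result is easy to check by hand, and dplyr 1.1.0 or later is assumed for the `by`/`.by` arguments. In the question's `df` the columns are game_date, MIN, FGM, FGA, PLAYER_NAME, so a name range is less fragile than a positional one like 4:5.

```r
library(dplyr)

# Hypothetical toy data with the same column layout as `df` in the question
toy <- tibble(
  game_date   = as.Date(c("2022-06-10", "2022-06-13", "2022-06-16",
                          "2022-06-13", "2022-06-16")),
  MIN         = c(10, 20, 40, 5, 10),
  FGM         = c(5, 2, 4, 1, 2),
  FGA         = c(10, 4, 8, 2, 2),
  PLAYER_NAME = c("A", "A", "A", "B", "B")
)

per5 <- toy %>%
  arrange(game_date) %>%
  # keep each player's two most recent games
  slice_tail(n = 2, by = PLAYER_NAME) %>%
  # FGM:FGA selects every column from FGM through FGA;
  # across() keeps the original column names in the result
  summarise(across(FGM:FGA, ~ mean(.x / MIN * 5)), .by = PLAYER_NAME)

print(per5)
```

For player A the oldest game is dropped, so FGM is mean(2/20*5, 4/40*5) = 0.5 and FGA is 1; for player B, FGM is 1 and FGA is 1.5. With more stat columns, widening the range (first_stat:last_stat) should be the only change needed, provided the columns are adjacent.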