Why is the purrr::map function not correctly mapping a function to each piece of a split dataframe?

Question


I have the following dataframe that we can call df_all

    df_all <- structure(list(ID = c("1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385"
    ), Book = c("Bovada", "Bovada", "LowVig.ag", "LowVig.ag"), Home = c("Alabama Crimson Tide",
    "Alabama Crimson Tide", "Alabama Crimson Tide", "Alabama Crimson Tide"
    ), Away = c("San Diego St Aztecs", "San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs"), Team = c("Alabama Crimson Tide", "San Diego St Aztecs",
    "Alabama Crimson Tide", "San Diego St Aztecs"), Price = c(-110,
    -110, -111, -101), Points = c(-7.5, 7.5, -7, 7)), row.names = c(NA,
    -4L), class = c("tbl_df", "tbl", "data.frame"))

and I have the following dataframe that we can call df_alt

    df_alt <- structure(list(ID = c("1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385",
    "1738c0c7214e7fced61c1caa479a5385", "1738c0c7214e7fced61c1caa479a5385"
    ), Book = c("Pinnacle", "Pinnacle", "Pinnacle", "Pinnacle", "Pinnacle",
    "Pinnacle", "Pinnacle", "Pinnacle", "Pinnacle", "Pinnacle", "Pinnacle",
    "Pinnacle", "Pinnacle", "Pinnacle", "Pinnacle", "Pinnacle", "Pinnacle",
    "Pinnacle"), Home = c("Alabama Crimson Tide", "Alabama Crimson Tide",
    "Alabama Crimson Tide", "Alabama Crimson Tide", "Alabama Crimson Tide",
    "Alabama Crimson Tide", "Alabama Crimson Tide", "Alabama Crimson Tide",
    "Alabama Crimson Tide", "Alabama Crimson Tide", "Alabama Crimson Tide",
    "Alabama Crimson Tide", "Alabama Crimson Tide", "Alabama Crimson Tide",
    "Alabama Crimson Tide", "Alabama Crimson Tide", "Alabama Crimson Tide",
    "Alabama Crimson Tide"), Away = c("San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs", "San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs", "San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs", "San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs", "San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs", "San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs"), Team = c("Alabama Crimson Tide", "Alabama Crimson Tide",
    "Alabama Crimson Tide", "Alabama Crimson Tide", "Alabama Crimson Tide",
    "Alabama Crimson Tide", "Alabama Crimson Tide", "Alabama Crimson Tide",
    "San Diego St Aztecs", "San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs", "San Diego St Aztecs", "San Diego St Aztecs",
    "San Diego St Aztecs", "San Diego St Aztecs", "Alabama Crimson Tide",
    "San Diego St Aztecs"), Price = c(-149, -138, -126, -115, 105,
    114, 122, 132, 128, 119, 110, 102, -119, -131, -142, -154, -104,
    -108), Points = c(-5.5, -6, -6.5, -7, -8, -8.5, -9, -9.5, 5.5,
    6, 6.5, 7, 8, 8.5, 9, 9.5, -7.5, 7.5)), row.names = c(NA, -18L
    ), class = c("tbl_df", "tbl", "data.frame"))

I have the following function which looks for common/intersecting Points values between df_all and df_alt.

    int_value <- function(df){
      df %>%
        dplyr::select(c(ID, Team, Points)) %>%
        dplyr::intersect(df_alt %>% dplyr::select(c(ID, Team,Points))) %>%
        mutate(Book = 'Pinnacle')
      df %>% full_join(df_int)%>% left_join(df_alt %>% rename(price=Price)) %>%
        mutate(Price=ifelse(is.na(price),Price,price))%>%
        select(-price)
    }

I am trying to apply int_value using the following map syntax.

    df_all %>%
      group_split(ID, Book) %>%
      map(int_value)

This is the output that is returned, which is not the desired output.

    [[1]]
    # A tibble: 8 × 7
      ID                               Book      Home                 Away                Team                 Price Points
      <chr>                            <chr>     <chr>                <chr>               <chr>                <dbl>  <dbl>
    1 1738c0c7214e7fced61c1caa479a5385 Bovada    Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -110   -7.5
    2 1738c0c7214e7fced61c1caa479a5385 Bovada    Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -110    7.5
    3 1738c0c7214e7fced61c1caa479a5385 LowVig.ag Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -111     -7
    4 1738c0c7214e7fced61c1caa479a5385 LowVig.ag Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -101      7
    5 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -104   -7.5
    6 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -108    7.5
    7 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -115     -7
    8 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs    102      7

    [[2]]
    # A tibble: 8 × 7
      ID                               Book      Home                 Away                Team                 Price Points
      <chr>                            <chr>     <chr>                <chr>               <chr>                <dbl>  <dbl>
    1 1738c0c7214e7fced61c1caa479a5385 Bovada    Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -110   -7.5
    2 1738c0c7214e7fced61c1caa479a5385 Bovada    Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -110    7.5
    3 1738c0c7214e7fced61c1caa479a5385 LowVig.ag Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -111     -7
    4 1738c0c7214e7fced61c1caa479a5385 LowVig.ag Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -101      7
    5 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -104   -7.5
    6 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -108    7.5
    7 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -115     -7
    8 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs    102      7

This is the desired output and what I expected to be returned.

    [[1]]
    # A tibble: 6 × 7
      ID                               Book      Home                 Away                Team                 Price Points
      <chr>                            <chr>     <chr>                <chr>               <chr>                <dbl>  <dbl>
    1 1738c0c7214e7fced61c1caa479a5385 Bovada    Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -110   -7.5
    2 1738c0c7214e7fced61c1caa479a5385 Bovada    Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -110    7.5
    3 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -104   -7.5
    4 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -108    7.5

    [[2]]
    # A tibble: 6 × 7
      ID                               Book      Home                 Away                Team                 Price Points
      <chr>                            <chr>     <chr>                <chr>               <chr>                <dbl>  <dbl>
    1 1738c0c7214e7fced61c1caa479a5385 LowVig.ag Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -111     -7
    2 1738c0c7214e7fced61c1caa479a5385 LowVig.ag Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -101      7
    3 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -115     -7
    4 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs    102      7

The map function doesn't appear to be honoring the implied group_by based on the Book column. What am I missing?
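
One way to narrow this down is to check whether the split itself is at fault before looking at the mapped function. The sketch below uses a small made-up tibble (toy is a hypothetical stand-in, not the data above) and assumes dplyr and purrr are installed. If every piece contains exactly one Book, then group_split() and map() are doing their job and the problem is more likely inside the function being mapped.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for df_all, reduced to the columns that matter here
toy <- tibble(
  ID     = "x",
  Book   = c("Bovada", "Bovada", "LowVig.ag", "LowVig.ag"),
  Points = c(-7.5, 7.5, -7, 7)
)

# One tibble per (ID, Book) combination
pieces <- toy %>% group_split(ID, Book)

# Number of pieces, and how many distinct Book values each piece holds
n_pieces        <- length(pieces)
books_per_piece <- map_int(pieces, ~ n_distinct(.x$Book))

n_pieces
books_per_piece
```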

Answer 1

Score: 1


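The solution was what @stefan had recommended. In the original function the result of the first pipeline was never assigned, so the df_int used in full_join() was not the intersection for the current piece; presumably it was a df_int that already existed outside the function. After assigning df_int inside the function (and adding Home and Away to the selected columns), the output is accurate. Here is the updated function: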

    int_value <- function(df){
      # Rows of this piece whose ID/Home/Away/Team/Points also appear in df_alt
      df_int <- df %>%
        dplyr::select(c(ID, Home, Away, Team, Points)) %>%
        dplyr::intersect(df_alt %>% dplyr::select(c(ID, Home, Away, Team, Points))) %>%
        mutate(Book = 'Pinnacle')
      # Append the matching rows as Pinnacle rows (their Price is NA at this point)
      df_join <- df %>% full_join(df_int)
      # Fill in the Pinnacle prices from df_alt; keep the original Price elsewhere
      df_final <- df_join %>% left_join(df_alt %>% rename(price=Price)) %>%
        mutate(Price=ifelse(is.na(price),Price,price))%>%
        select(-price)
      df_final
    }

And here is the updated output

    [[1]]
    # A tibble: 4 × 7
      ID                               Book      Home                 Away                Team                 Price Points
      <chr>                            <chr>     <chr>                <chr>               <chr>                <dbl>  <dbl>
    1 1738c0c7214e7fced61c1caa479a5385 Bovada    Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -110   -7.5
    2 1738c0c7214e7fced61c1caa479a5385 Bovada    Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -110    7.5
    3 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -104   -7.5
    4 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -108    7.5

    [[2]]
    # A tibble: 4 × 7
      ID                               Book      Home                 Away                Team                 Price Points
      <chr>                            <chr>     <chr>                <chr>               <chr>                <dbl>  <dbl>
    1 1738c0c7214e7fced61c1caa479a5385 LowVig.ag Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -111     -7
    2 1738c0c7214e7fced61c1caa479a5385 LowVig.ag Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs   -101      7
    3 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs Alabama Crimson Tide  -115     -7
    4 1738c0c7214e7fced61c1caa479a5385 Pinnacle  Alabama Crimson Tide San Diego St Aztecs San Diego St Aztecs    102      7
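
The underlying pitfall can be reproduced without dplyr. The following is a minimal base R sketch with made-up data; toy, extra, buggy and fixed are hypothetical names, not part of the code above. It shows how an unassigned expression inside a function is discarded and how a name that is not defined locally is looked up outside the function, so every piece ends up combined with the same leftover object.

```r
# Hypothetical toy data, split into one data frame per group
toy <- data.frame(grp = c("a", "a", "b"), val = c(1, 2, 3),
                  stringsAsFactors = FALSE)
pieces <- split(toy, toy$grp)

# A leftover object in the global environment, e.g. from an earlier
# interactive run
extra <- data.frame(grp = "stale", val = 99, stringsAsFactors = FALSE)

buggy <- function(df) {
  df[df$val > 1, ]    # computed but never assigned, so the result is discarded
  rbind(df, extra)    # `extra` is not local: R finds the global one
}

fixed <- function(df) {
  extra <- df[df$val > 1, ]   # assigned locally, so it reflects the current piece
  rbind(df, extra)
}

buggy_out <- lapply(pieces, buggy)
fixed_out <- lapply(pieces, fixed)

# Does each result contain the stale row?
has_stale <- sapply(buggy_out, function(d) "stale" %in% d$grp)
# Does each result contain rows from a single group only?
own_group <- sapply(fixed_out, function(d) all(d$grp == d$grp[1]))

has_stale
own_group
```

Passing the lookup table as an explicit argument (for example a second parameter holding df_alt) would remove the remaining dependence on global state, at the cost of a slightly longer map() call.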

Posted by huangapple on 2023-03-23 11:32:27
When reposting, please keep the link to this article: https://go.coder-hub.com/75819037.html