rvest: continue navigating after submitting a form
Question
Suppose I want to use rvest to search Google. I can do that using the code below.
url <- 'https://www.google.com/'
search_parameters <- list('q' = 'dogs')
search_results <-
  rvest::session(url) |>
  rvest::html_form() |>
  purrr::pluck(1) |>
  rvest::html_form_set(!!!search_parameters) |>
  rvest::html_form_submit()
#> Submitting with 'btnG'
search_results$status_code
#> [1] 200
However, I can't figure out how to navigate to the first link of the results because html_form_submit() doesn't return a session object.
search_parameters |>
rvest::session_follow_link(1)
#> Error in `check_session()`:
#> ! `x` must be produced by session()
#> Backtrace:
#> x
#> 1. \-rvest::session_follow_link(search_parameters, 1)
#> 2. \-rvest:::check_session(x)
#> 3. \-rlang::abort("`x` must be produced by session()")
I know I could just create a new session for the example above, but that doesn't work if I need to log in to a site first. Is there a way to use the same session object to continue navigating?
Answer 1
Score: 1
You are probably looking for session_submit(), which submits the form and returns an updated session:
url <- 'https://www.google.com/'
search_parameters <- list('q' = 'dogs')
s <- rvest::session(url)
# `_` is the native-pipe placeholder (requires R >= 4.2)
s <-
  rvest::html_form(s) |>
  purrr::pluck(1) |>
  rvest::html_form_set(!!!search_parameters) |>
  rvest::session_submit(s, form = _)
#> Submitting with 'btnG'
s |>
rvest::session_follow_link(1)
#> Navigating to
#> https://accounts.google.com/ServiceLogin?...
#> <session> https://accounts.google.com/v3/signin/identifier?...
#> Status: 200
#> Type: text/html; charset=utf-8
#> Size: 555260
Created on 2023-06-01 with reprex v2.0.2
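For the log-in case mentioned in the question, the same pattern should carry over. Below is a minimal sketch, not a tested recipe: `login_and_follow()` is a hypothetical helper name, and it assumes the log-in form is the first form on the page and that the names in `credentials` match the form's field names (inspect the output of html_form() to confirm both).

```r
# Hypothetical helper: log in through the first form on `login_url`,
# then keep browsing with the same session object.
login_and_follow <- function(login_url, credentials, link = 1) {
  s <- rvest::session(login_url)

  # Assumption: the log-in form is the first form on the page, and the
  # names in `credentials` match its field names.
  form <- rvest::html_form(s)[[1]]
  form <- rvest::html_form_set(form, !!!credentials)

  # session_submit() returns an updated session, unlike
  # html_form_submit(), which returns the raw response.
  s <- rvest::session_submit(s, form)

  # Continue navigating from the page reached after submitting.
  rvest::session_follow_link(s, i = link)
}
```

A call might look like `login_and_follow("https://example.com/login", list(username = "me", password = "secret"))`, where the URL and field names are placeholders. Because every step reassigns or returns a session, any cookies set at log-in stay attached to later requests.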