Question

I'm trying to get a list of product links from each page I scrape. I can get the required data from page 1, but I'm having a hard time extending this to the other pages. Can anyone point me in the right direction, please?

from requests_html import HTMLSession

s = HTMLSession()
def get_product_links(page):
  url = "https://lakesshoweringspaces.com/catalogue-product-filter/page/{page}"

  r = s.get(url)
  products = r.html.find("article.contentwrapper section.collection-wrapper-item")
  for item in products:
      res = links.append(item.find("a", first=True))
      if res:
        print(res.attrs["href"])
      else:
        print("no match")
  return links

page1 = get_product_links(1)
print(page1)

Answer 1

Score: 0

I seem to have got this working:

from requests_html import HTMLSession

s = HTMLSession()

def get_product_links(page):
  # The f prefix is what substitutes the page number into the URL.
  url = f'https://lakesshoweringspaces.com/catalogue-product-filter/page/{page}'
  links = []
  r = s.get(url)
  products = r.html.find('article.contentwrapper section')
  for item in products:
    # list.append() returns None, so look up the anchor first
    # and only then store its href.
    q = item.find("a", first=True)
    if q and "href" in q.attrs:
      links.append(q.attrs["href"])
    else:
      print("no match")
  return links


test_link = 'https://lakesshoweringspaces.com/catalogue_product/alassio/'

r = s.get(test_link)

print(r.html.find('div.product-sidecontent h3', first=True).text.strip())
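
To cover the "other pages" part of the question, the function above could be called in a loop. The following is only a sketch under some assumptions: the helper names (build_page_url, collect_links, fetch_links_with_requests_html) are made up for illustration, and it assumes that a page past the end of the catalogue yields no matching links, which I have not verified against the site. The fetch step is passed in as a function so the loop can be tried without hitting the network.

```python
from typing import Callable, Iterable, List

# URL pattern taken from the question; {page} is filled in by str.format().
BASE_URL = "https://lakesshoweringspaces.com/catalogue-product-filter/page/{page}"


def build_page_url(page: int) -> str:
    """Return the listing URL for one page number."""
    return BASE_URL.format(page=page)


def collect_links(pages: Iterable[int],
                  fetch_links: Callable[[str], List[str]]) -> List[str]:
    """Gather links from several pages, keeping order and dropping duplicates.

    Stops at the first page that returns no links (assumed end of catalogue).
    """
    all_links: List[str] = []
    seen = set()
    for page in pages:
        page_links = fetch_links(build_page_url(page))
        if not page_links:
            break
        for link in page_links:
            if link not in seen:
                seen.add(link)
                all_links.append(link)
    return all_links


def fetch_links_with_requests_html(url: str) -> List[str]:
    """Hypothetical fetcher using the selector from the answer above."""
    # Imported here so the rest of the sketch runs without the package.
    from requests_html import HTMLSession

    session = HTMLSession()
    r = session.get(url)
    links = []
    for item in r.html.find("article.contentwrapper section"):
        a = item.find("a", first=True)
        if a is not None and "href" in a.attrs:
            links.append(a.attrs["href"])
    return links


# Example call (performs real HTTP requests, so it is left commented out):
# links = collect_links(range(1, 6), fetch_links_with_requests_html)
# print(links)
```

Opening a new session for every page keeps the sketch short; reusing a single HTMLSession, as the answer does with s, would be the more economical choice for many pages.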
