February 10, 2023 04:34:08

Why am I getting this error? AttributeError: 'NoneType' object has no attribute 'attrs'

Question

I am trying to scrape all of the href attributes from the page in the code below.

The problem is that I get the first link, but then I get an error. Can anyone show me how to fix this, please? I am still learning Python.

Many thanks

from requests_html import HTMLSession

s = HTMLSession()

url = 'https://lakesshoweringspaces.com/catalogue-product-filter/page/1'

r = s.get(url)

products = r.html.find('article.contentwrapper section')

for item in products:
  print(item.find('a', first=True).attrs['href'])

Answer 1

Score: 0

To expand on Brian's comment, your code assumes that item.find('a', first=True) will successfully find an element. If it doesn't, that code returns None, and then you're asking for None.attrs['href'] (and None doesn't have an attrs attribute, hence the error message).
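The failure can be reproduced without touching the network. The sketch below is only an illustration: it assumes, as described above, that find(..., first=True) hands back None on a miss, and the names result and message are made up for the example.

```python
# Stand-in for what item.find('a', first=True) returns when nothing matches.
result = None

try:
    href = result.attrs["href"]
except AttributeError as exc:
    # None has no 'attrs' attribute, so the lookup fails
    # before ['href'] is ever evaluated.
    message = str(exc)

print(message)
```

The printed text is the same message the question reports, which points at a missing element rather than a missing href.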

If we rewrite your code to actually check the return value of the find method:

from requests_html import HTMLSession

s = HTMLSession()

url = "https://lakesshoweringspaces.com/catalogue-product-filter/page/1"

r = s.get(url)

products = r.html.find("article.contentwrapper section")

for item in products:
    res = item.find("a", first=True)
    if res:
        print(res.attrs["href"])
    else:
        print("no match")

Then we find that it fails to find an <a> element in every other loop iteration:

https://.../catalogue_product/alassio/?
no match
https://.../catalogue_product/amare/?
no match
https://.../catalogue_product/ambient/?
no match
https://.../catalogue_product/andora/?
no match
https://.../catalogue_product/antigua/?
no match
https://.../catalogue_product/aruba/?
no match
https://.../catalogue_product/avanza/?
no match
https://.../catalogue_product/barbados/?
no match
https://.../catalogue_product/bergen-bi-fold-door/?
no match
https://.../catalogue_product/framed-bi-fold-door/?
no match
https://.../catalogue_product/semi-frameless-bi-fold-door/?
no match
https://.../catalogue_product/cannes-10mm/?
no match

And that's because your expression "article.contentwrapper section" matches both sections with class collection-wrapper-item and sections with class compare_favorites_section, the latter of which contain no <a> elements.
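One way to spot this kind of over-broad selector is to tally the classes of the sections that have no link. The helper below is a rough sketch, not part of requests_html: classes_without_link is a hypothetical name, and it assumes each section offers find("a", first=True) returning None on a miss plus an attrs dict whose "class" entry is a tuple of class names (a plain string is tolerated as well).

```python
from collections import Counter


def classes_without_link(sections):
    """Count the class tuples of sections that contain no <a> element.

    Assumption: sections behave like requests_html elements, i.e.
    find("a", first=True) returns None on a miss and attrs is a dict.
    """
    missing = Counter()
    for section in sections:
        if section.find("a", first=True) is None:
            classes = section.attrs.get("class", ())
            if isinstance(classes, str):
                # Tolerate a raw "a b c" class string as well.
                classes = classes.split()
            missing[tuple(classes)] += 1
    return missing
```

Passing it the result of r.html.find("article.contentwrapper section") should list the classes that need to be excluded from the selector.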

If you modify your code to be more selective:

from requests_html import HTMLSession

s = HTMLSession()

url = "https://lakesshoweringspaces.com/catalogue-product-filter/page/1"

r = s.get(url)

products = r.html.find("article.contentwrapper section.collection-wrapper-item")

for item in products:
    res = item.find("a", first=True)
    if res:
        print(res.attrs["href"])
    else:
        print("no match")

Then you will reliably find the links. Running the above produces:

https://.../catalogue_product/alassio/?
https://.../catalogue_product/amare/?
https://.../catalogue_product/ambient/?
https://.../catalogue_product/andora/?
https://.../catalogue_product/antigua/?
https://.../catalogue_product/aruba/?
https://.../catalogue_product/avanza/?
https://.../catalogue_product/barbados/?
https://.../catalogue_product/bergen-bi-fold-door/?
https://.../catalogue_product/framed-bi-fold-door/?
https://.../catalogue_product/semi-frameless-bi-fold-door/?
https://.../catalogue_product/cannes-10mm/?
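If the links are needed as data rather than printed, the same check can be wrapped in a small function. This is a sketch under the same assumptions as the code above: collect_hrefs is a hypothetical helper name, and each section is expected to expose find("a", first=True) and an attrs dict in the way the answer's code uses them.

```python
def collect_hrefs(sections):
    """Return the href of the first <a> in each section, skipping misses.

    Assumption: find("a", first=True) returns None when nothing matches,
    and attrs is a dict of the element's attributes.
    """
    hrefs = []
    for section in sections:
        link = section.find("a", first=True)
        # Guard against both a missing <a> and an <a> without an href.
        if link is not None and "href" in link.attrs:
            hrefs.append(link.attrs["href"])
    return hrefs
```

It could be called as collect_hrefs(r.html.find("article.contentwrapper section.collection-wrapper-item")). Testing with "is not None" instead of plain truthiness keeps the intent explicit.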
