Question

I want to scrape product names and prices from the website:
https://www.carrefouruae.com/mafuae/en/c/F21600000

import requests
from bs4 import BeautifulSoup

html = requests.get('https://www.carrefouruae.com/mafuae/en/c/F21600000')
soup = BeautifulSoup(html.content, "html5lib")
soup.findAll('ul', attrs={'class':'css-1wgjvs'})

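It returns an empty list; the fetched page source does not contain the product names. What is the reason? How can I fetch the product details from the site?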

Answer 1

Score: 2

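The code below does not parse the HTML at all. It requests the site's JSON API endpoint for the category directly and saves the response: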

import requests
import json

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/113.0',
    'appId': 'Reactweb',
    'storeId': 'mafuae',
}

def main():
    with requests.session() as req:
        req.headers.update(headers)
        params = {
            "areaCode": "Dubai Festival City - Dubai",
            "currentPage": "0",
            "depth": "3",
            "displayCurr": "AED",
            "filter": "",
            "lang": "en",
            "latitude": "25.2171003",
            "longitude": "55.3613635",
            "maxPrice": "",
            "minPrice": "",
            "needVariantsData": "true",
            "nextOffset": "",
            "pageSize": "60",
            "requireSponsProducts": "true",
            "responseWithCatTree": "true",
            "sortBy": "relevance"
        }
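        r = req.get('https://www.carrefouruae.com/api/v8/categories/F21600000', params=params)
        r.raise_for_status()  # fail early on an HTTP error instead of parsing an error body
        # Save the raw JSON response so its structure can be inspected
        with open('data.json', 'w', encoding='utf-8-sig') as f:
            json.dump(r.json(), f, indent=4)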

if __name__ == "__main__":
    main()
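
The script above only saves the raw response to data.json; the question asks for product names and prices. The layout of that JSON is not shown here, so the following is only a rough sketch: it walks the whole structure and collects every object that has both a name and a price field. The key names `name` and `price`, the helper names `find_products` and `load_products`, and the sample data are all assumptions for illustration. Inspect data.json and adjust the keys to whatever the response actually uses.

```python
import json


def find_products(node, name_key="name", price_key="price"):
    """Recursively collect dicts that contain both name_key and price_key.

    The key names are assumptions; check data.json for the real ones.
    """
    found = []
    if isinstance(node, dict):
        if name_key in node and price_key in node:
            found.append({"name": node[name_key], "price": node[price_key]})
        # Keep descending: products may be nested at any depth
        for value in node.values():
            found.extend(find_products(value, name_key, price_key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_products(item, name_key, price_key))
    return found


def load_products(path="data.json"):
    """Load the file written by the script above and extract products."""
    with open(path, encoding="utf-8-sig") as f:
        return find_products(json.load(f))


# Hypothetical data shaped like a nested API response (not the real format)
sample = {
    "data": {
        "products": [
            {"name": "Milk", "price": 6.5},
            {"name": "Bread", "price": 4.0},
        ]
    },
    "meta": {"name": "category"},  # has a name but no price, so it is skipped
}

products = find_products(sample)
```

After running the script above, `load_products()` applies the same search to the saved data.json. Searching by key rather than by a fixed path keeps the sketch independent of the exact nesting, at the cost of possibly matching unrelated objects that happen to carry both keys.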
