Scrapy selector doesn't "see" an element that is present on the webpage
Question
I want to parse the following webpage:
https://mafiaworldtour.com/tournaments/2653
And I need to find the following element:
//html/body/div[1]/div/section[2]/div/div/div/div[1]/div[1]/div/div[2]/div/div[1]/div[2]/span/text()
When I inspect the page in the browser, the element is clearly present, but
city = response.xpath('//html/body/div[1]/div/section[2]/div/div/div/div[1]/div[1]/div/div[2]/div/div[1]/div[2]/span/text()').extract_first()
returns None.
What is the reason for this?
I expect to get the tournament's city, Хайфа, Израиль, via XPath.
Answer 1
Score: 0
Using my own project, retrieveCssOrXpathSelectorFromTextOrNode, to fetch the full XPath query:
x('Хайфа, Израиль');
//body/div[@class="site-wrapper"]/div[@class="main"][@role="main"]/section[@class="page-content"]/div[@class="container"]/div[@class="tabs"]/div[@class="tab-content"]/div[@class="tab-pane fade in active "][@id="general"]/div[@class="row"]/div[@class="col-md-12"]/div[@class="table-responsive"]/div[@class="responsive-info-table"]/div[@class="row with-top-border"]/div[@class="col-md-6"]/span[@class="small_content"]
A specific, attribute-based XPath query like this is generally more robust than the positional path auto-generated by Chrome DevTools:
//html/body/div[1]/div/section[2]/div/div......
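To illustrate why an attribute-based query tends to hold up better than a positional one, here is a minimal sketch using Python's standard `xml.etree.ElementTree` as a stand-in for Scrapy's selector, so that it runs without Scrapy installed. The markup is hypothetical and heavily simplified. It assumes the raw response lacks one `section` that the browser's inspector shows, which is one common cause of this kind of mismatch but has not been confirmed for this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup as the browser's inspector might show it.
RENDERED = (
    '<html><body><div class="site-wrapper"><div class="main">'
    '<section class="page-header"></section>'
    '<section class="page-content"><div class="col-md-6">'
    '<span class="small_content"> Хайфа, Израиль</span>'
    '</div></section></div></div></body></html>'
)

# Assumption: the raw response is missing the first section.
RAW = RENDERED.replace('<section class="page-header"></section>', '')

# Positional path (relative to the <html> root) vs. attribute-based path.
POSITIONAL = "body/div[1]/div/section[2]/div/span"
BY_CLASS = ".//span[@class='small_content']"


def find_city(markup, path):
    """Return the text of the first element matching path, or None."""
    root = ET.fromstring(markup)
    return root.findtext(path)


rendered_positional = find_city(RENDERED, POSITIONAL)  # matches
raw_positional = find_city(RAW, POSITIONAL)            # no second section, so None
raw_by_class = find_city(RAW, BY_CLASS)                # still matches

print(rendered_positional, raw_positional, raw_by_class)
```

The positional path depends on every ancestor sitting at the expected index, while the class-based path only depends on the `span` itself. Running `scrapy shell` on the URL and checking `response.text` is a way to see what markup Scrapy actually received.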
You can also drop the unnecessary parts of the path. From the Chrome DevTools or Firefox console:
$x('//span[@class="small_content"]')[0].innerText
Or, in your case:
response.xpath('//span[@class="small_content"]/text()').extract_first()
Output:
" Хайфа, Израиль"
Answer 2
Score: 0
You can use either a CSS selector or XPath.
CSS selector:
response.css('.small_content::text').get()
XPath:
response.xpath('//span[@class="small_content"]/text()').get()
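Both calls return `None` when nothing matches, and the text on this page comes back with a leading space. A small helper can normalise the result. This is only a sketch: `clean_text` and its `default` parameter are hypothetical names, not part of Scrapy.

```python
from typing import Optional


def clean_text(raw: Optional[str], default: str = "") -> str:
    """Strip surrounding whitespace; return default when nothing matched."""
    if raw is None:
        return default
    return raw.strip()


# The value as extracted from the page, with its leading space.
city = clean_text(" Хайфа, Израиль")

# What happens when the selector matched nothing.
missing = clean_text(None, default="unknown")

print(repr(city), repr(missing))
```

In a spider this could wrap either selector, for example `clean_text(response.css('.small_content::text').get())`.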