How to scrape a value nested inside an attribute with colly
Question
I am trying to scrape the productId of a product with colly, but I cannot get it to work.
The HTML looks like this:
<span class="info">
<button data-product='{"merchantName":"xxx","price":"11","productName":"car window","categoryName":"windows","brandName":"aa assosiations","productId":"which I want to scrape"}'>
When I try
h.ChildAttr("span.info>button", "data-product")
the result is {"merchantName":"xxx","price":"11","productName":"car window","categoryName":"windows","brandName":"aa assosiations","productId":"which I want to scrape"}
But when I try
h.ChildAttr("span.info>button", "productId")
there is no result.
How can I get this value with colly?
Answer 1
Score: 0
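ChildAttr returns the attribute's raw string value. Here that value is a JSON document, and productId is a key inside that JSON rather than an HTML attribute of the button, which is why asking for a "productId" attribute returns nothing. You need to parse the JSON to read the field.
For example: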
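package main

import (
	"encoding/json"
	"log"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()

	c.OnHTML(`body`, func(e *colly.HTMLElement) {
		// The attribute value is a JSON string, not a set of nested HTML attributes.
		text := e.ChildAttr("span.info>button", "data-product")

		// Decode into a generic map so any key can be looked up by name.
		var result map[string]interface{}
		if err := json.Unmarshal([]byte(text), &result); err != nil {
			log.Println(err)
			return
		}
		log.Println(result["productId"])
	})

	// Visit returns an error if the request fails; report it instead of ignoring it.
	if err := c.Visit("[some url]"); err != nil {
		log.Fatal(err)
	}
}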
Output
2021/10/21 14:23:24 which I want to scrape
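If only one field is needed, decoding into a small struct instead of a map is a possible variation. The sketch below isolates the parsing step so it runs without a network request. The `product` type and the `productIDFromAttr` helper are hypothetical names introduced here for illustration, and it assumes productId is a JSON string, as in the sample from the question (a numeric productId would make the decode fail).

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// product mirrors only the field we need from the data-product JSON.
// Hypothetical type, not part of colly.
type product struct {
	ProductID string `json:"productId"`
}

// productIDFromAttr decodes the raw attribute value and returns its productId.
// Hypothetical helper, assuming productId is encoded as a JSON string.
func productIDFromAttr(raw string) (string, error) {
	var p product
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return "", fmt.Errorf("decode data-product: %w", err)
	}
	return p.ProductID, nil
}

func main() {
	// Sample value copied from the question. In a scraper this would be the
	// string returned by e.ChildAttr("span.info>button", "data-product").
	raw := `{"merchantName":"xxx","price":"11","productName":"car window","categoryName":"windows","brandName":"aa assosiations","productId":"which I want to scrape"}`

	id, err := productIDFromAttr(raw)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(id) // prints: which I want to scrape
}
```

Inside the OnHTML callback, the same helper could be called with the result of e.ChildAttr. Keys that are not declared on the struct are ignored by encoding/json, so the other fields in the attribute do not need to be modeled.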