Scraping all possible tags and putting them into one variable using Go Colly
Question
I need to scrape different tags from a list of sites, put them into a variable, and then write them to a .csv list. For example, all lines where the author of the article is mentioned (div.author, p.author, etc.). The location of this line and its tags differ from site to site, so I need a conditional and a regular expression to filter those tags.
This is my code, where I find one possible author tag and append it to articleCollection. I tried if and for conditions, but I can't get the right variant into the author_name variable.
c.OnHTML("body", func(e *colly.HTMLElement) {
    author_name := e.DOM.Find("div.author").Text()
    if author_name == "" {
        log.Println("Author not found")
    }
    author := Authors{
        Author: author_name,
    }
    articleCollection = append(articleCollection, author)
})
I also tried to add a condition like the one below to find all <p> elements with the author class, but it didn't work: the compiler reports that author_name is declared and not used.
if author_name == "" {
    author_name := e.DOM.Find("p.author").Text()
}
Thank you.
Answer 1
Score: 0
Use
if author_name == "" {
    author_name = e.DOM.Find("p.author").Text()
}
instead of
if author_name == "" {
    author_name := e.DOM.Find("p.author").Text()
}
Using := declares a new variable. In your case that is a second author_name, which exists only inside that if block and shadows the outer one. Because nothing reads this new variable after it is declared, the compiler reports the "declared and not used" error. Plain = assigns to the existing outer variable instead.
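To extend this to several candidate selectors, one option is to loop over them and assign with = until a non-empty value turns up. The following is only a sketch under assumptions: firstNonEmpty and the lookup parameter are hypothetical names that do not appear in the question, and lookup stands in for a call such as e.DOM.Find(selector).Text() so that the example runs with the standard library alone.

```go
package main

import (
	"fmt"
	"strings"
)

// firstNonEmpty tries each selector in order and returns the first
// non-blank text it finds, or "" if none match.
// lookup is a hypothetical stand-in for e.DOM.Find(selector).Text()
// inside a Colly OnHTML callback.
func firstNonEmpty(selectors []string, lookup func(selector string) string) string {
	authorName := ""
	for _, sel := range selectors {
		// Plain "=" assigns to the outer authorName. Using ":=" here would
		// declare a new variable scoped to the loop body and leave the
		// outer one empty.
		authorName = strings.TrimSpace(lookup(sel))
		if authorName != "" {
			break
		}
	}
	return authorName
}

func main() {
	// Fake page content keyed by selector, used only for illustration.
	page := map[string]string{
		"div.author": "",
		"p.author":   "  Jane Doe \n",
	}
	lookup := func(selector string) string { return page[selector] }

	name := firstNonEmpty([]string{"div.author", "p.author", "span.byline"}, lookup)
	fmt.Printf("author: %q\n", name) // prints: author: "Jane Doe"
}
```

Inside the real callback, the lookup argument could be a small closure over e, and an empty result would be the point at which to log that no author was found.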