Changing text in HTML with Go without affecting HTML tags
Question
Here is some HTML:
<h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p>
<p><a href="/css/default.asp">CSS Tutorial</a></p>
How can I replace, change the case of, or otherwise transform the text using Go, without affecting any HTML tags? For example:
<h2>RELATIVE URLS</h2>
<p><a href="html_images.asp">HTML IMAGES</a></p>
<p><a href="/css/default.asp">CSS TUTORIAL</a></p>
Answer 1
Score: 1
You can try an XPath-based parser such as htmlquery:
// Imports: fmt, log, strings, github.com/antchfx/htmlquery, golang.org/x/net/html
s := `<html><head></head><body><h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p></body></html>`
doc, err := htmlquery.Parse(strings.NewReader(s))
if err != nil {
	log.Fatal(err)
}
fmt.Printf("Before update \n%s\n", htmlquery.OutputHTML(doc, true))
// Select every element below <body>.
nodes := htmlquery.Find(doc, "/html/body//*")
for _, node := range nodes {
	// Walk all direct children: an element without children no longer panics,
	// and only text nodes are modified, so tags and attributes stay intact.
	for c := node.FirstChild; c != nil; c = c.NextSibling {
		if c.Type == html.TextNode {
			c.Data = strings.ToUpper(c.Data)
		}
	}
}
fmt.Printf("After update \n%s\n", htmlquery.OutputHTML(doc, true))
Output
Before update
<html><head></head><body><h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p></body></html>
After update
<html><head></head><body><h2>RELATIVE URLS</h2>
<p><a href="html_images.asp">HTML IMAGES</a></p></body></html>