Changing text in HTML with Golang without affecting HTML tags

Question

Here is some HTML:

<h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p>
<p><a href="/css/default.asp">CSS Tutorial</a></p>

How can I replace text, change its case, or otherwise transform it using Golang without affecting any of the HTML tags? For example:

<h2>RELATIVE URLS</h2>
<p><a href="html_images.asp">HTML IMAGES</a></p>
<p><a href="/css/default.asp">CSS TUTORIAL</a></p>

Answer 1

Score: 1

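You can use an XPath-based parser such as htmlquery (github.com/antchfx/htmlquery), which builds on golang.org/x/net/html. Parse the document, select the elements under <body>, and change only their text nodes: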

package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"golang.org/x/net/html"
)

func main() {
	s := `<html><head></head><body><h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p></body></html>`

	doc, err := htmlquery.Parse(strings.NewReader(s))
	if err != nil {
		panic(err)
	}
	fmt.Printf("Before update \n%s\n", htmlquery.OutputHTML(doc, true))

	for _, node := range htmlquery.Find(doc, "/html/body//*") {
		// Walk all direct children; this is safe for empty elements (nil FirstChild).
		for c := node.FirstChild; c != nil; c = c.NextSibling {
			if c.Type == html.TextNode { // only text is changed, never tags or attributes
				c.Data = strings.ToUpper(c.Data)
			}
		}
	}
	fmt.Printf("After update \n%s\n", htmlquery.OutputHTML(doc, true))
}

Output

Before update 
<html><head></head><body><h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p></body></html>
After update 
<html><head></head><body><h2>RELATIVE URLS</h2>
<p><a href="html_images.asp">HTML IMAGES</a></p></body></html>

huangapple
  • Posted on 2022-10-08 15:57:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73995340.html