Any way to use html.Parse without it adding nodes to make a 'well-formed tree'?
Question
package main

import (
	"bytes"
	"fmt"
	"log"
	"strings"

	// Original import path; the package now lives at golang.org/x/net/html.
	"code.google.com/p/go.net/html"
)

func main() {
	s := "Blah. <b>Blah.</b> Blah."

	// html.Parse always builds a complete document tree around the input.
	n, err := html.Parse(strings.NewReader(s))
	if err != nil {
		log.Fatalf("Parse error: %s", err)
	}

	var buf bytes.Buffer
	if err := html.Render(&buf, n); err != nil {
		log.Fatalf("Render error: %s", err)
	}
	fmt.Println(buf.String())
}

Output:
<html><head></head><body>Blah. <b>Blah.</b> Blah.</body></html>
Is there a way to stop html.Parse from making a document out of fragments (i.e. avoid adding <html>, <body>, etc.)? I'm aware of html.ParseFragment, but it seems to exhibit the same behaviour.

You can get around it by wrapping the text to be parsed in a parent element such as <span>, then doing something like the following:

	n = n.FirstChild.LastChild.FirstChild

but that seems kludgy, to say the least.

Ideally I'd like to: accept input, manipulate or remove nodes found within it, and write the result back to a string, even if the result is an incomplete document.
Answer 1
Score: 13
You need to provide a context to ParseFragment. The following program prints out the original text:
package main

import (
	"bytes"
	"fmt"
	"log"
	"strings"

	// Original import paths; the packages now live under golang.org/x/net/html.
	"code.google.com/p/go.net/html"
	"code.google.com/p/go.net/html/atom"
)

func main() {
	s := "Blah. <b>Blah.</b> Blah."

	// Parse s as if it appeared inside a <body> element.
	n, err := html.ParseFragment(strings.NewReader(s), &html.Node{
		Type:     html.ElementNode,
		Data:     "body",
		DataAtom: atom.Body,
	})
	if err != nil {
		log.Fatalf("Parse error: %s", err)
	}

	// ParseFragment returns a slice of top-level nodes; render each in order.
	var buf bytes.Buffer
	for _, node := range n {
		if err := html.Render(&buf, node); err != nil {
			log.Fatalf("Render error: %s", err)
		}
	}
	fmt.Println(buf.String())
}
Answer 2
Score: 6
You want ParseFragment: http://godoc.org/code.google.com/p/go.net/html#ParseFragment. Pass in a fake Body element as your context, and the result will be returned as a slice containing just the nodes in your fragment.
You can see an example in the Partial* functions of go-html-transform's wrapper package for go.net/html: https://code.google.com/p/go-html-transform/source/browse/h5/h5.go#32