How to convert Markdown to HTML in Go and wrap each heading in a section tag
Question
I have the following Markdown:
## Hello
### This is a test message
Ligisnfmkdfn
I use the Go module gomarkdown to convert it to HTML, with the CommonExtensions and AutoHeadingIDs parser extensions, and I get this result:
<h2 id="hello">Hello</h2>
<h3 id="this-is-a-test-message">This is a test message</h3>
<p>Ligisnfmkdfn</p>
How can I get output like what markdown-it-header-sections produces in Node.js?
<section id="hello">
<h2>Hello</h2>
<section id="this-is-a-test-message">
<h3>This is a test message</h3>
<p>Ligisnfmkdfn</p>
</section>
</section>
Answer 1
Score: 4
Here's a moderately complete solution:
package main

import (
	"fmt"
	"io"
	"regexp"
	"strings"

	"github.com/gomarkdown/markdown"
	"github.com/gomarkdown/markdown/ast"
	"github.com/gomarkdown/markdown/html"
)

// levels tracks how deep we are in a heading "structure"
var levels []int

func hasLevels() bool {
	return len(levels) > 0
}

func lastLevel() int {
	if hasLevels() {
		return levels[len(levels)-1]
	}
	return 0
}

func popLevel() int {
	level := lastLevel()
	levels = levels[:len(levels)-1]
	return level
}

func pushLevel(x int) {
	levels = append(levels, x)
}

// reID matches runs of whitespace, which become "-" in a section id
var reID = regexp.MustCompile(`\s+`)

// renderSections catches an ast.Heading node, and wraps the node
// and its "children" nodes in <section>...</section> tags; there's no
// real hierarchy in Markdown, so we make one up by saying things like:
//   - H2 is a child of H1, and so forth from 1 → 2 → 3 ... → N
//   - an H1 is a sibling of another H1
func renderSections(w io.Writer, node ast.Node, entering bool) (ast.WalkStatus, bool) {
	openSection := func(level int, id string) {
		fmt.Fprintf(w, "<section id=\"%s\">\n", id)
		pushLevel(level)
	}
	closeSection := func() {
		io.WriteString(w, "</section>\n")
		popLevel()
	}

	if heading, ok := node.(*ast.Heading); ok && entering {
		// close heading-sections at this level or deeper: a sibling heading
		// ends the previous section, and we may have "come up" some levels
		for lastLevel() >= heading.Level {
			closeSection()
		}

		// the id is built from the heading's first child, which must be plain text
		txtNode := node.GetChildren()[0]
		if _, ok := txtNode.(*ast.Text); !ok {
			panic(fmt.Errorf("expected txtNode to be *ast.Text; got %T", txtNode))
		}
		headTxt := string(txtNode.AsLeaf().Literal)
		id := strings.ToLower(reID.ReplaceAllString(headTxt, "-"))

		openSection(heading.Level, id)
	}

	// at end of document, close whatever is still open
	if _, ok := node.(*ast.Document); ok && !entering {
		for hasLevels() {
			closeSection()
		}
	}

	// continue as normal; false means the renderer still renders the node itself
	return ast.GoToNext, false
}

func main() {
	lines := []string{
		"## Hello",
		"### This is a test message",
		"Ligisnfmkdfn",
	}
	md := strings.Join(lines, "\n")

	opts := html.RendererOptions{
		Flags:          html.CommonFlags,
		RenderNodeHook: renderSections,
	}
	renderer := html.NewRenderer(opts)

	// named out rather than html, so it doesn't shadow the html package
	out := markdown.ToHTML([]byte(md), nil, renderer)
	fmt.Println(string(out))
}
When I run that, I get:
<section id="hello">
<h2>Hello</h2>
<section id="this-is-a-test-message">
<h3>This is a test message</h3>
<p>Ligisnfmkdfn</p>
</section>
</section>
I say it's moderately complete because it's smart enough to deal with input like this:
lines := []string{
	"# H1α",
	"## H2A",
	"## H2B",
	"## H2C",
	"### H31",
	"#### H4I",
	"## H2D",
	"# H1β",
	"## H2E",
}
and it produces:
<section id="h1α">
<h1>H1α</h1>
<section id="h2a">
<h2>H2A</h2>
</section>
<section id="h2b">
<h2>H2B</h2>
</section>
<section id="h2c">
<h2>H2C</h2>
<section id="h31">
<h3>H31</h3>
<section id="h4i">
<h4>H4I</h4>
</section>
</section>
</section>
<section id="h2d">
<h2>H2D</h2>
</section>
</section>
<section id="h1β">
<h1>H1β</h1>
<section id="h2e">
<h2>H2E</h2>
</section>
</section>
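I haven't tested this rigorously, though, so I'm not sure where it falls short. Two limits are visible in the code: it panics if a heading's first child isn't a plain text node, and the package-level levels slice is shared state, so the hook shouldn't be used by concurrent renders.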
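One spot that may be worth hardening is the id itself: the hook lowercases the heading text and swaps whitespace for hyphens, but writes everything else into the attribute as-is, so a heading containing a quote or an angle bracket would produce broken HTML. A minimal sketch of a stricter variant follows. The names sectionID, reSpace and reDrop are hypothetical and not part of gomarkdown, and the result is not guaranteed to match what AutoHeadingIDs generates.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical helpers for illustration; not part of gomarkdown.
var (
	// runs of whitespace, replaced by a single hyphen
	reSpace = regexp.MustCompile(`\s+`)
	// anything that is not a letter, digit, underscore or hyphen is dropped
	reDrop = regexp.MustCompile(`[^\p{L}\p{N}_-]+`)
)

// sectionID turns heading text into a value that is safe inside id="...".
// Quotes, angle brackets and ampersands are removed by reDrop, so no
// further HTML escaping is needed.
func sectionID(text string) string {
	id := strings.ToLower(strings.TrimSpace(text))
	id = reSpace.ReplaceAllString(id, "-")
	return reDrop.ReplaceAllString(id, "")
}

func main() {
	fmt.Println(sectionID("This is a test message"))
	fmt.Println(sectionID(`  Say "Hi" <now>  `))
	fmt.Println(sectionID("H1α"))
}
```

In the program above, this would stand in for the strings.ToLower(reID.ReplaceAllString(headTxt, "-")) expression. Non-ASCII letters such as α are kept, which matches the ids shown in the second example.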