How to convert markdown to HTML in Golang with adding section tag

Question

I have the following Markdown:

    ## Hello
    ### This is a test message
    Ligisnfmkdfn

I use the Go module gomarkdown to convert it to HTML, with the parser set up with the CommonExtensions and AutoHeadingIDs extensions, and I get this result:

    <h2 id="hello">Hello</h2>
    <h3 id="this-is-a-test-message">This is a test message</h3>
    <p>Ligisnfmkdfn</p>

How can I get a result like the one markdown-it-header-sections produces in Node.js?

    <section id="hello">
    <h2>Hello</h2>
    <section id="this-is-a-test-message">
    <h3>This is a test message</h3>
    <p>Ligisnfmkdfn</p>
    </section>
    </section>

Answer 1

Score: 4

Here's a moderately complete solution:

    package main

    import (
        "fmt"
        "io"
        "regexp"
        "strings"

        "github.com/gomarkdown/markdown"
        "github.com/gomarkdown/markdown/ast"
        "github.com/gomarkdown/markdown/html"
    )

    // levels tracks how deep we are in a heading "structure"
    var levels []int

    func hasLevels() bool {
        return len(levels) > 0
    }

    // lastLevel returns the level of the innermost open section, or 0 if none is open
    func lastLevel() int {
        if hasLevels() {
            return levels[len(levels)-1]
        }
        return 0
    }

    // popLevel must only be called while at least one section is open
    func popLevel() int {
        level := lastLevel()
        levels = levels[:len(levels)-1]
        return level
    }

    func pushLevel(x int) {
        levels = append(levels, x)
    }

    // reID matches runs of whitespace, which become hyphens in section ids
    var reID = regexp.MustCompile(`\s+`)

    // renderSections catches an ast.Heading node, and wraps the node
    // and its "children" nodes in <section>...</section> tags; there's no
    // real hierarchy in Markdown, so we make one up by saying things like:
    // - H2 is a child of H1, and so forth from 1 → 2 → 3 ... → N
    // - an H1 is a sibling of another H1
    func renderSections(w io.Writer, node ast.Node, entering bool) (ast.WalkStatus, bool) {
        openSection := func(level int, id string) {
            fmt.Fprintf(w, "<section id=\"%s\">\n", id)
            pushLevel(level)
        }
        closeSection := func() {
            io.WriteString(w, "</section>\n")
            popLevel()
        }

        if heading, ok := node.(*ast.Heading); ok && entering {
            // close heading-sections at this level or deeper; we've reached
            // a sibling heading or "come up" some number of levels
            for lastLevel() >= heading.Level {
                closeSection()
            }

            // assumes the heading's first child is plain text
            txtNode := heading.GetChildren()[0]
            if _, ok := txtNode.(*ast.Text); !ok {
                panic(fmt.Errorf("expected txtNode to be *ast.Text; got %T", txtNode))
            }
            headTxt := string(txtNode.AsLeaf().Literal)
            id := strings.ToLower(reID.ReplaceAllString(headTxt, "-"))
            openSection(heading.Level, id)
        }

        // at end of document, close any sections that are still open
        if _, ok := node.(*ast.Document); ok && !entering {
            for hasLevels() {
                closeSection()
            }
        }

        // continue as normal: returning false lets the default renderer handle the node
        return ast.GoToNext, false
    }

    func main() {
        lines := []string{
            "## Hello",
            "### This is a test message",
            "Ligisnfmkdfn",
        }
        md := strings.Join(lines, "\n")

        opts := html.RendererOptions{
            Flags:          html.CommonFlags,
            RenderNodeHook: renderSections,
        }
        renderer := html.NewRenderer(opts)

        // named out rather than html so the html package is not shadowed
        out := markdown.ToHTML([]byte(md), nil, renderer)
        fmt.Println(string(out))
    }

When I run that, I get:

    <section id="hello">
    <h2>Hello</h2>
    <section id="this-is-a-test-message">
    <h3>This is a test message</h3>
    <p>Ligisnfmkdfn</p>
    </section>
    </section>
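
The section ids in that output come from lowercasing the heading text and replacing each run of whitespace with a hyphen. Here is a minimal, standard-library-only sketch of that step on its own. The helper name `slugify` is hypothetical, and the call to `html.EscapeString` is an addition that is not part of the solution above; it is there on the assumption that heading text could contain characters such as `"` or `<` that should not land unescaped in an attribute.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// whitespace matches one or more consecutive whitespace characters.
var whitespace = regexp.MustCompile(`\s+`)

// slugify is a hypothetical helper: it replaces each run of whitespace
// with a hyphen, lowercases the result, and then escapes it so it can be
// placed inside a double-quoted HTML attribute.
func slugify(heading string) string {
	id := strings.ToLower(whitespace.ReplaceAllString(heading, "-"))
	return html.EscapeString(id) // assumption: escaping is wanted; the answer's code skips it
}

func main() {
	for _, h := range []string{"Hello", "This is a test message", "H1α"} {
		fmt.Printf("<section id=\"%s\">\n", slugify(h))
	}
}
```

This does not deduplicate ids, so two headings with the same text would still get the same id.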

I say it's moderately complete because it's smart enough to deal with input like this:

    lines := []string{
        "# H1α",
        "## H2A",
        "## H2B",
        "## H2C",
        "### H31",
        "#### H4I",
        "## H2D",
        "# H1β",
        "## H2E",
    }

and it produces:

    <section id="h1α">
    <h1>H1α</h1>
    <section id="h2a">
    <h2>H2A</h2>
    </section>
    <section id="h2b">
    <h2>H2B</h2>
    </section>
    <section id="h2c">
    <h2>H2C</h2>
    <section id="h31">
    <h3>H31</h3>
    <section id="h4i">
    <h4>H4I</h4>
    </section>
    </section>
    </section>
    <section id="h2d">
    <h2>H2D</h2>
    </section>
    </section>
    <section id="h1β">
    <h1>H1β</h1>
    <section id="h2e">
    <h2>H2E</h2>
    </section>
    </section>

But I haven't rigorously tested this, so I'm not sure where it might not meet expectations.
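
One way to probe the nesting behavior without pulling in a Markdown parser is to run the same kind of stack logic over bare heading levels. The sketch below is an illustration, not the renderer hook itself: `nestSections` is a hypothetical name, and it emits plain "open N" / "close" markers instead of HTML. It assumes the rule needed to produce the output shown earlier, namely that every open section at the same level or deeper is closed before a new one opens.

```go
package main

import (
	"fmt"
	"strings"
)

// nestSections takes heading levels in document order and returns the
// open/close events a stack-based walk would emit for them.
func nestSections(headings []int) []string {
	var stack []int // levels of the sections that are currently open
	var events []string
	for _, level := range headings {
		// Close every open section at this level or deeper, so that
		// siblings do not end up nested inside each other.
		for len(stack) > 0 && stack[len(stack)-1] >= level {
			stack = stack[:len(stack)-1]
			events = append(events, "close")
		}
		stack = append(stack, level)
		events = append(events, fmt.Sprintf("open %d", level))
	}
	// At the end of the document, close whatever is still open.
	for range stack {
		events = append(events, "close")
	}
	return events
}

func main() {
	// Levels of the headings H1α, H2A, H2B, H2C, H31, H4I, H2D, H1β, H2E.
	levels := []int{1, 2, 2, 2, 3, 4, 2, 1, 2}
	fmt.Println(strings.Join(nestSections(levels), "\n"))
}
```

A skipped level, such as an H3 directly under an H1, simply nests one section inside the other; whether that is the desired behavior is a design choice this sketch does not settle.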

huangapple
  • Posted on September 16, 2022 at 18:10:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73743233.html