How to debug boost::spirit?

Question

I've read the documentation.
I also ran the calc_debug.cpp example, and it produces much more readable output than my code does.
For every grammar I use:

m_sGrammar.name("property");
BOOST_SPIRIT_DEBUG_NODE(m_sGrammar);

What I get is some kind of XML output, lots of it, and I have no clue what to look for.

E.g. the keyword fail is already printed for what are comments at the beginning of the file.

E.g. the keyword success is printed and it shows the matching text (which covers multiple separate rules), but it does not specify which rule was matched.
The same text is mentioned again later with the keyword try.

Answer 1

Score: 1

Reading the debug output takes patience.

As always. Manually stepping through code and tracing registers and variables takes patience, diagramming your data and tracking the flow takes patience, and reading assembly to see where time is being spent takes patience. Programming is a frustrating task that takes a lot of patience.

The debug output is also just a tool in that process.

You apply the tool so that you can have a kind of "state flow diagram" instead of painstakingly stepping through the arcane template instances involved. Of course, it means that you have a lot of data, but you can definitely learn to handle that a lot better than volatile state debugging (where you can rarely go back and check the state at an earlier point).

> In other words, the debug information lets you debug the parser in the same way you implement it: declaratively, instead of imperatively.

Here are tricks that I use:

  1. Minimize your input. Reduce it until only the thing that you want to understand happens. This could be why something fails to parse, or why it is being accepted (by an unexpected rule).

  2. Minimize the output!

    • Don't enable debug for your skipper if you don't need to see it. In my experience the "source lookahead" already implies what has been skipped. Only if that surprises me might I debug the skipper too.

    • Use an XML-aware editor that makes it easy to "fold" subtrees that you have already "seen" and perhaps decided are "as expected".

      E.g. in vim I often start out by folding the uninteresting branches,
      and once I have found the place I am looking for I'll actually delete
      those branches (`dat`, `zfat`, `vatat`). When I need to repeat this,
      I use the information to minimize the input further.

  3. Realize that failure is normal. In the rule R = (a|b|c) you should expect to see a and b fail if the input only matches c. Of course, the point is that R itself succeeds. In fact, if b succeeds, c will not even appear in the debug output (as it doesn't need to be tried).

  4. Start from the back, finding the last lexeme/member of a non-backtracking rule that fails. In repeating constructions (*a, +a, repeat(n, m)[a] and a % b) the last member will always fail (all their branches) before it knows to end.

    An example from my answer yesterday, the end of a successful *task_item parse:

        ...
        <attributes>[[[[[[V, a, r, 1], [=, =], [T, e, s, t]], [&, &], [[[V, a, r, 2], [<, =], 10], [&, &], [[V, a, r, 3]
    , [=, =], [D, o, n, e]]]], [[[W, o, r, d], 32, [O, b, j, e, c, t, i, v, e]], [[[[V, a, r, 3], [=, =], [A]], [|, |], 
    [[[V, a, r, 4], [=, =], [B]], [&, &], [[V, a, r, 5], [>], 0]]], [[[V, a, r, N, a, m, e], [V, a, l, u, e, 1]], [[V, a
    , r, 2], 10]], [[[[V, a, r, 3], [=, =], [C]], [[[V, a, r, N, a, m, e], [S, o, m, e, V, a, l, u, e]]], [empty]]]]], [
    [[[V, a, r, N, a, m, e], [V, a, l, u, e, 2]]]]]]]</attributes>
      </task_item>
      <task_item>
        <try></try>
        <classdef_>
          <try></try>
          <fail/>
        </classdef_>
        <statement_>
          <try></try>
          <assign_>
            <try></try>
            <assign_>
              <try></try>
              <fail/>
            </assign_>
            <fail/>
          </assign_>
          <verify_>
            <try></try>
            <fail/>
          </verify_>
          <conditional_>
            <try></try>
            <fail/>
          </conditional_>
          <fail/>
        </statement_>
        <fail/>
      </task_item>
      <success></success>
      <attributes>[[[[[[V, a, r, 1], [=, =], [T, e, s, t]], [&, &], [[[V, a, r, 2], [<, =], 10], [&, &], [[V, a, r, 3], 
    [=, =], [D, o, n, e]]]], [[[W, o, r, d], 32, [O, b, j, e, c, t, i, v, e]], [[[[V, a, r, 3], [=, =], [A]], [|, |], [[
    [V, a, r, 4], [=, =], [B]], [&, &], [[V, a, r, 5], [>], 0]]], [[[V, a, r, N, a, m, e], [V, a, l, u, e, 1]], [[V, a, 
    r, 2], 10]], [[[[V, a, r, 3], [=, =], [C]], [[[V, a, r, N, a, m, e], [S, o, m, e, V, a, l, u, e]]], [empty]]]]], [[[
    [V, a, r, N, a, m, e], [V, a, l, u, e, 2]]]]]]]</attributes>
    </task_>
    

    In an unsuccessful parse, you will usually have to look for the last specific rule that fails where you expected it to pass.

  5. Keep your rules simple. I've seen your skipper once and realize that its complexity might be why you are debugging it.

  6. DON'T debug! Instead, have good error reporting for expected errors. Instead of giving examples, I'll just refer to this answer that already does: https://stackoverflow.com/questions/76465101/boost-spirit-assert-exception-can-this-be-used-for-reporting-a-parser-error/76466014#76466014

E.g. in that same grammar I worked on yesterday, if you accidentally put Else if instead of the expected Elseif in the input, you will get this error message:

     -> EXPECTED <eoi> in line:2
            If (Var1 == "Test") && (Var2 <= 10) && (Var3 == "Done")
            ^--- here

This could be enough on its own, or help a lot when navigating the low-level debug output.
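
Trick 4 (reading the trace from the back) can be partly mechanized once the trace has been saved to a file. The following is only a sketch, not part of Boost.Spirit: `collect_failures` and `Failure` are hypothetical names, and it assumes each tag sits on its own line as in the excerpt above and that rule names contain only letters, digits and underscores. It records every `<fail/>` together with the chain of rules that were open at that point.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One <fail/> marker together with the rules that were open when it appeared.
struct Failure {
    std::size_t line;  // 1-based line number in the trace
    std::string path;  // e.g. "task_item/statement_/assign_"
};

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const auto b = s.find_first_not_of(" \t\r");
    if (b == std::string::npos) return "";
    const auto e = s.find_last_not_of(" \t\r");
    return s.substr(b, e - b + 1);
}

// Assumption: rule names consist of letters, digits and underscores only.
// This also filters out <try>...</try>, <success>... and <attributes>... lines.
static bool is_rule_name(const std::string& s) {
    if (s.empty()) return false;
    for (unsigned char c : s)
        if (!std::isalnum(c) && c != '_') return false;
    return true;
}

// Walk the trace line by line, tracking the stack of open rule elements.
std::vector<Failure> collect_failures(const std::string& trace) {
    std::vector<std::string> open;
    std::vector<Failure> failures;
    std::istringstream in(trace);
    std::string raw;
    std::size_t line = 0;
    while (std::getline(in, raw)) {
        ++line;
        const std::string t = trim(raw);
        if (t == "<fail/>") {
            std::string path;
            for (const auto& name : open) {
                if (!path.empty()) path += '/';
                path += name;
            }
            failures.push_back({line, path});
        } else if (t.size() > 3 && t.compare(0, 2, "</") == 0 && t.back() == '>') {
            // Closing tag: pop only if it matches the innermost open rule.
            const std::string name = t.substr(2, t.size() - 3);
            if (!open.empty() && open.back() == name) open.pop_back();
        } else if (t.size() > 2 && t.front() == '<' && t.back() == '>') {
            // Opening tag of a rule element.
            const std::string name = t.substr(1, t.size() - 2);
            if (is_rule_name(name)) open.push_back(name);
        }
    }
    return failures;
}
```

Iterating the returned vector in reverse gives the failures from the back of the trace. Which of them is the interesting one is still a judgment call, because the failures that merely end a repetition are expected.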


The only tangible complaint I see from your question:
> E.g. the keyword fail is already printed for what are comments at the beginning of the file.

probably means you can remove the skipper from the debugged set.
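
If a trace has already been captured and recompiling without the skipper's debug node is inconvenient, the same effect can be approximated after the fact. This is a minimal sketch under assumptions: `drop_rule` is a hypothetical helper, not a Boost.Spirit facility, and it relies on each opening and closing tag being on its own line. It removes every subtree that belongs to one named rule, which is roughly what folding that branch in an editor does.

```cpp
#include <sstream>
#include <string>

// Return a copy of `trace` without the elements of rule `rule`,
// including everything nested inside them.
std::string drop_rule(const std::string& trace, const std::string& rule) {
    const std::string open = "<" + rule + ">";
    const std::string close = "</" + rule + ">";
    std::istringstream in(trace);
    std::ostringstream out;
    std::string raw;
    int depth = 0;  // how many dropped elements we are currently inside
    while (std::getline(in, raw)) {
        // Compare against the line with surrounding whitespace removed.
        const auto b = raw.find_first_not_of(" \t\r");
        const auto e = raw.find_last_not_of(" \t\r");
        const std::string t =
            (b == std::string::npos) ? "" : raw.substr(b, e - b + 1);
        if (t == open) { ++depth; continue; }
        if (depth > 0) {
            if (t == close) --depth;
            continue;  // skip everything inside the dropped element
        }
        out << raw << '\n';
    }
    return out.str();
}
```

For example, `drop_rule(trace, "skipper")` would strip the output of a rule registered under the (assumed) name `skipper` and leave the rest of the trace untouched.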

UPDATE to the edit:

> e.g. There is a keyword success being printed and it shows the text
> matching (which covers multiple rules) but it does not specify, which
> rule was matched.

That's inaccurate. Success shows the input location reached after the match. It does NOT show the text that matched; it shows the text remaining. The "matched" text is no longer of interest. Instead, the result of the match is shown under attributes (again, see the example above).

huangapple
  • Posted on June 22, 2023, 20:17:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76531813.html