英文:
Succinct grammar to parse strings of alternating characters like "ababa" or "baba"
问题
我正在使用golang编写一个玩具解析器,只是为了学习这门语言。我添加了一个包含以下情况的语法测试用例:
有效:
a, ab, aba, ababababababa, ababababab
b, ba, bab, babababababab, bababababa
无效:
abb, baa
a
总是跟随着b
,反之亦然。
现在我的解析器中的语法如下(为了简洁起见,我省略了周围的代码):
"expr": Or(Ref("A"), Ref("B")),
"A": And(
a,
Optional(
And(
b,
Optional(Ref("A"))))),
"B": And(
b,
Optional(Ref("A")))
其中:
a - 精确匹配字符"a"
b - 精确匹配字符"b"
"A"、"B"、"expr" - 可以在后面使用Ref("A")引用的语法部分的名称
And - 消耗表达式序列
Or - 消耗任意一个表达式
Ref - 引用其他表达式(允许递归)
Optional - 使表达式变为非必需的
我猜这不是描述这个语法最简洁的方式。如何使它更紧凑?
相关链接:
- https://stackoverflow.com/questions/847439/why-cant-a-recursive-descent-parser-handle-left-recursion
编辑:
Filip的BNF 答案 可以用我的语法写成:
"expr": Or(Ref("A"), Ref("B")),
"A": Or(And(a, Ref("B")), a),
"B": Or(And(b, Ref("A")), b)
英文:
I am working on a toy parser in golang just to learn the language. I added a test case with grammar covering the following cases:
Valid:
a, ab, aba, ababababababa, ababababab
b, ba, bab, babababababab, bababababa
Invalid:
abb, baa
a
is always followed by b
and vice versa.
Now the grammar in my parser looks like that (I omit the surrounding code for sake of brevity):
"expr": Or(Ref("A"), Ref("B")),
"A": And(
a,
Optional(
And(
b,
Optional(Ref("A"))))),
"B": And(
b,
Optional(Ref("A")))
Where
a - exact match for "a" character
b - exact match for "b" character
"A", "B", "expr" - names of the parts of the grammar that can be referred
later with Ref("A")
And - consume sequence of expressions
Or - consume any of the expressions
Ref - refer to other expression (allows recursion)
Optional - make the expression non-obligatory
I guess it is not the most succinct way to describe this grammar. How to make it more compact?
Related:
- https://stackoverflow.com/questions/847439/why-cant-a-recursive-descent-parser-handle-left-recursion
EDIT:
The BNF answer from Filip can be written in my syntax as:
"expr": Or(Ref("A"), Ref("B")),
"A": Or(And(a, Ref("B")), a),
"B": Or(And(b, Ref("A")), b)
答案1
得分: 3
你提供的BNF语法如下:
expr ::= A | B
A ::= "a" B | "a"
B ::= "b" A | "b"
我认为使用你的语法可以将其翻译为:
"expr": Or(Ref("A"), Ref("B")),
"A": And(
a,
Optional(Ref("B"))),
"B": And(
b,
Optional(Ref("A")))
请注意,在检查非终结符(Ref(x)
)之前,检查终结符("a"
,"b"
)是很重要的,否则会导致无限循环。它会一直尝试查看是否可以将另一个A
或B
匹配到字符串的末尾,然后再匹配另一个,再匹配另一个,导致无休止的递归。
英文:
The BNF grammar you have is this:
expr ::= A | B
A ::= "a" B | "a"
B ::= "b" A | "b"
which I think translates to this using your syntax:
"expr": Or(Ref("A"), Ref("B")),
"A": And(
a,
Optional(Ref("B"))),
"B": And(
b,
Optional(Ref("A")))
Note that it is important to check terminals ("a"
, "b"
) before non-terminals (Ref(x)
), or you'll get an infinite loop. It would always try to see if it could match another A
or B
to the end of the string, and then another, and another, causing a never ending recursion.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论