简洁的语法用于解析交替字符的字符串,例如”ababa”或”baba”。

huangapple go评论81阅读模式
英文:

Succinct grammar to parse strings of alternating characters like "ababa" or "baba"

问题

我正在使用golang编写一个玩具解析器,只是为了学习这门语言。我添加了一个包含以下情况的语法测试用例:

有效:
a, ab, aba, ababababababa, ababababab
b, ba, bab, babababababab, bababababa

无效:
abb, baa

a总是跟随着b,反之亦然。

现在我的解析器中的语法如下(为了简洁起见,我省略了周围的代码):

	"expr": Or(Ref("A"), Ref("B")),
	"A": And(
		a,
		Optional(
			And(
				b,
				Optional(Ref("A"))))),
	"B": And(
		b,
		Optional(Ref("A")))

其中:

a - 精确匹配字符"a"
b - 精确匹配字符"b"
"A"、"B"、"expr" - 可以在后面使用Ref("A")引用的语法部分的名称
And - 消耗表达式序列
Or - 消耗任意一个表达式
Ref - 引用其他表达式(允许递归)
Optional - 使表达式变为非必需的

我猜这不是描述这个语法最简洁的方式。如何使它更紧凑?

相关链接:


编辑:

Filip的BNF 答案 可以用我的语法写成:

	"expr": Or(Ref("A"), Ref("B")),
	"A":    Or(And(a, Ref("B")), a),
	"B":    Or(And(b, Ref("A")), b)
英文:

I am working on a toy parser in golang just to learn the language. I added a test case with grammar covering the following cases:

Valid:
a, ab, aba, ababababababa, ababababab
b, ba, bab, babababababab, bababababa

Invalid:
abb, baa

a is always followed by b and vice versa.

Now the grammar in my parser looks like that (I omit the surrounding code for sake of brevity):

	"expr": Or(Ref("A"), Ref("B")),
	"A": And(
		a,
		Optional(
			And(
				b,
				Optional(Ref("A"))))),
	"B": And(
		b,
		Optional(Ref("A")))

Where

a - exact match for "a" character
b - exact match for "b" character
"A", "B", "expr" - names of the parts of the grammar that can be referred
                   later with Ref("A")
And      - consume sequence of expressions
Or       - consume any of the expressions
Ref      - refer to other expression (allows recursion)
Optional - make the expression non-obligatory

I guess it is not the most succinct way to describe this grammar. How to make it more compact?

Related:


EDIT:

The BNF answer from Filip can be written in my syntax as:

	"expr": Or(Ref("A"), Ref("B")),
	"A":    Or(And(a, Ref("B")), a),
	"B":    Or(And(b, Ref("A")), b)

答案1

得分: 3

你提供的BNF语法如下:

expr ::= A | B
A ::= "a" B | "a"
B ::= "b" A | "b"

我认为使用你的语法可以将其翻译为:

"expr": Or(Ref("A"), Ref("B")),
"A": And(
    a,
    Optional(Ref("B"))),
"B": And(
    b,
    Optional(Ref("A")))

请注意,在检查非终结符(Ref(x))之前,检查终结符("a""b")是很重要的,否则会导致无限循环。它会一直尝试查看是否可以将另一个AB匹配到字符串的末尾,然后再匹配另一个,再匹配另一个,导致无休止的递归。

英文:

The BNF grammar you have is this:

expr ::= A | B
A ::= "a" B | "a"
B ::= "b" A | "b"

which I think translates to this using your syntax:

"expr": Or(Ref("A"), Ref("B")),
"A": And(
    a,
    Optional(Ref("B"))),
"B": And(
    b,
    Optional(Ref("A")))

Note that it is important to check terminals ("a", "b") before non-terminals (Ref(x)), or you'll get an infinite loop. It would always try to see if it could match another A or B to the end of the string, and then another, and another, causing a never ending recursion.

huangapple
  • 本文由 发表于 2016年1月23日 03:32:45
  • 转载请务必保留本文链接:https://go.coder-hub.com/34954520.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定