How to make an ANTLR grammar that matches strings both inside and outside a delimiter?

Question

This grammar for ANTLR4 should break a document up into two types of substring: wiki and nowiki.

grammar NoWikiText;

nowiki: '<nowiki>' ~'</nowiki>'* '</nowiki>';
wiki: ~'<nowiki>'+;
document: (wiki | nowiki)*;

Here's the input:

<nowiki>2</nowiki>4<nowiki></nowiki>

I get two matches for nowiki. But the text "4", which should match wiki, is ignored. Why?

EDIT:

This seems to work:

grammar NoWikiText;

P1: '<nowiki>';
P2: '</nowiki>';
NP: .;

nowiki: P1 NP* P2;
wiki: NP+;
document: (wiki | nowiki)*;

Answer 1

Score: 1

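In the grammar you posted, only two tokens are ever created: <nowiki> and </nowiki>. The negation operator works differently than you expect: in a parser rule, ~'</nowiki>' means "match any token other than </nowiki>" (so it would match the token <nowiki>), not "match any text other than </nowiki>". So for your input <nowiki>2</nowiki>4<nowiki></nowiki>, the 2 and the 4 are not recognized as valid tokens: no lexer rule matches them, so the lexer reports a token recognition error for each and they never reach the parser, which leaves wiki with nothing to match. Your edited grammar works because NP: . gives every remaining character a token of its own.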
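To make the tokenization step concrete, here is a rough Python emulation of what the lexer does with each version of the grammar. It is only a sketch, not ANTLR itself: the `tokenize` helper and the `ORIGINAL` and `FIXED` rule lists are hypothetical names invented for this illustration, and the emulation assumes the usual lexer behaviour (longest match wins, ties go to the rule defined first, unmatched characters are reported and skipped).

```python
import re

def tokenize(text, rules):
    """Emulate a longest-match lexer; on ties the earliest rule wins.
    Characters that no rule matches are recorded as errors and skipped."""
    tokens, errors, pos = [], [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern, re.DOTALL).match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie in length.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            errors.append(text[pos])  # comparable to a "token recognition error"
            pos += 1
        else:
            tokens.append((best_name, text[pos:pos + best_len]))
            pos += best_len
    return tokens, errors

# Original grammar: the only tokens are the two literals used in parser rules.
ORIGINAL = [("'<nowiki>'", r"<nowiki>"), ("'</nowiki>'", r"</nowiki>")]
# Edited grammar: P1, P2, plus the catch-all NP rule.
FIXED = [("P1", r"<nowiki>"), ("P2", r"</nowiki>"), ("NP", r".")]

text = "<nowiki>2</nowiki>4<nowiki></nowiki>"
orig_tokens, orig_errors = tokenize(text, ORIGINAL)
fixed_tokens, fixed_errors = tokenize(text, FIXED)

print("original tokens:", orig_tokens)
print("original errors:", orig_errors)
print("fixed tokens:   ", fixed_tokens)
print("fixed errors:   ", fixed_errors)
```

Under this emulation the first rule set yields only the four delimiter tokens and reports 2 and 4 as unmatched, which is why the parser sees two nowiki blocks and no wiki text. With the catch-all NP rule added, 2 and 4 each become an NP token, so wiki can match the 4.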

huangapple
  • Posted on June 5, 2023
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76403233.html