May 29, 2020, 15:58:48

Antlr grammar confusion: Not Reporting Errors when clear error is given

Question

I'm trying to design a simple query language as follows:

grammar FilterExpression;

// Lexer rules

AND : 'AND' ;
OR  : 'OR' ;
NOT : 'NOT';

GT : '>' ;
GE : '>=' ;
LT : '<' ;
LE : '<=' ;
EQ : '=' ;

DECIMAL    : '-'?[0-9]+('.'[0-9]+)? ;
KEY       : ~[ \t\r\n\\"~=<>:(),]+ ;
QUOTED_WORD: ["] ('\\"' | ~["])* ["] ;
NEWLINE    : '\r'? '\n';
WS         : [ \t\r\n]+ -> skip ;

StringFilter   : KEY ':' QUOTED_WORD;
NumericalFilter  : KEY (GT | GE | LT | LE | EQ) DECIMAL;

condition      : StringFilter                                # stringCondition
               | NumericalFilter                             # numericalCondition
               | StringFilter op=(AND|OR) StringFilter       # combinedStringCondition
               | NumericalFilter op=(AND|OR) NumericalFilter # combinedNumericalCondition
               | condition AND condition                     # combinedCondition
               | '(' condition ')'                           # parens
               ;

I added a few tests to verify that the grammar works as expected. To my surprise, some inputs that are clearly wrong passed.

For instance, when I type

(brand:"apple" AND t>3) 1>3

the 1>3 is deliberately placed there as an error. However, ANTLR still happily generates a parse tree for this input.

Is it because my grammar has problems I haven't noticed?

I also tried the IntelliJ plugin (because I thought grun might not be behaving as expected), but it did not help.

Here is the test code I'm using. Note that I also tried BailErrorStrategy, but it doesn't seem to help.

import java.util.BitSet;

import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.dfa.DFA;
import org.junit.Test; // assuming JUnit 4; with JUnit 5 use org.junit.jupiter.api.Test

public class ParserTest {
  // Lexer that throws on the first unrecognized character instead of recovering.
  private class BailLexer extends FilterExpressionLexer {
    public BailLexer(CharStream input) {
      super(input);
    }

    @Override
    public void recover(LexerNoViableAltException e) {
      throw new RuntimeException(e);
    }
  }

  private FilterExpressionParser createParser(String filterString) {
    //FilterExpressionLexer lexer = new FilterExpressionLexer(CharStreams.fromString(filterString));
    FilterExpressionLexer lexer = new BailLexer(CharStreams.fromString(filterString));
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    FilterExpressionParser parser = new FilterExpressionParser(tokens);

    // Abort on the first syntax error and print a marker for every listener callback.
    parser.setErrorHandler(new BailErrorStrategy());
    parser.addErrorListener(new ANTLRErrorListener() {
      @Override
      public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
        System.out.print("here1");
      }

      @Override
      public void reportAmbiguity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, boolean exact, BitSet ambigAlts, ATNConfigSet configs) {
        System.out.print("here2");
      }

      @Override
      public void reportAttemptingFullContext(Parser recognizer, DFA dfa, int startIndex, int stopIndex, BitSet conflictingAlts, ATNConfigSet configs) {
        System.out.print("here3");
      }

      @Override
      public void reportContextSensitivity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, int prediction, ATNConfigSet configs) {
        System.out.print("here4");
      }
    });

    return parser;
  }

  @Test
  public void test() {
    // The trailing "1>3" is the deliberate error that goes unreported.
    FilterExpressionParser parser = createParser("(brand:\"apple\" AND t>3) 1>3");
    parser.condition();
  }
}

Answer 1

Score: 1

Looks like I finally found the answer.

The reason is that my grammar didn't include an EOF. Apparently, in ANTLR it's perfectly fine for a rule to match only a prefix of the input. That's why the rest of the test string

(brand:"apple" AND t>3) 1>3, i.e. the trailing 1>3, is allowed.

See discussion here: https://github.com/antlr/antlr4/issues/351

Then I changed the grammar slightly to add an EOF at the end of the rule being parsed, i.e. condition EOF, and now everything works.
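To make the prefix behaviour concrete, here is a minimal, self-contained sketch in plain Java. It does not use ANTLR at all: it is a hypothetical hand-written recursive-descent parser for a simplified version of the grammar, where every token other than `(`, `)` and `AND` counts as a filter. The class name `PrefixParseDemo` and both method names are invented for this illustration. It only shows the general idea that a rule which stops once it is satisfied will ignore trailing input unless something explicitly checks for end of input.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a toy parser, not ANTLR and not the grammar from the question.
final class PrefixParseDemo {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private PrefixParseDemo(String input) {
        // Naive tokenizer: isolate parentheses, then split on whitespace.
        // Assumption: quoted strings contain no spaces or parentheses.
        String spaced = input.replace("(", " ( ").replace(")", " ) ");
        for (String t : spaced.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
    }

    // condition : primary ('AND' primary)* ;
    private void condition() {
        primary();
        while (pos < tokens.size() && tokens.get(pos).equals("AND")) {
            pos++;
            primary();
        }
    }

    // primary : '(' condition ')' | FILTER ;
    private void primary() {
        String t = next();
        if (t.equals("(")) {
            condition();
            if (!next().equals(")")) {
                throw new IllegalArgumentException("expected ')'");
            }
        } else if (t.equals(")") || t.equals("AND")) {
            throw new IllegalArgumentException("unexpected token: " + t);
        }
        // Any other token is accepted as a filter.
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        return tokens.get(pos++);
    }

    // Like calling a rule that does not end in EOF: leftover tokens are ignored.
    static boolean acceptsPrefix(String input) {
        try {
            new PrefixParseDemo(input).condition();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Like calling a rule of the form "condition EOF": all tokens must be consumed.
    static boolean acceptsWhole(String input) {
        try {
            PrefixParseDemo p = new PrefixParseDemo(input);
            p.condition();
            return p.pos == p.tokens.size();
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String bad = "(brand:\"apple\" AND t>3) 1>3";
        System.out.println(acceptsPrefix(bad)); // true: the trailing 1>3 is never looked at
        System.out.println(acceptsWhole(bad));  // false: 1>3 is left over
        System.out.println(acceptsWhole("(brand:\"apple\" AND t>3)")); // true
    }
}
```

In the ANTLR grammar, the end-of-input check in `acceptsWhole` corresponds to ending the rule being parsed with EOF, as described above.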
