Antlr grammar confusion: Not Reporting Errors when clear error is given
Question
I'm trying to design a simple query language, as follows:
grammar FilterExpression;

// Lexer rules
AND         : 'AND' ;
OR          : 'OR' ;
NOT         : 'NOT' ;
GT          : '>' ;
GE          : '>=' ;
LT          : '<' ;
LE          : '<=' ;
EQ          : '=' ;
DECIMAL     : '-'?[0-9]+('.'[0-9]+)? ;
KEY         : ~[ \t\r\n\\"~=<>:(),]+ ;
QUOTED_WORD : ["] ('\\"' | ~["])* ["] ;
NEWLINE     : '\r'? '\n' ;
WS          : [ \t\r\n]+ -> skip ;
StringFilter    : KEY ':' QUOTED_WORD ;
NumericalFilter : KEY (GT | GE | LT | LE | EQ) DECIMAL ;

condition : StringFilter                                # stringCondition
          | NumericalFilter                             # numericalCondition
          | StringFilter op=(AND|OR) StringFilter       # combinedStringCondition
          | NumericalFilter op=(AND|OR) NumericalFilter # combinedNumericalCondition
          | condition AND condition                     # combinedCondition
          | '(' condition ')'                           # parens
          ;
I added a few tests to verify that the grammar works as expected. To my surprise, some inputs that are clearly wrong were accepted.
For instance, when I type
(brand:"apple" AND t>3) 1>3
where the trailing 1>3
is deliberately added as an error, ANTLR still happily generates a tree, which looks like:
Is it because my grammar has a problem I haven't noticed?
I also tried the IntelliJ plugin (because I thought grun might not be behaving as expected), but it gives
Here is the test code I'm using. Note that I also tried BailErrorStrategy, but it doesn't seem to help.
public class ParserTest {

    private class BailLexer extends FilterExpressionLexer {
        public BailLexer(CharStream input) {
            super(input);
        }

        public void recover(LexerNoViableAltException e) {
            throw new RuntimeException(e);
        }
    }

    private FilterExpressionParser createParser(String filterString) {
        //FilterExpressionLexer lexer = new FilterExpressionLexer(CharStreams.fromString(filterString));
        FilterExpressionLexer lexer = new BailLexer(CharStreams.fromString(filterString));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        FilterExpressionParser parser = new FilterExpressionParser(tokens);
        parser.setErrorHandler(new BailErrorStrategy());
        parser.addErrorListener(new ANTLRErrorListener() {
            @Override
            public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
                System.out.print("here1");
            }

            @Override
            public void reportAmbiguity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, boolean exact, BitSet ambigAlts, ATNConfigSet configs) {
                System.out.print("here2");
            }

            @Override
            public void reportAttemptingFullContext(Parser recognizer, DFA dfa, int startIndex, int stopIndex, BitSet conflictingAlts, ATNConfigSet configs) {
                System.out.print("here3");
            }

            @Override
            public void reportContextSensitivity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, int prediction, ATNConfigSet configs) {
                System.out.print("here4");
            }
        });
        return parser;
    }

    @Test
    public void test() {
        FilterExpressionParser parser = createParser("(brand:\"apple\" AND t>3) 1>3");
        parser.condition();
    }
}
Answer 1
Score: 1
It looks like I finally found the answer.
The reason is that the grammar never requires an EOF. Without it, ANTLR is perfectly happy to parse just a prefix of the input and stop there. That is why the rest of the test string
(brand:"apple" AND t>3) 1>3
i.e. 1>3
is allowed: the rule matches the part before it and simply stops.
See the discussion here: https://github.com/antlr/antlr4/issues/351
I then changed the grammar slightly to require an EOF at the end of the rule, i.e. condition EOF
and now everything works as expected.
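To illustrate why the missing EOF matters, here is a minimal sketch in plain Java. It is not ANTLR-generated code: it is a hand-written toy recognizer for a simplified version of the grammar (filters, AND, and parentheses only), and the names PrefixParseDemo, acceptsPrefix and acceptsWholeInput are made up for this example. The first method behaves like calling the condition rule on its own, the second like a rule that ends in EOF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy recognizer (hypothetical, not generated by ANTLR) for:
//   condition : primary (AND primary)* ;
//   primary   : '(' condition ')' | FILTER ;
final class PrefixParseDemo {
    // A numerical filter (t>3), a string filter (brand:"apple"), AND, or a parenthesis.
    private static final Pattern TOKEN = Pattern.compile(
        "[^\\s\"~=<>:(),]+(?:>=|<=|[<>=])-?[0-9]+(?:\\.[0-9]+)?"
        + "|[^\\s\"~=<>:(),]+:\"[^\"]*\""
        + "|AND"
        + "|[()]");

    private final List<String> tokens;
    private int pos = 0;

    private PrefixParseDemo(List<String> tokens) {
        this.tokens = tokens;
    }

    static List<String> tokenize(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int i = 0;
        while (true) {
            // Skip whitespace between tokens.
            while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i == input.length()) {
                return result;
            }
            m.region(i, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("cannot tokenize at index " + i);
            }
            result.add(m.group());
            i = m.end();
        }
    }

    private void condition() {
        primary();
        while (pos < tokens.size() && tokens.get(pos).equals("AND")) {
            pos++;
            primary();
        }
    }

    private void primary() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("unexpected end of input");
        }
        String t = tokens.get(pos);
        if (t.equals("(")) {
            pos++;
            condition();
            if (pos >= tokens.size() || !tokens.get(pos).equals(")")) {
                throw new IllegalStateException("expected )");
            }
            pos++;
        } else if (t.equals(")") || t.equals("AND")) {
            throw new IllegalStateException("unexpected token " + t);
        } else {
            pos++; // any remaining token is a filter
        }
    }

    // Like invoking `condition` alone: succeeds if some prefix of the input matches.
    public static boolean acceptsPrefix(String input) {
        try {
            new PrefixParseDemo(tokenize(input)).condition();
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    // Like `condition EOF`: additionally requires that no tokens are left over.
    public static boolean acceptsWholeInput(String input) {
        try {
            PrefixParseDemo p = new PrefixParseDemo(tokenize(input));
            p.condition();
            return p.pos == p.tokens.size();
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String input = "(brand:\"apple\" AND t>3) 1>3";
        System.out.println(acceptsPrefix(input));     // prints true
        System.out.println(acceptsWholeInput(input)); // prints false
    }
}
```

With the test string from the question, the prefix check succeeds because the parenthesized part is a valid condition, while the whole-input check fails because the trailing 1>3 is left unconsumed. That is the same difference that adding EOF to the rule makes in the ANTLR grammar.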