ANTLR4/Java: how to make a semantic predicate that skips a lexer token depending on the parser rule that calls it
Question
I would like to use my lexer rule

    NEW_LINE : '\n' -> skip;

like a normal rule. By this I mean: I want to ignore newlines except where they are mandatory, to create a Python-like syntax. For example, here the newline is ignored:

    cook("banana",
    "potatoe")

But a newline cannot be skipped between two statements, so this is invalid:

    cook("banana", "potatoe") varA = 12.4

There must be a newline between `cook()` and the assignment. This is why I sometimes have to skip newlines but still enforce them elsewhere.
This is what I came up with:

    start
        : line*
        ;

    line
        : line_expression (NEW_LINE | EOF)
        ;

    line_expression
        : expression
        | assignment
        ;

    expression
        : Decimal
        | Integer
        | Text
        | Boolean
        ;
and then add a semantic predicate along the lines of "if the calling parser rule is not `line`, `skip()` the token."

Now I just need help implementing that. I hope I was clear!

PS: I'm using Java as the target language, in case that wasn't clear.
Answer 1
Score: 1
You could keep track of the number of `(` you encounter (and decrease this number whenever you encounter a `)`). Then you only create `NL` tokens when this number is equal to zero.
Here's a quick demo:
    grammar T;

    @lexer::members {
      // Current nesting depth of parentheses.
      int parensLevel = 0;
    }

    parse
     : .*? EOF
     ;

    OPAR    : '(' {parensLevel++;};
    CPAR    : ')' {parensLevel--;};
    NUMBER  : [0-9]+ ('.' [0-9]+)?;
    STRING  : '"' ~'"'* '"';
    ASSIGN  : '=';
    COMMA   : ',';
    ID      : [a-zA-Z]+;
    SPACES  : [ \t]+ -> skip;

    // Emit NL only at depth zero; otherwise the predicate fails and NL_SKIP matches instead.
    NL      : {parensLevel == 0}? [\r\n]+;
    NL_SKIP : [\r\n]+ -> skip;
If you feed the lexer the following input:

    cook("banana",
    "potatoe")
    varA = 12.4

the following tokens will be created:

    ID        `cook`
    '('       `(`
    STRING    `"banana"`
    ','       `,`
    STRING    `"potatoe"`
    ')'       `)`
    NL        `\n`
    ID        `varA`
    '='       `=`
    NUMBER    `12.4`
As you can see, the `NL` inside the parens is skipped, while the one after the `)` is not.
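The counting idea can also be tried outside ANTLR. What follows is a minimal plain-Java sketch, not the lexer that ANTLR generates: the class name `ParenAwareNewlines` and the method `tokenize` are made up for illustration, and the method only returns token type names that mirror the rules of the demo grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ANTLR: a hand-written scanner that
// imitates the demo grammar's paren counting.
final class ParenAwareNewlines {

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Returns the token type names for the input; line breaks inside parens yield no token. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int parensLevel = 0; // same role as the lexer member in the grammar
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '(') {
                parensLevel++;
                tokens.add("OPAR");
                i++;
            } else if (c == ')') {
                parensLevel--;
                tokens.add("CPAR");
                i++;
            } else if (c == '\r' || c == '\n') {
                // Consume the whole run of line breaks, like [\r\n]+.
                while (i < input.length()
                        && (input.charAt(i) == '\r' || input.charAt(i) == '\n')) {
                    i++;
                }
                // Only emit NL at depth zero; otherwise the run is dropped.
                if (parensLevel == 0) {
                    tokens.add("NL");
                }
            } else if (c == ' ' || c == '\t') {
                i++; // SPACES -> skip
            } else if (c == '"') {
                int end = input.indexOf('"', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated string at index " + i);
                }
                tokens.add("STRING");
                i = end + 1;
            } else if (c == '=') {
                tokens.add("ASSIGN");
                i++;
            } else if (c == ',') {
                tokens.add("COMMA");
                i++;
            } else if (isDigit(c)) {
                while (i < input.length() && isDigit(input.charAt(i))) {
                    i++;
                }
                // Optional fraction: '.' followed by at least one digit.
                if (i + 1 < input.length() && input.charAt(i) == '.'
                        && isDigit(input.charAt(i + 1))) {
                    i++;
                    while (i < input.length() && isDigit(input.charAt(i))) {
                        i++;
                    }
                }
                tokens.add("NUMBER");
            } else if (isAsciiLetter(c)) {
                while (i < input.length() && isAsciiLetter(input.charAt(i))) {
                    i++;
                }
                tokens.add("ID");
            } else {
                throw new IllegalArgumentException("Unexpected character at index " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("cook(\"banana\",\n\"potatoe\")\nvarA = 12.4"));
    }
}
```

Like the grammar, this sketch lets the counter go negative on a stray `)`, after which no `NL` is produced any more; clamping the counter at zero would be one possible hardening, though that goes beyond what the answer shows.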