In Java using ANTLR4, check for a valid expression and argument types
Question
```
// Lexer: FunctionValidateLexer.g4
lexer grammar FunctionValidateLexer;
// '"' is not part of NAME, so a quote always starts a quoted string
NAME: [A-Za-z0-9.`~!@#+%_\-]+;
PERCENT:'%';
ASTERISK:'*';
OPENSQBRACKET:'[';
CLOSEDSQBRACKET:']';
AMPERSAND:'&';
CAP:'^';
DOT: '.';
L_BRACKET: '(';
R_BRACKET: ')';
HYPHEN:'-';
UNDERSCORE:'_';
DOLLAR:'$';
PLUS:'+';
WS : [ \t\r\n]+ -> skip;
// Fragments for single and double quoted strings; they do not produce tokens of their own
fragment SINGLE_QUOTED_STRING: '\'' (~['\r\n\\] | '\\\'')* '\'';
fragment DOUBLE_QUOTED_STRING: '"' (~["\r\n\\] | '\\"')* '"';
// The token the parser sees; commas and parentheses inside the quotes are part of it
QUOTED_CONTENT: (SINGLE_QUOTED_STRING | DOUBLE_QUOTED_STRING);
// Outside quoted strings, a comma separates arguments
COMMA: ',';
```
```
// Parser: FunctionValidateParser.g4
parser grammar FunctionValidateParser;
options { tokenVocab=FunctionValidateLexer; }
functions : function* EOF;
function : NAME '(' (argument (COMMA argument)*)? ')';
argument: (NAME | function | QUOTED_CONTENT);
```
In the lexer rules above, I've added definitions for single- and double-quoted strings (`SINGLE_QUOTED_STRING` and `DOUBLE_QUOTED_STRING`). They match the text between single or double quotes, including escaped quotes inside the string. Both are declared as `fragment` rules, so they never become tokens on their own.
I've also introduced a new lexer rule called `QUOTED_CONTENT`, which combines the two fragments into the single token the parser sees. Commas and parentheses that appear inside the quotes are simply part of that token.
The double quote is removed from the `NAME` character set. Otherwise an input such as `"test"` would match `NAME` and `QUOTED_CONTENT` with the same length, and ANTLR would pick `NAME` because it is defined first.
No lexer modes are needed for `COMMA`, `L_BRACKET` and `R_BRACKET`. The lexer always prefers the longest match, so a quoted string is consumed as one token before any comma or parenthesis inside it is considered. Outside quotes, these three tokens keep their structural role.
These changes treat commas and parentheses within quoted strings differently from those outside, addressing your requirement to accept them as argument content only when they appear between single or double quotes.
The parser grammar remains largely unchanged, but now it includes the `QUOTED_CONTENT` lexer token as a valid argument for functions. This ensures that quoted content, including commas and parentheses, is parsed as a single argument.
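The intended effect (commas and parentheses are structural outside quotes and ordinary text inside them) can be tried in plain Java without the ANTLR toolchain. The following is only a minimal hand-written sketch, not the lexer ANTLR generates; the class name `QuoteAwareTokenizer` and its `tokenize` method are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteAwareTokenizer {

    /** Splits the input into tokens; everything between matching quotes stays in one token. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                                     // whitespace between tokens is skipped
            } else if (c == '(' || c == ')' || c == ',') {
                tokens.add(String.valueOf(c));           // structural character outside quotes
                i++;
            } else if (c == '"' || c == '\'') {
                int end = i + 1;
                while (end < input.length() && input.charAt(end) != c) {
                    if (input.charAt(end) == '\\') {
                        end++;                           // step over the escaped character
                    }
                    end++;
                }
                if (end >= input.length()) {
                    throw new IllegalArgumentException("Unterminated string at index " + i);
                }
                tokens.add(input.substring(i, end + 1)); // quotes are kept as part of the token
                i = end + 1;
            } else {
                int end = i;
                while (end < input.length()
                        && "(),\"'".indexOf(input.charAt(end)) < 0
                        && !Character.isWhitespace(input.charAt(end))) {
                    end++;
                }
                tokens.add(input.substring(i, end));     // unquoted name or literal
                i = end;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Example 5 from the question: append("[1,2,3","]")
        System.out.println(tokenize("append(\"[1,2,3\",\"]\")"));
    }
}
```

The comma inside `"[1,2,3"` stays within the string token, while the comma between the two arguments is emitted as its own token, which is the behaviour the `QUOTED_CONTENT` rule is meant to give the parser.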
I am new to ANTLR4. Using ANTLR4 and Java, how can I parse nested expressions, check whether each argument is an int, string, decimal, or boolean, and check that the expression is valid?
Example:
1. toString("test")
2. mul(toNumber("1.6"),add(3.14,1.5))
3. getRandomNumber()
4. split(split("1/2,3/4,4/5",","),"/")
5. append("[1,2,3","]")
Below are the expression names and their argument types, used to check whether an expression is valid.
```
Map<String,String> map=new HashMap<>();
map.put("toString","String");
map.put("mul","decimal,decimal");
map.put("toNumber","String");
map.put("add","decimal,decimal");
map.put("generateRandomNumber","");
```
Using the map above, we have to check that each expression name is correct and, for nested expressions, that the return type is correct, since a nested expression is an argument to another expression. If the expression name is correct, we also have to check whether the arguments are correct. I have written a lexer and parser that work, but they fail for inputs such as [, ], ", ' and the comma, because a comma (,) is also used to separate arguments. Below are the lexer and parser.
Lexer: FunctionValidateLexer.g4
```
lexer grammar FunctionValidateLexer;
NAME: [A-Za-z0-9."`~!@#+%_-]+;
PERCENT:'%';
ASTERICK:'*';
OPENSQBRKET:'\\[';
CLOSEDSQBRKET:'\\]';
AMPERSAND:'&';
CAP:'^';
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
HIPHEN:'-';
UNDERSCORE:'_';
DOLLAR:'$';
PLUS:'+';
WS : [ \t\r\n]+ -> skip;
```
Parser: FunctionValidateParser.g4
```
parser grammar FunctionValidateParser;
options { tokenVocab=FunctionValidateLexer; }
functions : function* EOF;
function : NAME '(' (argument (COMMA argument)*)? ')';
argument: (NAME | function );
```
I have written a visitor for expression-name and argument validation, but I am facing a problem defining a lexer and parser that accept the required arguments.
How can I change the lexer and parser to accept all characters except the comma (,) and round brackets ( ( ) )? A comma or round bracket should be treated as part of an argument only when it is between double or single quotes (like ',' or "," or "(" or ")").
As described above, I want to accept all characters like ` ! @ # $ % ^ & * [ ] / ? < > : ; " " \ | . + - } { . But since round brackets and commas are part of the expression syntax, they should be accepted only when they are between single or double quotes; otherwise an error should be thrown. How can I modify my lexer and parser to meet this requirement?
Answer 1
Score: 1
I don't understand why you're not matching strings: `" ... "`. This makes no sense to me. The following grammar parses all of your example input:
```
parse : function* EOF;
function : ID '(' expr_list? ')';
expr_list : expr (',' expr)*;
expr : function | STRING | NUMBER | ID;
STRING : '"' ~'"'* '"';
NUMBER : [0-9]+ ('.' [0-9]+)?;
ID : [a-zA-Z_] [a-zA-Z_0-9]*;
SPACES : [ \t\r\n]+ -> skip;
```
[![enter image description here][1]][1]
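The grammar only checks syntax; the name, arity, and argument-type checks from the question still have to happen afterwards, for example in a visitor. Below is a minimal sketch of that step in plain Java, independent of the ANTLR runtime. Everything in it is an assumption made for illustration: the `Node` and `Signature` classes are hypothetical stand-ins for whatever a visitor would build from the parse tree, the return types are guesses (the question's map lists parameter types only), `getRandomNumber` is named after example 3, and an `int` literal is assumed to be acceptable where a `decimal` is expected.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SignatureChecker {

    /** Hypothetical AST node: a call such as add(3.14,1.5), or a literal such as "test". */
    public static final class Node {
        final String text;      // function name, or literal text including its quotes
        final List<Node> args;  // null for literals
        Node(String text, List<Node> args) { this.text = text; this.args = args; }
        public static Node lit(String text) { return new Node(text, null); }
        public static Node call(String name, Node... args) { return new Node(name, Arrays.asList(args)); }
    }

    /** Parameter types plus an assumed return type for one function. */
    static final class Signature {
        final String returns;
        final List<String> params;
        Signature(String returns, String... params) {
            this.returns = returns;
            this.params = Arrays.asList(params);
        }
    }

    static final Map<String, Signature> SIGNATURES = new HashMap<>();
    static {
        // Parameter lists follow the question's map; the return types are assumptions.
        SIGNATURES.put("toString", new Signature("String", "String"));
        SIGNATURES.put("toNumber", new Signature("decimal", "String"));
        SIGNATURES.put("mul", new Signature("decimal", "decimal", "decimal"));
        SIGNATURES.put("add", new Signature("decimal", "decimal", "decimal"));
        SIGNATURES.put("getRandomNumber", new Signature("decimal"));
    }

    /** Classifies a literal by its text: quoted, true/false, whole number, or number with a fraction. */
    public static String literalType(String text) {
        if (text.length() >= 2 && (text.startsWith("\"") && text.endsWith("\"")
                || text.startsWith("'") && text.endsWith("'"))) return "String";
        if (text.equals("true") || text.equals("false")) return "boolean";
        if (text.matches("-?[0-9]+")) return "int";
        if (text.matches("-?[0-9]+\\.[0-9]+")) return "decimal";
        throw new IllegalArgumentException("Unrecognised literal: " + text);
    }

    /** Returns the type of the node; throws if a name, argument count, or argument type is wrong. */
    public static String typeOf(Node node) {
        if (node.args == null) return literalType(node.text);
        Signature sig = SIGNATURES.get(node.text);
        if (sig == null) throw new IllegalArgumentException("Unknown function: " + node.text);
        if (sig.params.size() != node.args.size())
            throw new IllegalArgumentException(node.text + " expects " + sig.params.size() + " argument(s)");
        for (int i = 0; i < sig.params.size(); i++) {
            String expected = sig.params.get(i);
            String actual = typeOf(node.args.get(i));   // nested calls are checked recursively
            boolean widened = expected.equals("decimal") && actual.equals("int"); // assumed widening
            if (!expected.equals(actual) && !widened)
                throw new IllegalArgumentException(node.text + " argument " + (i + 1)
                        + ": expected " + expected + " but got " + actual);
        }
        return sig.returns;
    }

    public static void main(String[] args) {
        // Example 2 from the question: mul(toNumber("1.6"),add(3.14,1.5))
        Node expr = Node.call("mul",
                Node.call("toNumber", Node.lit("\"1.6\"")),
                Node.call("add", Node.lit("3.14"), Node.lit("1.5")));
        System.out.println(typeOf(expr));
    }
}
```

Because a nested call is typed by its assumed return type before the enclosing call is checked, an expression such as `toString(add(1,2))` is rejected under these assumptions: `add` yields a decimal where `toString` expects a String.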
[1]: https://i.stack.imgur.com/H74Sa.png