April 10, 2020, 02:51:17

English:

In Java using ANTLR4, check for a valid expression and argument types

Question

// Lexer: FunctionValidateLexer.g4

lexer grammar FunctionValidateLexer;

NAME: [A-Za-z0-9."`~!@#+%_\-]+;
PERCENT:'%';
ASTERISK:'*';
OPENSQBRACKET:'[';
CLOSEDSQBRACKET:']';
AMPERSAND:'&';
CAP:'^';
DOT: '.';
L_BRACKET: '(';
R_BRACKET: ')';
HYPHEN:'-';
UNDERSCORE:'_';
DOLLAR:'$';
PLUS:'+';
WS : [ \t\r\n]+ -> skip;

// Define a lexer rule for single and double quoted strings
SINGLE_QUOTED_STRING: '\'' (~['\r\n\\] | '\\\'')* '\'';
DOUBLE_QUOTED_STRING: '"' (~["\r\n\\] | '\\"')* '"';

// Define a lexer rule for handling commas and parentheses within quoted strings
QUOTED_CONTENT: (SINGLE_QUOTED_STRING | DOUBLE_QUOTED_STRING);

// Define a lexer rule for commas and parentheses that are not within quoted strings
COMMA: ',' -> pushMode(InsideComma);
OPEN_PAREN: '(' -> pushMode(InsideParen);
CLOSE_PAREN: ')' -> popMode;

mode InsideComma;
    NON_COMMA: ~[,\r\n]+ -> popMode;

mode InsideParen;
    NON_PAREN: ~[\(\r\n]+ -> popMode;

// Parser: FunctionValidateParser.g4

parser grammar FunctionValidateParser;
options { tokenVocab=FunctionValidateLexer; }

functions : function* EOF;
function : NAME '(' (argument (COMMA argument)*)? ')';
argument: (NAME | function | QUOTED_CONTENT);

In the lexer rules above, I've added definitions for single and double quoted strings (SINGLE_QUOTED_STRING and DOUBLE_QUOTED_STRING). Each rule matches everything from an opening quote to the matching closing quote on the same line, and accepts a backslash-escaped quote inside the string.

I've also introduced a new lexer rule called QUOTED_CONTENT, which matches either kind of quoted string as a whole, so commas and parentheses may appear inside the quotes.
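To make the quoted-string rules concrete, here is a minimal sketch that restates them as a java.util.regex pattern. It is only an illustration under assumptions: the class name QuotedStringDemo and the method quotedTokens are hypothetical, and a regex scan is not how ANTLR tokenizes input. It shows that commas and parentheses inside quotes stay part of a single quoted token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedStringDemo {
    // Regex counterparts of SINGLE_QUOTED_STRING and DOUBLE_QUOTED_STRING:
    // an opening quote, then characters that are not the quote, a line break,
    // or a backslash (or a backslash-escaped quote), then the closing quote.
    static final Pattern QUOTED = Pattern.compile(
        "'(?:[^'\\r\\n\\\\]|\\\\')*'"
        + "|"
        + "\"(?:[^\"\\r\\n\\\\]|\\\\\")*\"");

    // Collects every quoted string found in the input, quotes included.
    static List<String> quotedTokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // The comma inside the second string is kept as string content.
        System.out.println(quotedTokens("split(\"1/2,3/4\",\",\")"));
    }
}
```

Only the separating comma between the two strings is left outside the quoted tokens, which is the distinction the question asks for.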

In the lexer rules for COMMA and OPEN_PAREN, I've added modes (InsideComma and InsideParen) to handle situations where commas and parentheses are encountered outside of quoted strings.

The InsideComma and InsideParen modes each contain a single rule, NON_COMMA and NON_PAREN respectively. NON_COMMA matches a run of characters up to the next comma or line break, NON_PAREN matches a run of characters up to the next opening parenthesis or line break, and each rule then pops the lexer back to the default mode.

These changes in the lexer allow for handling commas and parentheses within quoted strings differently from those outside of quoted strings, addressing your requirement to consider them only when they appear between single or double quotes.

The parser grammar remains largely unchanged, but now it includes the QUOTED_CONTENT lexer token as a valid argument for functions. This ensures that quoted content, including commas and parentheses, is correctly parsed as arguments when they are within quotes.

English:

I am new to ANTLR4. Using ANTLR4 and Java, how can I parse nested expressions, check whether each argument is an int, string, decimal, or boolean, and check that the expression is valid?

Example:

1. toString("test")
2. mul(toNumber("1.6"),add(3.14,1.5))
3. getRandomNumber()
4. split(split("1/2,3/4,4/5",","),"/")
5. append("[1,2,3","]")

Below are the expression names used to check whether an expression is valid.

Map<String,String> map=new HashMap<>();
map.put("toString","String");
map.put("mul","decimal,decimal");
map.put("toNumber","String");
map.put("add","decimal,decimal");
map.put("generateRandomNumber","");

So, using the map above, we have to check whether the expression name is correct and, for a nested expression, whether its return type is correct, since it becomes an argument of another expression. If the expression name is correct, we also have to check whether the arguments are correct. I have written a lexer and a parser and they work, but they fail for inputs containing characters such as [, ], ", ' and the comma, because the comma (,) is also what separates the arguments of an expression. Below are the lexer and parser.
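The name and argument check described above could be sketched as follows. This is a rough illustration, not the asker's visitor: the class SignatureCheck and the method typeOfCall are hypothetical names, the RETURNS table is an assumption (the map in the question stores parameter types only), and types are compared by exact name with no int-to-decimal widening.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SignatureCheck {
    // Parameter types per function, in the comma-separated format of the map above.
    static final Map<String, String> PARAMS = new HashMap<>();
    // Return types are NOT in the original map; these values are assumptions.
    static final Map<String, String> RETURNS = new HashMap<>();

    static {
        register("toString", "String", "String");
        register("mul", "decimal,decimal", "decimal");
        register("toNumber", "String", "decimal");
        register("add", "decimal,decimal", "decimal");
        register("generateRandomNumber", "", "decimal");
    }

    static void register(String name, String params, String returns) {
        PARAMS.put(name, params);
        RETURNS.put(name, returns);
    }

    /** Returns the call's return type, or throws if the name or the argument types are wrong. */
    static String typeOfCall(String name, List<String> argTypes) {
        String spec = PARAMS.get(name);
        if (spec == null) {
            throw new IllegalArgumentException("unknown function: " + name);
        }
        // An empty spec means "no parameters"; "".split(",") would yield one empty entry.
        List<String> expected = spec.isEmpty()
            ? Collections.<String>emptyList()
            : Arrays.asList(spec.split(","));
        if (!expected.equals(argTypes)) {
            throw new IllegalArgumentException(
                name + " expects " + expected + " but got " + argTypes);
        }
        return RETURNS.get(name);
    }

    public static void main(String[] args) {
        // mul(toNumber("1.6"), add(3.14, 1.5)): the inner calls are typed first,
        // and their return types become the argument types of the outer call.
        String left = typeOfCall("toNumber", Arrays.asList("String"));
        String right = typeOfCall("add", Arrays.asList("decimal", "decimal"));
        System.out.println(typeOfCall("mul", Arrays.asList(left, right)));
    }
}
```

In a visitor, the same idea would apply bottom-up: visiting an argument yields its type, and visiting a function node looks up the name and compares the collected argument types.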

Lexer:
FunctionValidateLexer.g4

lexer grammar FunctionValidateLexer;
NAME: [A-Za-z0-9."`~!@#+%_-]+;
PERCENT:'%';
ASTERICK:'*';
OPENSQBRKET:'\\[';
CLOSEDSQBRKET:'\\]';
AMPERSAND:'&';
CAP:'^';
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
HIPHEN:'-';
UNDERSCORE:'_';
DOLLAR:'$';
PLUS:'+';
WS : [ \t\r\n]+ -> skip;

Parser:
FunctionValidateParser.g4

parser grammar FunctionValidateParser;
options { tokenVocab=FunctionValidateLexer; }
functions : function* EOF;
function : NAME '(' (argument (COMMA argument)*)? ')';
argument: (NAME | function );

I have written a visitor for validating expression names and arguments, but I am having trouble defining a lexer and parser that accept the required arguments.

How can I change the lexer and parser to accept all characters except the comma (,) and round brackets ( ( and ) )? A comma or round bracket should be treated as part of an argument only when it appears between two double or single quotes (like ',' or "," or "(" or ")").

So, as described above, I want to accept all characters like ` ! @ # $ % ^ & * [ ] / ? < > : ; " " \ | . + - } { . But because round brackets and commas are part of the expression syntax, they should be accepted only when they are between single or double quotes; otherwise an error should be thrown. How can I modify my lexer and parser to meet this requirement?

Answer 1

Score: 1

I don't understand why you're not matching strings: `" ... "`. This makes no sense to me. The following grammar parses all of your example input:

parse     : function* EOF;
function  : ID '(' expr_list? ')';
expr_list : expr (',' expr)*;
expr      : function | STRING | NUMBER | ID;

STRING    : '"' ~'"'* '"';
NUMBER    : [0-9]+ ('.' [0-9]+)?;
ID        : [a-zA-Z_] [a-zA-Z_0-9]*;
SPACES    : [ \t\r\n]+ -> skip;

[![enter image description here][1]][1]
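As a way to check the grammar's structure without generating ANTLR classes, here is a small hand-written recursive-descent recognizer in plain Java that mirrors the four parser rules and four lexer rules above. It is a sketch, not ANTLR output: the class name MiniFunctionParser and the method isValid are hypothetical, and unlike ANTLR it stops at the first error instead of attempting recovery.

```java
public class MiniFunctionParser {
    private final String src;
    private int pos = 0;

    private MiniFunctionParser(String src) { this.src = src; }

    /** parse : function* EOF; returns true when the whole input matches. */
    public static boolean isValid(String input) {
        MiniFunctionParser p = new MiniFunctionParser(input);
        try {
            p.skipSpaces();
            while (p.pos < p.src.length()) {
                p.function();
                p.skipSpaces();
            }
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // function : ID '(' expr_list? ')'
    private void function() { id(); arguments(); }

    // '(' expr_list? ')' with expr_list : expr (',' expr)*
    private void arguments() {
        expect('(');
        skipSpaces();
        if (peek() != ')') {
            expr();
            skipSpaces();
            while (peek() == ',') {
                pos++;
                expr();
                skipSpaces();
            }
        }
        expect(')');
    }

    // expr : function | STRING | NUMBER | ID
    private void expr() {
        skipSpaces();
        char c = peek();
        if (c == '"') {
            string();
        } else if (isDigit(c)) {
            number();
        } else {
            id();
            skipSpaces();
            if (peek() == '(') arguments(); // an ID followed by '(' is a nested call
        }
    }

    // STRING : '"' ~'"'* '"'  (commas and parentheses are plain content here)
    private void string() {
        pos++; // opening quote
        while (pos < src.length() && src.charAt(pos) != '"') pos++;
        if (pos >= src.length()) throw error("unterminated string");
        pos++; // closing quote
    }

    // NUMBER : [0-9]+ ('.' [0-9]+)?
    private void number() {
        digits();
        if (peek() == '.' && pos + 1 < src.length() && isDigit(src.charAt(pos + 1))) {
            pos++;
            digits();
        }
    }

    private void digits() { while (isDigit(peek())) pos++; }

    // ID : [a-zA-Z_] [a-zA-Z_0-9]*
    private void id() {
        if (!isIdStart(peek())) throw error("identifier expected");
        pos++;
        while (isIdStart(peek()) || isDigit(peek())) pos++;
    }

    private void expect(char c) {
        skipSpaces();
        if (peek() != c) throw error("'" + c + "' expected");
        pos++;
    }

    // SPACES : [ \t\r\n]+ -> skip
    private void skipSpaces() {
        while (pos < src.length() && " \t\r\n".indexOf(src.charAt(pos)) >= 0) pos++;
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }
    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }
    private static boolean isIdStart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }
    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at offset " + pos);
    }
}
```

Calling MiniFunctionParser.isValid on each of the five example inputs from the question returns true, while an input such as add(1,) is rejected because an expression is required after the comma.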


  [1]: https://i.stack.imgur.com/H74Sa.png
