Java grammar production rule syntax of Oracle manual
Question
I have a syntactic doubt about a Java grammar rule given in Oracle's Java specification manual. Here is an approximation of that rule, to the extent that SO's HTML restrictions allow it.
<pre><em>ArrayInitializer:
  </em>{<em> [VariableInitializerList] [,] </em>}<em>
VariableInitializerList:
  VariableInitializer {, VariableInitializer}
</em></pre>
Section 2.4 of the Java manual says that <code>[x]</code> denotes zero or one occurrences of x and <code>{x}</code> denotes zero or more.
However, I have the following doubts:
- For the <em>ArrayInitializer</em> non-terminal, does the first curly brace <code>{</code> denote a terminal curly brace, or does it carry the syntactic meaning I mentioned above?
- Also, for the <em>VariableInitializerList</em> non-terminal, I know that <code><em>{</em>,<em> VariableInitializer}</em></code> means something equivalent to the regex <code>(a,b)*</code>, but this kind of grammar will also accept some other strings which do not actually fit the criterion, won't it?
- I also want to confirm whether the square brackets in the first production denote the regex-like meaning or simple terminals.
I find this grammar specification confusing. Can you help me understand it?
Answer 1
Score: 1
The Java specification uses font styles to distinguish between literal characters, as found in the input, and grammatical symbols (non-terminals and grammar operators). Literals are shown in <code>fixed width</code>, while grammar symbols are shown in <em>italics</em>.
That's a pretty subtle distinction, particularly for certain punctuation symbols [Note 1]. Fortunately, the only punctuation used as grammar operators are brackets and braces, and it's not that hard to see whether a brace is slanting (italic) or upright. The brace in <em>ArrayInitializer</em> is upright, and the bracket is slanting, as is the brace in <em>VariableInitializerList</em>. So the brace in <em>ArrayInitializer</em> is a literal character. The brackets in that production indicate that the enclosed grammar symbols are optional, and the braces in <em>VariableInitializerList</em> indicate that the enclosed symbols can be repeated any number of times, including zero. (That's effectively the Kleene <code>*</code> operator, which, as you say, is used in regular expressions.)
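To make that reading concrete, here is a small Java sketch of array initializers, each of which should be accepted if the outer braces are literal and the bracketed parts are optional. The class and field names are made up for illustration; only the initializer shapes matter.

```java
public class ArrayInitializerForms {
    // Each field exercises one expansion of
    //   ArrayInitializer: { [VariableInitializerList] [,] }
    // with the outer braces read as literal characters.

    static int[] empty = {};                      // list omitted, comma omitted
    static int[] commaOnly = {,};                 // list omitted, optional comma present
    static int[] single = {1};                    // one initializer, no repetition
    static int[] trailing = {1, 2,};              // two initializers plus optional comma
    static int[][] nested = {{1, 2}, {}, {3,}};   // each initializer is itself an ArrayInitializer

    public static void main(String[] args) {
        System.out.println(empty.length);      // 0
        System.out.println(commaOnly.length);  // 0
        System.out.println(single.length);     // 1
        System.out.println(trailing.length);   // 2
        System.out.println(nested.length);     // 3
    }
}
```

The <code>{,}</code> case is the surprising one: it follows from both bracketed parts being independently optional, and it produces an empty array.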
I trust that answers your questions (1) and (3). I don't really understand your question 2. Note that the comma in <code><em>VariableInitializer { </em>,<em> VariableInitializer }</em></code> is a literal character (it's not in italic) so what's being described is a non-empty comma-separated list of initializers. I don't know why you think that differs from other Kleene star operators.
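As a rough illustration of that list production in regex terms: if a single letter <code>a</code> stands in for one <em>VariableInitializer</em> (a simplification assumed here, since real initializers are arbitrary expressions or nested array initializers), the production corresponds to <code>a(,a)*</code> rather than <code>(a,b)*</code>. The class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

public class InitializerListRegex {
    // 'a' is a placeholder for one VariableInitializer (illustrative assumption).
    // VariableInitializer {, VariableInitializer}  ~  a(,a)*
    static final Pattern LIST = Pattern.compile("a(,a)*");

    // True when the whole string is a non-empty comma-separated list of 'a'.
    static boolean isList(String s) {
        return LIST.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"a", "a,a", "a,a,a", "", ",a", "a,"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isList(s));
        }
    }
}
```

Under this sketch the empty string, a leading comma, and a trailing comma are all rejected by the list itself. A trailing comma is still legal in an array initializer, but only because of the separate <code>[,]</code> in the <em>ArrayInitializer</em> production.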
Notes
- It doesn't help that a CSS bug affects the examples in section 2.4, which supposedly illustrate the grammar. The CSS forces everything in a "note" to be italicized, thereby hiding the distinction between grammar operators and literal characters.