This chapter describes the context-free grammars used in this specification to define the lexical and syntactic structure of a program.
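Summary
The grammars are the notation this specification uses to describe precisely the lexical and syntactic structure of Java programs (a language for describing a language).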
2.1. Context-Free Grammars
A context-free grammar consists of a number of productions. Each production has an abstract symbol called a nonterminal as its left-hand side, and a sequence of one or more nonterminal and terminal symbols as its right-hand side. For each grammar, the terminal symbols are drawn from a specified alphabet.
Starting from a sentence consisting of a single distinguished nonterminal, called the goal symbol, a given context-free grammar specifies a language, namely, the set of possible sequences of terminal symbols that can result from repeatedly replacing any nonterminal in the sequence with a right-hand side of a production for which the nonterminal is the left-hand side.
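Summary
1. A context-free grammar consists of a number of productions.
2. Each production has an abstract nonterminal symbol as its left-hand side and a sequence of one or more nonterminal and terminal symbols as its right-hand side.
3. The terminal symbols are drawn from a specified alphabet.
4. A nonterminal is an abstract symbol that can be replaced by the right-hand side of any production that has it as the left-hand side.
5. The language defined by a context-free grammar starts from a sentence consisting of a single distinguished nonterminal, the goal symbol.
6. The language is the set of terminal-symbol sequences that can result from repeatedly replacing any nonterminal in the sequence with a right-hand side of one of its productions.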
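The replacement process described above can be sketched in a few lines of Python. The grammar below (`Sum`, `Digit`) is a hypothetical toy, not part of the Java specification, and the function name `derive` is an assumption made for illustration. The sketch always replaces the leftmost nonterminal, which for a context-free grammar yields the same set of sentences as replacing nonterminals in any order.

```python
# Toy grammar (hypothetical, not taken from the Java specification).
# Keys are nonterminals; each value lists the alternative right-hand sides.
GRAMMAR = {
    "Sum": [["Digit"], ["Digit", "+", "Sum"]],
    "Digit": [["0"], ["1"]],
}
GOAL = "Sum"  # the goal symbol


def derive(max_steps):
    """Return every terminal-only sentence reachable from GOAL in at most
    max_steps replacements, always replacing the leftmost nonterminal."""
    sentences = set()
    frontier = [[GOAL]]
    for _ in range(max_steps):
        next_frontier = []
        for form in frontier:
            # Locate the leftmost nonterminal in this sentential form.
            index = next((i for i, s in enumerate(form) if s in GRAMMAR), None)
            if index is None:
                continue
            # Try every production whose left-hand side is that nonterminal.
            for rhs in GRAMMAR[form[index]]:
                new_form = form[:index] + rhs + form[index + 1:]
                if any(s in GRAMMAR for s in new_form):
                    next_frontier.append(new_form)
                else:
                    sentences.add(" ".join(new_form))
        frontier = next_frontier
    return sentences
```

With this toy grammar, `derive(2)` yields only the sentences `0` and `1`, while `derive(4)` also reaches the four sentences of the form `Digit + Digit`.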
2.2. The Lexical Grammar
A lexical grammar for the Java programming language is given in §3 (Lexical Structure). This grammar has as its terminal symbols the characters of the Unicode character set. It defines a set of productions, starting from the goal symbol Input (§3.5), that describe how sequences of Unicode characters (§3.1) are translated into a sequence of input elements (§3.2).
These input elements, with white space (§3.6) and comments (§3.7) discarded, form the terminal symbols for the syntactic grammar for the Java programming language and are called tokens (§3.5). These tokens are the identifiers (§3.8), keywords (§3.9), literals (§3.10), separators (§3.11), and operators (§3.12) of the Java programming language.
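Summary
1. The terminal symbols of the lexical grammar for the Java programming language are the characters of the Unicode character set.
2. The productions of the lexical grammar describe how sequences of Unicode characters are translated into a sequence of input elements.
3. With white space and comments discarded, these input elements are the terminal symbols of the syntactic grammar and are called tokens.
4. The tokens are the identifiers, keywords, literals, separators, and operators of the Java programming language.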
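As a rough illustration of how input elements become tokens, the following sketch scans a string, discards white space and comments, and classifies what remains. The patterns, the partial keyword set, and the name `tokenize` are all assumptions chosen for brevity; the real rules in §3 are far more detailed (Unicode escapes, many more literal forms, and so on).

```python
import re

# Simplified patterns: assumptions for illustration, not the real §3 rules.
TOKEN_SPEC = [
    ("comment", r"//[^\n]*|/\*.*?\*/"),
    ("whitespace", r"\s+"),
    ("literal", r"\d+"),
    ("word", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("separator", r"[(){}\[\];,.]"),
    ("operator", r"==|<=|>=|!=|[=+\-*/<>!]"),
]
KEYWORDS = {"if", "else", "for", "break", "return", "int"}  # partial list
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(source):
    """Translate characters into (kind, text) tokens, discarding
    white space and comments along the way."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("comment", "whitespace"):
            continue  # input elements that are not tokens
        if kind == "word":
            kind = "keyword" if text in KEYWORDS else "identifier"
        tokens.append((kind, text))
    return tokens
```

The resulting list contains only the five kinds of token named above, which is exactly what the syntactic grammar takes as its terminal symbols.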
2.3. The Syntactic Grammar
The syntactic grammar for the Java programming language is given in Chapters 4, 6-10, 14, and 15. This grammar has as its terminal symbols the tokens defined by the lexical grammar. It defines a set of productions, starting from the goal symbol CompilationUnit (§7.3), that describe how sequences of tokens can form syntactically correct programs.
For convenience, the syntactic grammar is presented all together in Chapter 19.
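Summary
1. The terminal symbols of the syntactic grammar for the Java programming language are the tokens defined by the lexical grammar (identifiers, keywords, literals, separators, and operators).
2. The syntactic grammar defines a set of productions that describe how sequences of tokens can form syntactically correct programs.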
2.4. Grammar Notation
Terminal symbols are shown in fixed width font in the productions of the lexical and syntactic grammars, and throughout this specification whenever the text is directly referring to such a terminal symbol. These are to appear in a program exactly as written.
Nonterminal symbols are shown in italic type. The definition of a nonterminal is introduced by the name of the nonterminal being defined, followed by a colon. One or more alternative definitions for the nonterminal then follow on succeeding lines.
For example, the syntactic production:
IfThenStatement:
`if` `(` Expression `)` Statement
states that the nonterminal IfThenStatement represents the token if, followed by a left parenthesis token, followed by an Expression, followed by a right parenthesis token, followed by a Statement.
The syntax {x} on the right-hand side of a production denotes zero or more occurrences of x.
For example, the syntactic production:
ArgumentList:
Argument {`,` Argument}
states that an ArgumentList consists of an Argument, followed by zero or more occurrences of a comma and an Argument. The result is that an ArgumentList may contain any positive number of arguments.
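One way to read the `{x}` notation is as a loop in a recognizer: the braced group may be matched zero or more times. The sketch below works on a list of token strings and makes a strong simplifying assumption, namely that an Argument is any single token other than a comma. The function name `parse_argument_list` is hypothetical.

```python
def parse_argument_list(tokens):
    """Split tokens of the form  Argument {, Argument}  into its arguments.

    Assumption for illustration: an Argument is any single token
    other than a comma.
    """
    if not tokens or tokens[0] == ",":
        raise ValueError("an ArgumentList must start with an Argument")
    arguments = [tokens[0]]
    pos = 1
    # Each loop iteration matches one occurrence of the group {, Argument}.
    while pos < len(tokens):
        if tokens[pos] != ",":
            raise ValueError(f"expected ',' at position {pos}")
        if pos + 1 >= len(tokens) or tokens[pos + 1] == ",":
            raise ValueError(f"expected an Argument after ',' at position {pos}")
        arguments.append(tokens[pos + 1])
        pos += 2
    return arguments
```

Because the first Argument is mandatory and the loop may run zero times, the result always holds at least one argument, matching the "any positive number" remark.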
The syntax [x] on the right-hand side of a production denotes zero or one occurrences of x. That is, x is an optional symbol. The alternative which contains the optional symbol actually defines two alternatives: one that omits the optional symbol and one that includes it.
This means that:
BreakStatement:
`break` [Identifier] `;`
is a convenient abbreviation for:
BreakStatement:
`break` `;`
`break` Identifier `;`
As another example, it means that:
BasicForStatement:
`for` `(` [ForInit] `;` [Expression] `;` [ForUpdate] `)` Statement
is a convenient abbreviation for:
BasicForStatement:
`for` `(` `;` [Expression] `;` [ForUpdate] `)` Statement
`for` `(` ForInit `;` [Expression] `;` [ForUpdate] `)` Statement
which in turn is an abbreviation for:
BasicForStatement:
`for` `(` `;` `;` [ForUpdate] `)` Statement
`for` `(` `;` Expression `;` [ForUpdate] `)` Statement
`for` `(` ForInit `;` `;` [ForUpdate] `)` Statement
`for` `(` ForInit `;` Expression `;` [ForUpdate] `)` Statement
which in turn is an abbreviation for:
BasicForStatement:
`for` `(` `;` `;` `)` Statement
`for` `(` `;` `;` ForUpdate `)` Statement
`for` `(` `;` Expression `;` `)` Statement
`for` `(` `;` Expression `;` ForUpdate `)` Statement
`for` `(` ForInit `;` `;` `)` Statement
`for` `(` ForInit `;` `;` ForUpdate `)` Statement
`for` `(` ForInit `;` Expression `;` `)` Statement
`for` `(` ForInit `;` Expression `;` ForUpdate `)` Statement
so the nonterminal BasicForStatement actually has eight alternative right-hand sides.
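The doubling can be checked mechanically. The sketch below represents a right-hand side as a list of strings and treats any string of the form `[X]` as an optional symbol; that encoding and the name `expand_optionals` are assumptions made for this illustration, and a real tool would need to distinguish the notation `[X]` from literal bracket tokens.

```python
def expand_optionals(rhs):
    """Expand every optional symbol "[X]" in a right-hand side into
    the alternatives that omit X and the alternatives that include X."""
    alternatives = [[]]
    for symbol in rhs:
        if len(symbol) > 2 and symbol.startswith("[") and symbol.endswith("]"):
            inner = symbol[1:-1]
            # Each optional symbol doubles the alternatives: omit, then include.
            alternatives = [
                alt + extra for alt in alternatives for extra in ([], [inner])
            ]
        else:
            alternatives = [alt + [symbol] for alt in alternatives]
    return alternatives


BASIC_FOR = ["for", "(", "[ForInit]", ";", "[Expression]", ";",
             "[ForUpdate]", ")", "Statement"]

for alternative in expand_optionals(BASIC_FOR):
    print(" ".join(alternative))
```

With three optional symbols the function returns 2 × 2 × 2 = 8 alternatives, in the same order as the list above.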
A very long right-hand side may be continued on a second line by clearly indenting the second line.
For example, the syntactic grammar contains this production:
NormalClassDeclaration:
{ClassModifier} `class` TypeIdentifier [TypeParameters]
    [ClassExtends] [ClassImplements] [ClassPermits] ClassBody
which defines one right-hand side for the nonterminal NormalClassDeclaration.
The phrase (one of) on the right-hand side of a production signifies that each of the symbols on the following line or lines is an alternative definition.
For example, the lexical grammar contains the production:
ZeroToThree:
(one of)
`0` `1` `2` `3`
which is merely a convenient abbreviation for:
ZeroToThree:
`0`
`1`
`2`
`3`
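The abbreviation amounts to splitting one line of symbols into separate single-symbol alternatives. A minimal sketch, assuming the symbols are separated by white space and using the hypothetical helper name `expand_one_of`:

```python
def expand_one_of(name, symbols_line):
    """Turn the symbols listed after "(one of)" into separate alternatives,
    one single-symbol right-hand side per listed symbol."""
    return {name: [[symbol] for symbol in symbols_line.split()]}


print(expand_one_of("ZeroToThree", "0 1 2 3"))
```

Each inner list is one alternative right-hand side for the named nonterminal.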
When an alternative in a production appears to be a token, it represents the sequence of characters that would make up such a token.
Thus, the production:
BooleanLiteral:
(one of)
`true` `false`
is shorthand for:
BooleanLiteral:
`t` `r` `u` `e`
`f` `a` `l` `s` `e`
The right-hand side of a production may specify that certain expansions are not permitted by using the phrase "but not" and then indicating the expansions to be excluded.
For example:
Identifier:
IdentifierChars but not a ReservedKeyword or BooleanLiteral or NullLiteral
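A "but not" clause behaves like set difference: first match the general form, then reject the excluded expansions. The sketch below assumes a deliberately simplified `IdentifierChars` (ASCII letters, digits, `_`, and `$`, not starting with a digit) and a partial keyword set; the real definitions in §3.8 and §3.9 are broader. All names here are hypothetical.

```python
import re

# Simplified stand-in for IdentifierChars (assumption: ASCII only).
IDENTIFIER_CHARS = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")

# Partial, illustrative sets; the specification lists many more keywords.
RESERVED_KEYWORDS = {"if", "for", "break", "class", "while", "return"}
BOOLEAN_LITERALS = {"true", "false"}
NULL_LITERAL = {"null"}
EXCLUDED = RESERVED_KEYWORDS | BOOLEAN_LITERALS | NULL_LITERAL


def is_identifier(text):
    """Accept text that matches IdentifierChars but is not an excluded word."""
    return IDENTIFIER_CHARS.fullmatch(text) is not None and text not in EXCLUDED
```

Note that the exclusion applies to the whole character sequence, so `iffy` is still accepted even though it begins with `if`.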
Finally, a few nonterminals are defined by a narrative phrase in roman type where it would be impractical to list all the alternatives.
For example:
RawInputCharacter:
any Unicode character