Original source: https://javacc.org/jjtree
http://www.yikemm.com/wordpress/2018/01/31/%E7%BF%BB%E8%AF%91%EF%BC%9Ajjtree-%E5%8F%82%E8%80%83%E6%96%87%E6%A1%A3/
Introduction
JJTree is a preprocessor for JavaCC™ that inserts parse tree building actions at various places in the JavaCC source. The output of JJTree is run through JavaCC to create the parser. This document describes how to use JJTree, and how you can interface your parser to it.
By default JJTree generates code to construct parse tree nodes for each nonterminal in the language. This behavior can be modified so that some nonterminals do not have nodes generated, or so that a node is generated for a part of a production’s expansion.
JJTree defines a Java interface Node that all parse tree nodes must implement. The interface provides methods for operations such as setting the parent of the node, and for adding children and retrieving them.
JJTree operates in one of two modes, simple and multi (for want of better terms). In simple mode each parse tree node is of concrete type SimpleNode; in multi mode the type of the parse tree node is derived from the name of the node. If you don’t provide implementations for the node classes JJTree will generate sample implementations based on SimpleNode for you. You can then modify the implementations to suit.
Although JavaCC is a top-down parser, JJTree constructs the parse tree from the bottom up. To do this it uses a stack where it pushes nodes after they have been created. When it finds a parent for them, it pops the children from the stack and adds them to the parent, and finally pushes the new parent node itself. The stack is open, which means that you have access to it from within grammar actions: you can push, pop and otherwise manipulate its contents however you feel appropriate. See Node Scopes and User Actions below for more important information.
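The bottom-up construction described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code JJTree generates: the class NodeStackSketch, its nested class N, and the methods open(), close() and demo() are hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a marked node stack; not JJTree's generated code. */
final class NodeStackSketch {
    /** Hypothetical stand-in for a parse tree node. */
    static final class N {
        final String name;
        final List<N> children = new ArrayList<>();
        N(String name) { this.name = name; }

        /** Renders the subtree as name(child, child, ...). */
        @Override public String toString() {
            if (children.isEmpty()) return name;
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    private final List<N> nodes = new ArrayList<>();
    private final List<Integer> marks = new ArrayList<>();

    /** Entering a node scope: remember how tall the stack is. */
    void open() { marks.add(nodes.size()); }

    /** Leaving a node scope: pop everything pushed since the mark, adopt it, push the parent. */
    void close(N parent) {
        int mark = marks.remove(marks.size() - 1);
        List<N> pushed = nodes.subList(mark, nodes.size());
        parent.children.addAll(pushed); // keeps left-to-right order
        pushed.clear();                 // removes the children from the stack
        nodes.add(parent);
    }

    N peek() { return nodes.get(nodes.size() - 1); }

    /** Builds a parent with two leaf children, the way nested scopes would. */
    static String demo() {
        NodeStackSketch s = new NodeStackSketch();
        s.open();                        // enter the parent scope
        s.open(); s.close(new N("P4"));  // leaf: nothing pushed inside its scope
        s.open(); s.close(new N("P5"));
        s.close(new N("P3"));            // adopts P4 and P5
        return s.peek().toString();
    }
}
```

Calling NodeStackSketch.demo() returns the string P3(P4, P5): the children were created first, and the parent picked them up when its scope closed.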
JJTree provides decorations for two basic varieties of nodes, and some syntactic shorthand to make their use convenient.
1. A definite node is constructed with a specific number of children. That many nodes are popped from the stack and made the children of the new node, which is then pushed on the stack itself. You notate a definite node like this:
#ADefiniteNode(INTEGER EXPRESSION)
A definite node descriptor expression can be any integer expression, although literal integer constants are by far the most common expressions.
2. A conditional node is constructed with all of the children that were pushed on the stack within its node scope if and only if its condition evaluates to true. If it evaluates to false, the node is not constructed, and all of the children remain on the node stack. You notate a conditional node like this:
#ConditionalNode(BOOLEAN EXPRESSION)
A conditional node descriptor expression can be any boolean expression. There are two common shorthands for conditional nodes:
1. Indefinite nodes: #IndefiniteNode is short for #IndefiniteNode(true)
2. Greater-than nodes: #GTNode(>1) is short for #GTNode(jjtree.arity() > 1)
The indefinite node shorthand (1) can lead to ambiguities in the JJTree source when it is followed by a parenthesized expansion. In those cases the shorthand must be replaced by the full expression. For example:
( ... ) #N ( a() )
is ambiguous; you have to use the explicit condition:
( ... ) #N(true) ( a() )
WARNING: node descriptor expressions should not have side-effects. JJTree doesn’t specify how many times the expression will be evaluated.
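To make the difference between definite and conditional nodes concrete, here is a minimal Java sketch of the two closing rules, including the greater-than shorthand. It is an illustration only: ScopeCloseSketch, N, closeDefinite() and closeConditional() are hypothetical names, and the sketch tracks a single scope mark rather than the nested marks a real parser needs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of definite vs. conditional node closing; single scope only. */
final class ScopeCloseSketch {
    /** Hypothetical stand-in for a parse tree node. */
    static final class N {
        final String name;
        final List<N> children = new ArrayList<>();
        N(String name) { this.name = name; }
        @Override public String toString() {
            return children.isEmpty() ? name : name + children.toString().replace('[', '(').replace(']', ')');
        }
    }

    final List<N> stack = new ArrayList<>();
    private int mark = 0; // stack height when the (single) scope was opened

    void openScope() { mark = stack.size(); }
    void push(N n) { stack.add(n); }

    /** Number of nodes pushed in the current scope, like jjtree.arity(). */
    int arity() { return stack.size() - mark; }

    /** Definite node: adopt exactly count nodes from the top of the stack. */
    void closeDefinite(N parent, int count) {
        List<N> top = stack.subList(stack.size() - count, stack.size());
        parent.children.addAll(top);
        top.clear();
        stack.add(parent);
    }

    /** Conditional node: adopt the whole scope if condition holds, else leave the stack alone. */
    boolean closeConditional(N parent, boolean condition) {
        if (!condition) return false; // node is not constructed; children stay on the stack
        List<N> pushed = stack.subList(mark, stack.size());
        parent.children.addAll(pushed);
        pushed.clear();
        stack.add(parent);
        return true;
    }

    /** Mimics #Add(>1): build an Add node only when more than one operand was pushed. */
    static String demoGreaterThan(int operands) {
        ScopeCloseSketch s = new ScopeCloseSketch();
        s.openScope();
        for (int i = 0; i < operands; i++) s.push(new N("Num"));
        s.closeConditional(new N("Add"), s.arity() > 1);
        return s.stack.toString();
    }

    /** Mimics #Pair(2): adopt the two topmost nodes regardless of scope. */
    static String demoDefinite() {
        ScopeCloseSketch s = new ScopeCloseSketch();
        s.push(new N("A")); s.push(new N("B")); s.push(new N("C"));
        s.closeDefinite(new N("Pair"), 2);
        return s.stack.toString();
    }
}
```

With one operand the stack is left as [Num], so no single-child Add wrapper appears in the tree; with two operands it becomes [Add(Num, Num)]. The definite demo leaves [A, Pair(B, C)].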
By default JJTree treats each nonterminal as an indefinite node and derives the name of the node from the name of its production. You can give it a different name with the following syntax:
void P1() #MyNode : { ... } { ... }
When the parser recognizes a P1 nonterminal it begins an indefinite node. It marks the stack, so that any parse tree nodes created and pushed on the stack by nonterminals in the expansion for P1 will be popped off and made children of the node MyNode.
If you want to suppress the creation of a node for a production you can use the following syntax:
void P2() #void : { ... } { ... }
Now any parse tree nodes pushed by nonterminals in the expansion of P2 will remain on the stack, to be popped and made children of a production further up the tree. You can make this the default behavior for non-decorated nodes by using the NODE_DEFAULT_VOID option.
void P3() : {}
{
  P4() ( P5() )+ P6()
}
In this example, an indefinite node P3 is begun, marking the stack, and then a P4 node, one or more P5 nodes and a P6 node are parsed. Any nodes that they push are popped and made the children of P3. You can further customize the generated tree:
void P3() : {}
{
  P4() ( P5() )+ #ListOfP5s P6()
}
Now the P3 node will have a P4 node, a ListOfP5s node and a P6 node as children. The #Name construct acts as a postfix operator, and its scope is the immediately preceding expansion unit.
Node Scopes and User Actions
Each node is associated with a node scope. User actions within this scope can access the node under construction by using the special identifier jjtThis to refer to the node. This identifier is implicitly declared to be of the correct type for the node, so any fields and methods that the node has can be easily accessed.
A scope is the expansion unit immediately preceding the node decoration. This can be a parenthesized expression. When the production signature is decorated (perhaps implicitly with the default node), the scope is the entire right hand side of the production including its declaration block.
You can also use an expression involving jjtThis on the left hand side of an expansion reference. For example:
... ( jjtThis.my_foo = foo() ) #Baz ...
Here jjtThis refers to a Baz node, which has a field called my_foo. The result of parsing the production foo() is assigned to that my_foo.
The final user action in a node scope is different from all the others. When the code within it executes, the node’s children have already been popped from the stack and added to the node, which has itself been pushed onto the stack. The children can now be accessed via the node’s methods such as jjtGetChild().
User actions other than the final one can only access the children on the stack. They have not yet been added to the node, so they aren’t available via the node’s methods.
A conditional node that has a node descriptor expression that evaluates to false will not get added to the stack, nor have children added to it. The final user action within a conditional node scope can determine whether the node was created or not by calling the nodeCreated() method. This returns true if the node’s condition was satisfied and the node was created and pushed on the node stack, and false otherwise.
Exception handling
An exception thrown by an expansion within a node scope that is not caught within the node scope is caught by JJTree itself. When this occurs, any nodes that have been pushed on to the node stack within the node scope are popped and thrown away. Then the exception is rethrown.
The intention is to make it possible for parsers to implement error recovery and continue with the node stack in a known state.
WARNING: JJTree currently cannot detect whether exceptions are thrown from user actions within a node scope. Such an exception will probably be handled incorrectly.
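The recovery behavior described above (discard what the failed scope pushed, then rethrow) can be sketched as follows. This is a simplified illustration, not the code JJTree emits: ScopeRecoverySketch, Expansion and parseScope() are hypothetical names, and plain strings stand in for nodes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of node stack cleanup when an expansion throws. */
final class ScopeRecoverySketch {
    /** Hypothetical expansion body; it may push nodes and it may throw. */
    interface Expansion { void run(List<String> stack) throws Exception; }

    private final List<String> stack = new ArrayList<>(); // node names stand in for nodes

    void parseScope(String nodeName, Expansion body) throws Exception {
        int mark = stack.size();
        try {
            body.run(stack);
            // Normal close: fold everything pushed in this scope into one node.
            List<String> pushed = stack.subList(mark, stack.size());
            String node = nodeName + pushed; // e.g. "Good[x]"
            pushed.clear();
            stack.add(node);
        } catch (Exception e) {
            // Failure: throw away whatever this scope pushed, then rethrow.
            stack.subList(mark, stack.size()).clear();
            throw e;
        }
    }

    static String demo() {
        ScopeRecoverySketch s = new ScopeRecoverySketch();
        try {
            s.parseScope("Good", st -> st.add("x"));
            s.parseScope("Bad", st -> {
                st.add("y");
                st.add("z");
                throw new Exception("simulated syntax error");
            });
        } catch (Exception expected) {
            // A real parser would resynchronize its input here before continuing.
        }
        return s.stack.toString();
    }
}
```

After the failed scope, the stack holds only [Good[x]]: the two nodes pushed before the error are gone, so recovery code starts from a known state.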
Node Scope Hooks
If the NODE_SCOPE_HOOK option is set to true, JJTree generates calls to two user-defined parser methods on the entry and exit of every node scope. The methods must have the following signatures:
void jjtreeOpenNodeScope(Node n)
void jjtreeCloseNodeScope(Node n)
If the parser is STATIC then these methods will have to be declared as static as well. They are both called with the current node as a parameter.
One use might be to store the parser object itself in the node so that state that should be shared by all nodes produced by that parser can be provided. For example, the parser might maintain a symbol table.
void jjtreeOpenNodeScope(Node n)
{
  ((SimpleNode)n).jjtSetValue(getSymbolTable());
}
void jjtreeCloseNodeScope(Node n)
{
}
Where getSymbolTable() is a user-defined method to return a symbol table structure for the node.
It is often useful to keep track of each node’s first and last token so that input can be easily reproduced again. By setting the TRACK_TOKENS option the generated SimpleNode class will contain four extra methods:
public Token jjtGetFirstToken()
public void jjtSetFirstToken(Token token)
public Token jjtGetLastToken()
public void jjtSetLastToken(Token token)
The first and last token for each node will be set up automatically when the parser is run.
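As a rough sketch of why the first and last tokens are useful, the following Java walks a token chain from one to the other and joins the images. The class TokenSpanSketch and its nested Tok are hypothetical stand-ins that only mirror the image and next fields of a JavaCC Token; joining with a single space is a simplification, since skipped whitespace and comments are not represented here.

```java
/** Hypothetical sketch: rebuild a node's source text from its first and last tokens. */
final class TokenSpanSketch {
    /** Minimal stand-in mirroring the image and next fields of a JavaCC Token. */
    static final class Tok {
        final String image;
        Tok next;
        Tok(String image) { this.image = image; }
    }

    /** Joins the images from first through last (inclusive) with single spaces. */
    static String sourceText(Tok first, Tok last) {
        StringBuilder sb = new StringBuilder();
        for (Tok t = first; t != null; t = t.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(t.image);
            if (t == last) break; // stop at the node's last token
        }
        return sb.toString();
    }

    static String demo() {
        Tok a = new Tok("x"), b = new Tok("="), c = new Tok("1"), d = new Tok(";");
        a.next = b; b.next = c; c.next = d;
        return sourceText(a, c); // the trailing ";" lies outside this node's span
    }
}
```

With a generated SimpleNode, the two arguments would come from jjtGetFirstToken() and jjtGetLastToken().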
The Life Cycle of a Node
A node goes through a well determined sequence of steps as it is built. This is that sequence viewed from the perspective of the node itself:
1. the node’s constructor is called with a unique integer parameter. This parameter identifies the kind of node and is especially useful in simple mode. JJTree automatically generates a file called parserTreeConstants.java that declares valid constants. The names of constants are derived by prepending JJT to the uppercase names of nodes, with dot symbols (“.”) replaced by underscore symbols (“_”). For convenience, an array of Strings called jjtNodeName[] that maps the constants to the unmodified names of nodes is maintained in the same file.
2. the node’s jjtOpen() method is called.
3. if the option NODE_SCOPE_HOOK is set, the user-defined parser method jjtreeOpenNodeScope() is called and passed the node as its parameter. This method can initialize fields in the node or call its methods. For example, it might store the node’s first token in the node.
4. if an unhandled exception is thrown while the node is being parsed then the node is abandoned. JJTree will never refer to it again. It will not be closed, and the user-defined node scope hook jjtreeCloseNodeScope() will not be called with it as a parameter.
5. otherwise, if the node is conditional and its conditional expression evaluates to false then the node is abandoned. It will not be closed, although the user-defined node scope hook jjtreeCloseNodeScope() might be called with it as a parameter.
6. otherwise, all of the children of the node as specified by the integer expression of a definite node, or all the nodes that were pushed on the stack within a conditional node scope are added to the node. The order they are added is not specified.
7. the node’s jjtClose() method is called.
8. the node is pushed on the stack.
9. if the option NODE_SCOPE_HOOK is set, the user-defined parser method jjtreeCloseNodeScope() is called and passed the node as its parameter.
10. if the node is not the root node, it is added as a child of another node and its jjtSetParent() method is called.
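The naming rule for the generated constants in step 1 can be expressed in a few lines of Java. This is a sketch of the rule as stated above, not JJTree's own implementation; NodeConstantNameSketch and constantName() are hypothetical names.

```java
import java.util.Locale;

/** Hypothetical sketch of the constant naming rule: JJT + uppercase name, dots to underscores. */
final class NodeConstantNameSketch {
    static String constantName(String nodeName) {
        return "JJT" + nodeName.toUpperCase(Locale.ROOT).replace('.', '_');
    }
}
```

For example, constantName("MyNode") gives JJTMYNODE and constantName("my.Node") gives JJTMY_NODE.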
Visitor Support
JJTree provides some basic support for the visitor design pattern. If the VISITOR option is set to true JJTree will insert a jjtAccept() method into all of the node classes it generates, and also generate a visitor interface that can be implemented and passed to the nodes to accept.
The name of the visitor interface is constructed by appending Visitor to the name of the parser. The interface is regenerated every time that JJTree is run, so that it accurately represents the set of nodes used by the parser. This will cause compile time errors if the implementation class has not been updated for the new nodes. This is a feature.
Options
JJTree supports the following options on the command line and in the JavaCC options statement:
BUILD_NODE_FILES (default: true)
Generate sample implementations for SimpleNode and any other nodes used in the grammar.
MULTI (default: false)
Generate a multi mode parse tree. The default for this is false, generating a simple mode parse tree.
NODE_DEFAULT_VOID (default: false)
Instead of making each non-decorated production an indefinite node, make it void instead.
NODE_CLASS (default: "")
If set, defines the name of a user-supplied class that will extend SimpleNode. Any tree nodes created will then be subclasses of NODE_CLASS.
NODE_FACTORY (default: "")
Specify a class containing a factory method with following signature to construct nodes:
public static Node jjtCreate(int id)
For backwards compatibility, the value false may also be specified, meaning that SimpleNode will be used as the factory class.
NODE_PACKAGE (default: "")
The package to generate the node classes into. The default for this is the parser package.
NODE_EXTENDS (default: "") Deprecated
The superclass for the SimpleNode class. By providing a custom superclass you may be able to avoid the need to edit the generated SimpleNode.java. See the examples/Interpreter for an example usage.
NODE_PREFIX (default: "AST")
The prefix used to construct node class names from node identifiers in multi mode. The default for this is AST.
NODE_SCOPE_HOOK (default: false)
Insert calls to user-defined parser methods on entry and exit of every node scope. See Node Scope Hooks above.
NODE_USES_PARSER (default: false)
JJTree will use an alternate form of the node construction routines where it passes the parser object in. For example,
public static Node MyNode.jjtCreate(MyParser p, int id);
MyNode(MyParser p, int id);
TRACK_TOKENS (default: false)
Insert jjtGetFirstToken(), jjtSetFirstToken(), jjtGetLastToken(), and jjtSetLastToken() methods in SimpleNode. The FirstToken is automatically set up on entry to a node scope; the LastToken is automatically set up on exit from a node scope.
STATIC (default: true)
Generate code for a static parser. The default for this is true. This must be used consistently with the equivalent JavaCC options. The value of this option is emitted in the JavaCC source.
VISITOR (default: false)
Insert a jjtAccept() method in the node classes, and generate a visitor implementation with an entry for every node type used in the grammar.
VISITOR_DATA_TYPE (default: "Object")
If this option is set, it is used in the signature of the generated jjtAccept() methods and the visit() methods as the type of the data argument.
VISITOR_RETURN_TYPE (default: "Object")
If this option is set, it is used in the signature of the generated jjtAccept() methods and the visit() methods as the return type of the method.
VISITOR_EXCEPTION (default: "")
If this option is set, it is used in the signature of the generated jjtAccept() methods and the visit() methods.
JJTREE_OUTPUT_DIRECTORY (default: use value of OUTPUT_DIRECTORY)
By default, JJTree generates its output in the directory specified in the global OUTPUT_DIRECTORY setting. Explicitly setting this option allows the user to separate the parser from the tree files.
JJTree state
JJTree keeps its state in a parser class field called jjtree. You can use methods in this member to manipulate the node stack.
final class JJTreeState {
  /* Call this to reinitialize the node stack. */
  void reset();
  /* Return the root node of the AST. */
  Node rootNode();
  /* Determine whether the current node was actually closed and
     pushed */
  boolean nodeCreated();
  /* Return the number of nodes currently pushed on the node
     stack in the current node scope. */
  int arity();
  /* Push a node on to the stack. */
  void pushNode(Node n);
  /* Return the node on the top of the stack, and remove it from the
     stack. */
  Node popNode();
  /* Return the node currently on the top of the stack. */
  Node peekNode();
}
Node Objects
/* All AST nodes must implement this interface. It provides basic
   machinery for constructing the parent and child relationships
   between nodes. */
public interface Node {
  /** This method is called after the node has been made the current
      node. It indicates that child nodes can now be added to it. */
  public void jjtOpen();
  /** This method is called after all the child nodes have been
      added. */
  public void jjtClose();
  /** This pair of methods are used to inform the node of its
      parent. */
  public void jjtSetParent(Node n);
  public Node jjtGetParent();
  /** This method tells the node to add its argument to the node's
      list of children. */
  public void jjtAddChild(Node n, int i);
  /** This method returns a child node. The children are numbered
      from zero, left to right. */
  public Node jjtGetChild(int i);
  /** Return the number of children the node has. */
  int jjtGetNumChildren();
}
The class SimpleNode implements the Node interface, and is automatically generated by JJTree if it doesn’t already exist. You can use this class as a template or superclass for your node implementations, or you can modify it to suit. SimpleNode additionally provides a rudimentary mechanism for recursively dumping the node and its children. You might use this in an action like this:
{
((SimpleNode)jjtree.rootNode()).dump(">");
}
The String parameter to dump() is used as padding to indicate the tree hierarchy.
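A minimal sketch of how such a padded recursive dump can work is shown below. It is not the generated SimpleNode code: DumpSketch and N are hypothetical names, and the sketch collects its output in a StringBuilder instead of printing, purely so the result is easy to inspect.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a recursive dump that grows the prefix by one space per level. */
final class DumpSketch {
    /** Hypothetical stand-in for a parse tree node. */
    static final class N {
        final String name;
        final List<N> children;
        N(String name, N... kids) {
            this.name = name;
            this.children = Arrays.asList(kids);
        }
    }

    /** Writes one line per node; each child level extends the prefix with a space. */
    static void dump(N node, String prefix, StringBuilder out) {
        out.append(prefix).append(node.name).append('\n');
        for (N child : node.children) {
            dump(child, prefix + " ", out);
        }
    }

    static String demo() {
        N root = new N("Start", new N("Add", new N("Id"), new N("Id")));
        StringBuilder out = new StringBuilder();
        dump(root, ">", out);
        return out.toString();
    }
}
```

The demo produces four lines: ">Start", "> Add", and two lines of ">  Id", so the depth of each node is visible from the width of its padding.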
Another utility method is generated if the VISITOR option is set:
{
public void childrenAccept(MyParserVisitor visitor);
}
This walks over the node’s children in turn, asking them to accept the visitor. This can be useful when implementing preorder and postorder traversals.
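The two traversal orders mentioned above differ only in whether the visitor handles the node before or after calling childrenAccept(). The Java below is a sketch with simplified, hypothetical types (TraversalSketch, N, Visitor, PreOrder, PostOrder); the data argument is typed as a List of String here for brevity, whereas the generated signatures depend on the VISITOR_DATA_TYPE and VISITOR_RETURN_TYPE options.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of preorder and postorder traversal built on childrenAccept(). */
final class TraversalSketch {
    /** Simplified visitor interface with a single node type. */
    interface Visitor { List<String> visit(N node, List<String> data); }

    /** Hypothetical stand-in for a generated node class. */
    static final class N {
        final String name;
        final List<N> children;
        N(String name, N... kids) {
            this.name = name;
            this.children = Arrays.asList(kids);
        }
        List<String> jjtAccept(Visitor v, List<String> data) { return v.visit(this, data); }
        /** Asks each child in turn, left to right, to accept the visitor. */
        List<String> childrenAccept(Visitor v, List<String> data) {
            for (N child : children) child.jjtAccept(v, data);
            return data;
        }
    }

    /** Records the node first, then descends. */
    static final class PreOrder implements Visitor {
        public List<String> visit(N node, List<String> out) {
            out.add(node.name);
            return node.childrenAccept(this, out);
        }
    }

    /** Descends first, then records the node. */
    static final class PostOrder implements Visitor {
        public List<String> visit(N node, List<String> out) {
            node.childrenAccept(this, out);
            out.add(node.name);
            return out;
        }
    }

    static String demo(Visitor v) {
        N root = new N("Add", new N("a"), new N("b"));
        return root.jjtAccept(v, new ArrayList<>()).toString();
    }
}
```

For the small tree Add(a, b), the preorder visitor yields [Add, a, b] and the postorder visitor yields [a, b, Add]; postorder is the natural fit for evaluating expressions, since operands are handled before their operator.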
Examples
JJTree is distributed with some simple examples containing a grammar that parses arithmetic expressions. See the file examples/JJTreeExamples/README for further details.
There is also an interpreter for a simple language that uses JJTree to build the program representation. See the file examples/Interpreter/README for more information.
Information about an example using the visitor support is in examples/VTransformer/README.