An abstract syntax tree (AST) is a tree representation of the structure of source code that omits language-specific details such as punctuation. An AST is composed of AST nodes, each of which represents a particular construct in the concrete syntax.
An Example
Here is an example of an AST for two simple expressions. Notice how the constructs of the original expressions are conveyed by the AST, while language-specific syntactic details (such as punctuation and the rules of operator precedence) are left out; precedence is instead reflected in the shape of the tree. The same AST is correct for both expressions.
Sample expression 1:
1 + 4 / 7
Sample expression 2:
1 + (4 / 7)
Sample AST:
"+" [
  1
  "/" [
    4
    7
  ]
]
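To make the sample concrete, here is a minimal sketch of how such a tree could be built and printed in C#. The `SimpleNode` and `SampleAst` types are hypothetical and exist only for this illustration; they are not part of the framework described in this topic.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical minimal node type for illustration only.
public sealed class SimpleNode
{
    public string Value { get; }
    public List<SimpleNode> Children { get; } = new List<SimpleNode>();

    public SimpleNode(string value, params SimpleNode[] children)
    {
        Value = value;
        Children.AddRange(children);
    }

    // Renders this node and its descendants, one node per line,
    // indented two spaces per level of depth.
    public string ToTreeText(int depth = 0)
    {
        var sb = new StringBuilder();
        sb.Append(new string(' ', depth * 2)).Append(Value).Append('\n');
        foreach (var child in Children)
            sb.Append(child.ToTreeText(depth + 1));
        return sb.ToString();
    }
}

public static class SampleAst
{
    // Builds the single tree shared by "1 + 4 / 7" and "1 + (4 / 7)".
    // The parentheses leave no trace; only the nesting records that
    // the division is an operand of the addition.
    public static SimpleNode Build() =>
        new SimpleNode("+",
            new SimpleNode("1"),
            new SimpleNode("/",
                new SimpleNode("4"),
                new SimpleNode("7")));
}
```

Calling `SampleAst.Build().ToTreeText()` produces the operator on the first line with its operands indented beneath it, mirroring the sample AST above.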
Introduction to IAstNode
The IAst
IParser implementations generally create some sort of AST (a tree of IAst
For instance, a code outliner could consume the IAst
IAstNode Members
Member | Description |
---|---|
Children Property | Gets the list containing the child AST nodes of this AST node. |
Contains Method | Returns whether the AST node contains the specified offset. |
End | Gets or sets the end offset of the AST node, if known. |
Find | Searches through the child nodes for a node that contains the specified offset. |
Find | Recursively searches through the descendant nodes for a node that contains the specified offset. |
Has | Gets whether the AST node contains any child AST nodes. |
Length Property | Gets the character length of this AST node, if known. |
Start | Gets or sets the start offset of the AST node, if known. |
To | Outputs the contents of the AST node in tree form. |
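
The offset-based members described in the table can be pictured with a small self-contained sketch. The `OffsetNode` type and all of its member names below are assumptions made for this illustration and are not taken from the `IAstNode` interface. The sketch also assumes that the end offset is exclusive and that "find descendant" means the deepest matching node.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for an offset-aware AST node.
public class OffsetNode
{
    public string Label { get; }
    public int StartOffset { get; }
    public int EndOffset { get; }   // exclusive, by assumption in this sketch
    public List<OffsetNode> Children { get; } = new List<OffsetNode>();

    public OffsetNode(string label, int startOffset, int endOffset, params OffsetNode[] children)
    {
        Label = label;
        StartOffset = startOffset;
        EndOffset = endOffset;
        Children.AddRange(children);
    }

    public int Length => EndOffset - StartOffset;
    public bool HasChildren => Children.Count > 0;

    // True when the offset falls inside [StartOffset, EndOffset).
    public bool Contains(int offset) => offset >= StartOffset && offset < EndOffset;

    // Looks only at direct children.
    public OffsetNode FindChild(int offset)
    {
        foreach (var child in Children)
            if (child.Contains(offset)) return child;
        return null;
    }

    // Walks down the tree and returns the deepest descendant that
    // contains the offset, or null when no child contains it.
    public OffsetNode FindDescendant(int offset)
    {
        var child = FindChild(offset);
        if (child == null) return null;
        return child.FindDescendant(offset) ?? child;
    }
}
```

A consumer such as an editor feature could use this kind of lookup to map a caret position to the innermost construct at that position.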
The DefaultAstNode Class
The Default
This class is a good choice when first prototyping a grammar or in scenarios where type-specific AST nodes are not required. It is the default AST node type used by the LL(*) Parser Framework's tree constructors that don't designate a specific AST node type.
Type-Specific AST Nodes
Sometimes it is beneficial to have specific types of AST nodes, where a distinct .NET class is created for each type of AST node. Take this snippet of C# for example, a simple class declaration:
class Foo {}
With the default AST nodes discussed above, a resulting AST would be something like:
ClassDeclaration[
  Name[
    "Foo"
  ]
]
It's easy to imagine how large the AST grows as you get into much more complex code, since additional AST nodes are typically used to describe the context of their contained nodes, such as with the Name node above.
In contrast, assume we build a class called ClassDeclaration like this:
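A minimal sketch of such a class is shown here. It assumes a single string property and omits any base type; a real type-specific node would derive from the framework's AST node types, so treat the exact shape as illustrative rather than as generated code.

```csharp
// Hypothetical type-specific AST node (illustration only).
public class ClassDeclaration
{
    // The declared class name, stored directly on the node
    // instead of in a nested Name child node.
    public string Name { get; set; }
}
```

With this shape, the declaration `class Foo {}` is represented by one `ClassDeclaration` instance whose `Name` is `"Foo"`.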
The ClassDeclaration class is a type-specific AST node, where instead of wrapping our Foo value node with another Name node, we simply set a string property called Name to the value Foo. The snippet that originally required three AST nodes to represent now uses just one AST node with a property set. Thus, the overall complexity of the AST is reduced.
Another major benefit of using type-specific AST nodes is that since they are .NET classes, you can fully extend them with partial classes, etc. This makes it easy to add helper methods/properties or to override the default ToString result.
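As a rough sketch of that idea, the two partial declarations below would normally live in separate files: one standing in for generated code and one written by hand. The `IsNamed` helper and the `ToString` format are invented for this example.

```csharp
// Part 1: stands in for generated code (hypothetical).
public partial class ClassDeclaration
{
    public string Name { get; set; }
}

// Part 2: hand-written additions, kept apart so that regenerating
// part 1 does not overwrite them.
public partial class ClassDeclaration
{
    // Helper property added on top of the generated members.
    public bool IsNamed => !string.IsNullOrEmpty(Name);

    // Custom text representation, useful when debugging.
    public override string ToString() => "ClassDeclaration(" + Name + ")";
}
```

The compiler merges both parts into a single class, so callers see `Name`, `IsNamed`, and the overridden `ToString` on the same type.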
The only downside of using type-specific AST nodes is that they can take more time to develop, since you need to define a class for each type of AST node your language grammar can generate. Luckily, we've made this almost a non-issue since the Language Designer tool has full type-specific AST node code generation features. All you have to do is indicate a few settings for each AST node type and its properties, and the code will be generated for you.
The AstNodeBase Class
The abstract Ast
The only requirement of inheritors is that they override its Get
For instance, a ClassDeclaration class might define a Members property where each member is represented by another AST node. In this scenario, the Get
If your type-specific AST nodes are generated by the Language Designer tool, all this work is done for you.
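The general pattern of a base class that asks inheritors to supply their children can be sketched as follows. Every name here is a placeholder invented for the sketch, including `SketchNodeBase` and the `EnumerateChildren` method; none of them is the actual framework member.

```csharp
using System.Collections.Generic;

// Hypothetical base class: inheritors only say which nodes are their
// children, and the base class derives the rest from that.
public abstract class SketchNodeBase
{
    // Placeholder name for the single member inheritors must override.
    protected abstract IEnumerable<SketchNodeBase> EnumerateChildren();

    // Builds a fresh snapshot list on each access.
    public IReadOnlyList<SketchNodeBase> Children =>
        new List<SketchNodeBase>(EnumerateChildren());

    public bool HasChildren => Children.Count > 0;
}

public class MethodDeclaration : SketchNodeBase
{
    public string Name { get; set; }

    // A leaf node in this sketch: it has no child nodes.
    protected override IEnumerable<SketchNodeBase> EnumerateChildren()
    {
        yield break;
    }
}

public class ClassDeclaration : SketchNodeBase
{
    public string Name { get; set; }
    public List<SketchNodeBase> Members { get; } = new List<SketchNodeBase>();

    // Each member is itself an AST node, so all members are children.
    protected override IEnumerable<SketchNodeBase> EnumerateChildren()
    {
        foreach (var member in Members)
            yield return member;
    }
}
```

Keeping the child list derived from typed properties such as `Members` means there is one source of truth: adding a member automatically makes it visible to generic tree-walking code.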