Posted 13 years ago by Krzysztof
Hi,

I've somehow created a lexical parser - it works (colors words etc.), but I'm having trouble with my semantic parser (I don't need any outlining etc., just parsing).

I'm trying to copy and then change your Simple language, but the results are miserable.

Here is my compilation unit:
   <NonTerminal Key="CompilationUnit">
      <Production>
        <![CDATA[
                <% 
                    compilationUnit = new CompilationUnit();
                    compilationUnit.StartOffset = this.LookAheadToken.StartOffset;
                    System.Boolean errorReported = false;
                    while (!this.IsAtEnd) {
                        if (IsNonTerminal("Macro")) {
                            errorReported = false;
                    %>
                            { "Macro" }
                    <% 
                        }
                        else {
                            // Error recovery:  Advance to the next token since nothing was matched
                            if (!errorReported) {
                                this.ReportSyntaxError("Function declaration expected.");
                                errorReported = true;
                            }
                            this.AdvanceToNext();
                        }
                    }
                    compilationUnit.EndOffset = this.LookAheadToken.EndOffset;
                %>
            ]]>
      </Production>
    </NonTerminal>
and here is my Macro:
<NonTerminal Key="Macro">
            <Production>
                    <![CDATA[
                'DefineKeyword'
                'MacroIdentifier'
                'MacroIdentifier' | 'IntegerNumber' | ('OpenParenthesis' 'Space' 'Address' 'CloseParenthesis')
            ]]></Production>
        </NonTerminal>
I have also copied the compilationUnit's <AstNode> and <Declarations> - the help says they are optional, but apparently not for me... I haven't got the faintest idea what they do, I understand - more or less - the nonterminals only (and tokenIDs of course).

I generate the code with your grammar designer.

Below is a single line that should be matched:
#define TEN 10

The thing is, when the parser gets to the MatchMacro method (generated by the designer), there is a 'DefineKeyword' token (the #define word) ahead (I can see it in the debugger by inspecting this.LookAheadTokenText), yet the method returns false and reports an 'Unexpected Token'.

So my questions:
Can you show me how to write a parser that parses this simple line above (assume you already properly recognise tokens)?
What are AstNodes and Declarations tags for?

That will be it for a start. Thanks in advance.

Comments (6)

Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA
Hi Krzysztof,

Yes the semantic parser piece in the WinForms SyntaxEditor can be a bit convoluted and difficult to figure out. In our newer WPF and Silverlight versions, we have a completely new grammar framework for building parsers that is much more straightforward and easier to understand. But anyhow back onto your questions...

Usually the goal of a parser is to construct an AST of your document, which is a hierarchy of nodes that describe the document's structure. If you have an AstNodes section in your grammar, it will generate AST node classes for you that you can construct in your non-terminal productions. The node is meant to allow you to make a tree representation of what is being parsed.

Offhand your CompilationUnit production looks good. It's building a CompilationUnit root AST node.

For more help, look at our FunctionDeclaration sample in the Simple grammar. It's running through tokens and makes a FunctionDeclaration AST node. Then it adds it to the CompilationUnit instance and initializes properties on it based on what is found.

The end result is that you have a CompilationUnit node with FunctionDeclaration nodes in it. Then you can take that tree of data later on and use it to help show IntelliPrompt, etc.

For your specific sample, you might want something like this:
<NonTerminal Key="Macro">
    <Production><![CDATA[
        'DefineKeyword'
        'MacroIdentifier'
        <%
            var identifier = this.TokenText;
        %>
        'IntegerNumber'
        <%
            var number = this.TokenText;
            var macro = new Macro(identifier, number, this.Token.TextRange); 
        %>
    ]]></Production>
</NonTerminal>
That assumes you have a Macro AST node with a constructor that takes those arguments. In your snippet you may need to put parentheses around the last line of the production, where you have all the alternations.
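
To illustrate why grouping the alternations can matter, here is a small language-neutral sketch in Python. It is not Actipro's API or generated code. It assumes the usual EBNF convention that a sequence binds more tightly than `|`, and the helper names (`match_sequence`, `match_any`, `ungrouped`, `grouped`) are made up for the illustration.

```python
def match_sequence(tokens, pos, keys):
    """Return the position after `keys` if they match at `pos`, else None."""
    end = pos + len(keys)
    return end if tokens[pos:end] == keys else None


def match_any(tokens, pos, alternatives):
    """Try each alternative sequence in order; return the first match."""
    for keys in alternatives:
        end = match_sequence(tokens, pos, keys)
        if end is not None:
            return end
    return None


VALUE_ALTERNATIVES = [
    ["MacroIdentifier"],
    ["IntegerNumber"],
    ["OpenParenthesis", "Space", "Address", "CloseParenthesis"],
]


def ungrouped(tokens):
    """A B C | D | E read as (A B C) | D | E."""
    alternatives = [
        ["DefineKeyword", "MacroIdentifier", "MacroIdentifier"],
        ["IntegerNumber"],
        ["OpenParenthesis", "Space", "Address", "CloseParenthesis"],
    ]
    return match_any(tokens, 0, alternatives) == len(tokens)


def grouped(tokens):
    """A B (C | D | E): the prefix is shared by every alternative."""
    pos = match_sequence(tokens, 0, ["DefineKeyword", "MacroIdentifier"])
    if pos is None:
        return False
    return match_any(tokens, pos, VALUE_ALTERNATIVES) == len(tokens)


# Token keys for the line: #define TEN 10 (whitespace already removed)
line = ["DefineKeyword", "MacroIdentifier", "IntegerNumber"]
print(ungrouped(line))  # False
print(grouped(line))    # True
```

Under that assumption, the ungrouped form only accepts the define keyword when it is followed by two identifiers, while the grouped form accepts an identifier, a number, or the parenthesized address after the shared prefix.
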

The Declarations section contains code snippets that get injected into the generated semantic parser class.

Hope that helps.


Actipro Software Support

Posted 13 years ago by Krzysztof
Ok, so I need to have an Ast node that takes 2 parameters.

Then:
'MacroIdentifier'
        <%
            var identifier = this.TokenText;
        %>
the above tells the parser, that there is a token 'MacroIdentifier' that has value given by this.TokenText - correct?

And as for the number:
'IntegerNumber'
        <%
            var number = this.TokenText;
            var macro = new Macro(identifier, number, this.Token.TextRange); 
        %>
what does this do? I guess the
var number = this.TokenText;
does the same thing that
var identifier = this.TokenText;
did to 'MacroIdentifier', but the rest?
Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA
Hi Krzysztof,

When you have something in single quotes like 'MacroIdentifier' that is just trying to match a token that has the key indicated in quotes. The this.TokenText property will return the text of the token that was last matched. So the "var identifier = this.TokenText;" line is just making a variable named "identifier" with whatever was matched by the MacroIdentifier token.
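
As a rough picture of that behavior, here is a language-neutral sketch in Python. It is not the generated parser: `TinyParser`, `match`, and `token_text` are hypothetical names that only mimic "match a token by key, then read the text of the last matched token".

```python
class TinyParser:
    """Hypothetical stand-in for a generated parser; not Actipro's class."""

    def __init__(self, tokens):
        self.tokens = tokens      # list of (key, text) pairs
        self.position = 0
        self.token_text = None    # text of the most recently matched token

    def match(self, key):
        """Consume the look-ahead token if its key equals `key`."""
        if self.position < len(self.tokens) and self.tokens[self.position][0] == key:
            self.token_text = self.tokens[self.position][1]
            self.position += 1
            return True
        return False


def parse_macro(parser):
    """Match DefineKeyword MacroIdentifier IntegerNumber, capturing text."""
    if not parser.match("DefineKeyword"):
        return None
    if not parser.match("MacroIdentifier"):
        return None
    identifier = parser.token_text   # text of the MacroIdentifier token
    if not parser.match("IntegerNumber"):
        return None
    number = parser.token_text       # text of the IntegerNumber token
    return identifier, number


tokens = [("DefineKeyword", "#define"), ("MacroIdentifier", "TEN"), ("IntegerNumber", "10")]
print(parse_macro(TinyParser(tokens)))  # ('TEN', '10')
```

The captured value depends on where the read sits: reading the text right after a match yields that token's text, which is why each code block in the production follows the token it refers to.
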

Same concept for number then. Then it's storing the identifier and number in an AST node. So you'd have to make sure you had an AST node called Macro that had Identifier and Number properties, which you'd define up in the AstNodes area. To add a custom constructor like I was calling, you'd have to add an AstNodeDeclarations block with it defined in the Macro AstNode definition. The Simple grammar shows this.

Once you create the Macro AST node, you should add it to your CompilationUnit somehow, meaning you'd need to make sure your CompilationUnit had a Members property or something similar, which is an AST node collection property you can add to.
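
The shape of the resulting tree can be sketched in a language-neutral way in Python. These are not the generated AST classes: `Macro`, `CompilationUnit`, `members`, and `build_tree` are illustrative names, and storing the number as an integer is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Macro:
    """AST node for one #define line (hypothetical shape)."""
    identifier: str
    number: int


@dataclass
class CompilationUnit:
    """Root AST node; `members` plays the role of the collection property."""
    members: list = field(default_factory=list)


def build_tree(parsed_macros):
    """Turn (identifier, number_text) pairs into a CompilationUnit tree."""
    unit = CompilationUnit()
    for identifier, number_text in parsed_macros:
        # Matched token text is a string, so convert it before storing an int.
        unit.members.append(Macro(identifier, int(number_text)))
    return unit


unit = build_tree([("TEN", "10"), ("TWENTY", "20")])
print([m.identifier for m in unit.members])             # ['TEN', 'TWENTY']
print(unit.members[0].number + unit.members[1].number)  # 30
```

Code that runs after parsing can then walk `members` to find every macro without touching the token stream again.
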


Actipro Software Support

Posted 13 years ago by Krzysztof
I've done a bit of coding and... This is going to be long, so you'd better fetch a cup of coffee...

1.
Could you please tell me how to write the AstNode for my example and make the parser ignore spaces and comments?

In my grammar I have to write it like this:

'DefineKeyword' 'Whitespace' 'MacroIdentifier' 'Whitespace' 'IntegerNumber'

otherwise it reports an error ("unexpected token")



Let's say my lexer properly recognizes a DefineKeyword, MacroIdentifier, IntegerNumber, comments, and whitespace. The compilation unit looks like before. My "Macro" non-terminal is supposed to look like this:

'DefineKeyword' 'MacroIdentifier' 'IntegerNumber'

It is supposed to ignore whitespace and comments, so that the example below parses without errors.
Example:
#define identifier /*comment*/ number


2.

            <AstNode> 
                <!-- NOTE: No Name attribute on an AstNode element will target the AstNode base class for the language...
                            No attributes or child elements other than the AstNodeDeclarations element are currently recognized for the base node.
                    -->
                <AstNodeDeclarations><![CDATA[    

                    /// <summary>
                    /// Gets the image index that is applicable for displaying this node in a user interface control.
                    /// </summary>
                    /// <value>The image index that is applicable for displaying this node in a user interface control.</value>
                    public override int ImageIndex {
                        get {
                            return (int)ActiproSoftware.Products.SyntaxEditor.IconResource.Keyword;
                        }
                    }
                                    
                ]]></AstNodeDeclarations>
What does this do? What is that ImageIndex for?


3. In AST nodes you sometimes create <AstNodeDeclarations ...> with some C# code, close the tag, and immediately open another AstNodeDeclarations tag, this time with complete C# functions (e.g. in CompilationUnit). So:
- how do I know what I need to put in these AstNodeDeclarations tags?
- what CAN be put here?
- how do I know where it will be inserted?

For instance: in "AssignmentStatement" of your Simple language, the AstNode has <AstNodeDeclarations Type="Constructor"> with its constructor and then another <AstNodeDeclarations> with the DisplayText function, whereas CompilationUnit has <AstNodeDeclarations Type="Field"> with some ominous variables and then another with complete functions, and no constructor at all.
Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA
Hi Krzysztof,

1) To ignore things like whitespace and comments, you have to create a custom RecursiveDescentLexicalParser class like we did in our SimpleRecursiveDescentLexicalParser class for the Simple language. It serves as a bridge between the lexer and parser. The parser asks it for tokens and thus it can filter out whitespace and comment tokens from ever getting to the parser. Then you won't need to code for whitespace or comments in your grammar.
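
The filtering idea itself is simple, and can be sketched in a language-neutral way in Python. This is not the RecursiveDescentLexicalParser API: `significant_tokens` is a hypothetical helper, and the "Comment" token key is an assumption ("Whitespace" is the key used in your grammar line).

```python
IGNORED_KEYS = {"Whitespace", "Comment"}  # assumed token keys


def significant_tokens(lexer_tokens):
    """Yield only the tokens the parser should see."""
    for key, text in lexer_tokens:
        if key not in IGNORED_KEYS:
            yield key, text


# Tokens a lexer might produce for a line like: #define identifier /*comment*/ 10
raw = [
    ("DefineKeyword", "#define"),
    ("Whitespace", " "),
    ("MacroIdentifier", "identifier"),
    ("Whitespace", " "),
    ("Comment", "/*comment*/"),
    ("Whitespace", " "),
    ("IntegerNumber", "10"),
]

filtered = list(significant_tokens(raw))
print([key for key, _ in filtered])
# ['DefineKeyword', 'MacroIdentifier', 'IntegerNumber']
```

Because the parser only ever pulls tokens through the filter, the grammar can be written as 'DefineKeyword' 'MacroIdentifier' 'IntegerNumber' with no whitespace or comment tokens in it.
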

Your Macro AST node should be something like this:
<AstNode Name="Macro" Description="A macro.">
    <AstNodeProperty PropertyType="Simple" Name="Identifier" Type="System.String" Description="The identifier." />
    <AstNodeProperty PropertyType="Simple" Name="Number" Type="System.Int32" Description="The number." />
    <AstNodeDeclarations Type="Constructor"><![CDATA[    

        public Macro(string identifier, int number, TextRange textRange) : this(textRange) {
            this.Identifier = identifier;
            this.Number = number;
        }
                        
    ]]></AstNodeDeclarations>
</AstNode>
The only change you'd need then from the code I previously pasted would be to convert the number variable to be an int so it can be passed to this constructor.

2) Image index is optional. It just provides a handy way to point to an image within an ImageList in scenarios where the AST node will be shown in a TreeView or other UI. You can ignore it though.

3) You can read the "SyntaxEditor Parser Generator Grammar XML Definition Specification / AstNodeDeclarations Tag" documentation topic for more information on what AstNodeDeclarations does and what Type attribute values are possible. Type basically just determines where in the generated code the declarations snippet will appear.

All it does is inject custom code into the AstNode class that is code generated. So it's an easy way to inject additional code above and beyond what is normally code generated, like extra constructors, etc. You don't have to do any declarations if you don't want to.

Hope that helps!


Actipro Software Support

Posted 13 years ago by Krzysztof
Thanks, that sheds some light on it. I hope to finally work it out.