Trying to create a parser for VBScript, issues with root production

SyntaxEditor for WPF Forum

Posted 10 years ago by Jack Jackomel
Version: 13.2.0591

I'm trying to build an LL(*) grammar to parse VBScript, starting from the VBScript language definition included with the editor, and I'm working through the documentation and examples to figure things out.

I'm having a really hard time, though. I believe I've written enough of the grammar to get some results, but my biggest issue is what to do with the root production.

I noticed that the SimpleLanguage parser example only ever declares functions and nothing else, so the root production

this.Root.Production = functionDeclaration.OnError(AdvanceToDefaultState).ZeroOrMore();

fails on any piece of code that contains anything other than functions.

I've been trying unsuccessfully to figure out exactly what to do to parse anything other than a single kind of item. I tried using | operators to specify multiple non-terminals in the hope of parsing them as well, but the tree never gets very far.

What should I do in this case to parse something like the following VBScript and generate an AST for it?

Private Const PAR_PH = " pH"

Public Parametri
Public verbose

Sub ShowMessage(msgText)
    MsgBox msgText
    Message = msgText
End Sub

Function Calcola()
    Dim dblValue, strValue
    
    'Other code here
    
    Calcola = True

End Function

Is there an example grammar that could help me better understand what to do?
I would have expected to be able to look at a VB.NET grammar definition, but I can't find anything of the sort.

This is the grammar I have so far. As it stands, it fails on the very first line. I adapted most of it from the SimpleLanguage example and even added some new terminals to the language definition.

functionDeclaration.CanMatchCallback = CanAlwaysMatch;
functionAccessExpression.CanMatchCallback = CanMatchFunctionAccessExpression;
accessModifierOpt.CanMatchCallback = CanAlwaysMatch;

this.Root.Production = functionDeclaration.OnError(AdvanceToDefaultState).ZeroOrMore();

nl.Production = @lineTerminator + nl.Optional();

constantDeclaration.Production = @accessModifier + @keyword["Const"] + @identifier + @operator["="] + expression + nl;

subDecl.Production = @keyword["Sub"] + @identifier + @openParenthesis +
                     functionParameterList.Optional() + @closeParenthesis + block +
                     @endSub + nl;

accessModifierOpt.Production = @accessModifier.Optional();

functionDeclaration.Production = accessModifier.Optional() + @keyword["Function"] + @identifier + @openParenthesis +
                                 functionParameterList.Optional() + @closeParenthesis + block + @endFunction + nl;

functionParameterList.Production = @identifier + (@punctuation + @identifier).ZeroOrMore();

statement.Production = block | emptyStatement | @operator;

block.Production = @nl + (statement.ZeroOrMore()) + @nl;

emptyStatement.Production = @documentEnd.ToTerm().ToProduction();

primaryExpression.Production = numberExpression
                               | functionAccessExpression
                               | simpleName
                               | parenthesizedExpression;

functionAccessExpression.Production = @identifier + @openParenthesis + functionArgumentList.Optional() +
                                      @closeParenthesis;

numberExpression.Production = @integerNumber.ToTerm().ToProduction();
simpleName.Production = @identifier.ToTerm().ToProduction();
parenthesizedExpression.Production = @openParenthesis + expression + @closeParenthesis;
functionArgumentList.Production = expression + (@punctuation["."] + expression).ZeroOrMore();
multiplicativeExpression.Production = primaryExpression +
                                      ((@multiplication | @division) + multiplicativeExpression)
                                          .Optional();

additiveExpression.Production = multiplicativeExpression +
    ((@addition | @subtraction) + additiveExpression).Optional();

equalityExpression.Production = additiveExpression +
    ((@operator["="] | @inequality) + equalityExpression).Optional();

expression.Production = equalityExpression.ToTerm().ToProduction();

assignmentStatement.Production = @keyword["Set"] + @identifier + @operator["="] + expression + nl;

variableDeclarationStatement.Production = @dim + @identifier + (@punctuation[","] + @identifier).ZeroOrMore() + nl;

statement.Production = block
                       | variableDeclarationStatement
                       | emptyStatement
                       | assignmentStatement
                       | constantDeclaration;

Comments (4)

Posted 10 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Jack,

We do have a robust, full grammar for VB.NET, but it's part of the .NET Languages Add-on, and you'd need to purchase its Blueprint source code to see it. Let me give you what we have at its root, though, and hopefully that will help:

this.Root = compilationUnit;

compilationUnit.Production = compilationUnitContent.OnError(ErrorCompilationUnit).ZeroOrMore().SetLabel("c").OnComplete(ValidateCompilationUnitContentOrder) + 
	@documentEnd.OnErrorContinue() 
	> Ast<CompilationUnit>().AddToCollectionProperty(cu => cu.Members, AstChildrenFrom("c", 1));

// Not in spec, but added to support more robust error handling
// Nests each actual content item since AttributeStatement forces nesting
compilationUnitContent.Production = optionStatement.OnErrorContinue().OneOrMore().SetLabel("stmts") > AstFrom("stmts")
	| importsStatements.OnErrorContinue().OneOrMore().SetLabel("stmts") > AstFrom("stmts")
	| attributeStatement["attr"] > AstFrom("attr")
	| namespaceMemberDeclaration.OnErrorContinue().OneOrMore().SetLabel("decls") > AstFrom("decls");

Our ErrorCompilationUnit method looks like:

private IParserErrorResult ErrorCompilationUnit(IParserState state) {
	AdvanceToNonTerminals(state, null, null, false, "OptionStatement", "ImportsStatements", "AttributeStatement", "NamespaceDeclaration", "TypeDeclaration");
	return ParserErrorResults.Continue;
}

The AdvanceToNonTerminals method effectively skips ahead, iterating through tokens until it finds one that can start one of the specified non-terminals.
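
Applied to the VBScript sample from the question, the same root pattern might look roughly like the fragment below. It is only a sketch: it reuses operators and methods that already appear in this thread (|, +, OnError, ZeroOrMore, OnErrorContinue), while moduleMember, moduleVariableDeclaration and ErrorModuleMember are hypothetical names that are not part of the posted grammar, and the fragment has not been checked against the product.

```csharp
// Sketch only, assuming the non-terminals from the question are already declared.
// moduleMember, moduleVariableDeclaration and ErrorModuleMember are hypothetical names.
this.Root.Production = moduleMember.OnError(ErrorModuleMember).ZeroOrMore() +
    @documentEnd.OnErrorContinue();

// One alternative per kind of top-level item in the sample script.
moduleMember.Production = constantDeclaration   // Private Const PAR_PH = " pH"
    | moduleVariableDeclaration                 // Public Parametri
    | subDecl                                   // Sub ... End Sub
    | functionDeclaration;                      // Function ... End Function
```

Several of these alternatives can begin with the same access modifier, so each one probably needs a can-match callback that looks past the modifier. A callback such as CanAlwaysMatch on functionDeclaration may cause that alternative to claim input it cannot actually parse.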


Actipro Software Support

Posted 10 years ago by Jack Jackomel

Thanks for the example. I think I've almost got it, but now I'm missing this AdvanceToNonTerminals function, which seems to be all I need to finally get started. Could you please provide an example of that too?

Posted 10 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Jack,

The VB.NET version of that method is rather long and complex.  But there is an example of Advance methods in the "Getting Started 4d" QuickStart. 

The gist is that you want to do something like that sample's AdvanceToStatementOrBlockEnd method. Where its 'if' statement is, you could do something similar to how we do the CanMatch call. As long as you retain your root non-terminals in fields like we do there, you can call CanMatch on each. That's the easiest way to do things.
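
The skip-ahead idea described here can be sketched in plain C# without any editor types. Everything in this example is an assumption made for illustration: RootNonTerminal, RecoverySketch and AdvanceToRootNonTerminal are hypothetical names, tokens are modeled as plain strings, and CanMatch is a simple predicate rather than the real method on the grammar's non-terminals.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a grammar non-terminal (not an Actipro type):
// a name plus a predicate answering "can this non-terminal start at tokens[index]?".
public sealed class RootNonTerminal
{
    private readonly Func<IReadOnlyList<string>, int, bool> canMatch;

    public RootNonTerminal(string name, Func<IReadOnlyList<string>, int, bool> canMatch)
    {
        Name = name;
        this.canMatch = canMatch;
    }

    public string Name { get; }

    public bool CanMatch(IReadOnlyList<string> tokens, int index) => canMatch(tokens, index);
}

public static class RecoverySketch
{
    // Root non-terminals are kept in fields so the recovery code can query each one.
    private static readonly RootNonTerminal ConstantDeclaration =
        new RootNonTerminal("ConstantDeclaration", (t, i) => StartsWith(t, i, "Const"));
    private static readonly RootNonTerminal SubDeclaration =
        new RootNonTerminal("SubDeclaration", (t, i) => StartsWith(t, i, "Sub"));
    private static readonly RootNonTerminal FunctionDeclaration =
        new RootNonTerminal("FunctionDeclaration", (t, i) => StartsWith(t, i, "Function"));

    private static readonly RootNonTerminal[] Roots =
        { ConstantDeclaration, SubDeclaration, FunctionDeclaration };

    // Case-insensitive keyword test that is safe for out-of-range indexes.
    private static bool Is(IReadOnlyList<string> tokens, int index, string keyword) =>
        index >= 0 && index < tokens.Count &&
        string.Equals(tokens[index], keyword, StringComparison.OrdinalIgnoreCase);

    // True when a declaration introduced by 'keyword' starts at 'index'. An optional
    // Public/Private modifier is allowed, and a keyword right after "End" is rejected
    // so that the closing "End Sub" / "End Function" is not mistaken for a new start.
    private static bool StartsWith(IReadOnlyList<string> tokens, int index, string keyword)
    {
        if (Is(tokens, index - 1, "End")) return false;
        bool hasModifier = Is(tokens, index, "Public") || Is(tokens, index, "Private");
        return Is(tokens, hasModifier ? index + 1 : index, keyword);
    }

    // Returns the index of the first token at or after 'start' where some root
    // non-terminal can match, or tokens.Count when no such token exists.
    public static int AdvanceToRootNonTerminal(IReadOnlyList<string> tokens, int start)
    {
        for (int index = Math.Max(start, 0); index < tokens.Count; index++)
        {
            foreach (RootNonTerminal root in Roots)
            {
                if (root.CanMatch(tokens, index)) return index;
            }
        }
        return tokens.Count;
    }
}
```

For the token sequence MsgBox, x, Private, Const, PAR_PH starting at index 0, AdvanceToRootNonTerminal returns 2, the position of Private. In a real error callback the same loop would presumably advance the parser's token reader and call CanMatch on the retained non-terminals with the parser state instead of a token list.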


Actipro Software Support

Posted 10 years ago by Jack Jackomel
Avatar

Thanks for the tip, sadly implementing the parser was taking too long so we're skipping text parsing for now.

I hope I can get the chance in the future to figure how to properly do it in the future

