Help eliminating left recursion

SyntaxEditor for WPF Forum

Posted 9 years ago by Phil Devaney - Senior Software Engineer, Serck Controls Ltd
Version: 10.2.0531
Avatar
Hi,
I am trying to convert an MGrammar grammar to use your LLParser, which has been mostly straightforward, but I am having trouble eliminating left recursion from a couple of productions. My grammar is for an expression evaluator with syntax similar to C#, with the usual literals, variables, binary & unary operators and function calls. It also has the member access operator (.) and subscript/indexer operator ([]), and it is these last two I am having problems with.

My MGrammar definition looks like this (tokens begin with T and should be obvious from their names):

syntax PrimaryExpr =
    LiteralExpr
|   ParenthesizedExpr
|   FunctionCallExpr
|   MemberAccessExpr
|   ElementAccessExpr;

...

syntax MemberAccessExpr =
    TIdentifier
|   PrimaryExpr TPeriod TIdentifier;

syntax ElementAccessExpr =
    PrimaryExpr TOpenBracket Expr TCloseBracket;
I have tried various ways of defining this, but can't find a way that isn't left recursive while still allowing for arbitrary chaining (e.g a[0].b[1].c.d). I don't think a CanMatch callback can help as there can be an arbitrarily complex expression on the left (e.g. ( x + y * z ).b). I guess you might have encountered a similar problem if you are working on a C# grammar the .NET Languages addon. If so, how did you solve it?

Another issue I encounted (and solved) was with strings. Is it expected to have to define string literals as NonTerminals like this:

stringLiteral.Production =
  @stringStart + charList["chars"] + @stringEnd
    > Ast( "StringLiteral", AstChildrenFrom( "chars" ) );

charList.Production =
  ( @stringText["c"] > AstFrom( "c" ) | @stringEscape["c"] > AstFrom( "c" ) ).ZeroOrMore().SetLabel( "chars" )
    > Ast( "CharList", AstChildrenFrom( "chars" ) );
or is there a way to get the lexer to emit the full string literal as a Terminal?

Finally, because my parser DLL references the .NET 4 version of WPF (for SolidColorBrush etc), it causes the parser debugger to crash when trying to load it as the Language Designer tool uses the v2 runtime. This is easily solved by making it run under the v4 runtime by creating a .exe.config and adding a <supportedRuntime> element (see http://msdn.microsoft.com/en-us/library/w4atty68.aspx) but I thought it might be a good idea to add a note to the documentation about this.

Thanks

Phil

Comments (6)

Posted 9 years ago by Actipro Software Support - Cleveland, OH, USA
Avatar
Hi Phil,

We've very close to starting work on the .NET Languages Add-on but haven't just yet. Still we ran into similar things when writing the WinForms version's grammar.

There are two ways to go. For simpler cases, you can assign a can-match callback to one or both of the ambiguous non-terminals. Then in your can-match callback you peek ahead at upcoming tokens to determine if the non-terminal should match or not. We talk about this in our documentation and the Simple language shows a sample too.

However in more complex scenarios, you may need to take an alternative approach. In your case both productions start with PrimaryExpr. PrimaryExpr could be many tokens long since it is a recursive non-terminal, so while implementing a can-match callback to handle all its cases is possible, it would be tricky.

The alternative idea which we did in WinForms a lot was to create a new non-terminal whose production started with PrimaryExpr. Then add an alternation to MemberAccessExpr and ElementAccessExpr and in each of those non-terminals, remove the PrimaryExpr from their productions since you already consumed them with this new non-terminal. Then finally back in PrimaryExpr, you'd remove the refs to MemberAccessExpr and ElementAccessExpr and would ref this new one instead.

We'll certainly run into this too when we start work on the C#/VB grammars soon and perhaps we'll have some more tips for you then based on what we encounter.


Actipro Software Support

Posted 9 years ago by Phil Devaney - Senior Software Engineer, Serck Controls Ltd
Avatar
Thanks, I'm not quite sure if it's what you intended but I came up with this:

primaryExpr.Production =
    primaryNoAccessExpr + ( elementAccessExpr | memberAccessExpr ).ZeroOrMore();

primaryNoAccessExpr.Production =
    literalExpr | parenthesizedExpr | functionCallExpr | simpleName;

elementAccessExpr.Production =
    @openBracket + expr + @closeBracket;

memberAccessExpr.Production =
    @period + @identifier;
It seems to work OK after a limited amount of testing. Now I just need to work out the right tree construction!

Phil
Posted 9 years ago by Actipro Software Support - Cleveland, OH, USA
Avatar
Yes, and actually now that I look deeper at our WinForms code we do that sort of thing as well for primary expression. We use the technique described in the previous post in some other spots.


Actipro Software Support

Posted 9 years ago by Phil Devaney - Senior Software Engineer, Serck Controls Ltd
Avatar
BTW, are you aware Microsoft's "Oslo" project has been cancelled? See http://blogs.msdn.com/b/modelcitizen/archive/2010/09/22/update-on-sql-server-modeling-ctp-repository-modeling-services-quot-quadrant-quot-and-quot-m-quot.aspx. They say the "M" stuff will resurface but I'm not holding my breath.

I'd already decided to ditch MGrammar before this was announced, due to the lack of updates for the past year and a number of outstanding issues I had, as well as it not being compatible with the .NET 4 Client Profile.
Posted 9 years ago by Actipro Software Support - Cleveland, OH, USA
Avatar
Thanks for the update, hopefully they'll still do something with it in the future. But at least we have our own grammar shipping now that you can use.


Actipro Software Support

Posted 9 years ago by Boyd - Automated Software Testing Consultant / Developer, Patterson Consulting, LLC
Avatar
To add to Phil's suggestion of using a config file to tell the LLParser Debugger to run with .NET 4, I thought I'd past the actual text you need if you want to do this. Just create a file called 'LanguageDesigner.exe.config' that contains the following text:

<configuration>
   <startup>
      <supportedRuntime version="v4.0.30319"/>
   </startup>
</configuration>
Place the file next to the 'LanguageDesigner.exe' file. That will tell the debugger to run with the .NET 4 runtime and allow you to debug your languages that target .NET 4.
The latest build of this product (v2019.1 build 0683) was released 2 months ago, which was after the last post in this thread.

Add Comment

Please log in to a validated account to post comments.