Handling a Non-Terminal Matched Too Early

SyntaxEditor for WPF Forum

Posted 8 months ago by Will Gauthier
Version: 21.1.3

I have an identifierAndArguments non-terminal that can appear zero or more times at the end of a prefixExpression non-terminal. However, it's being matched too early as an optional part of a different production, which then fails to complete, so parsing halts. How can I add handling so that the parser recovers from the incorrect match, moves up a level, and reuses the identifierAndArguments in the correct way?

My Lua grammar includes these five productions (reproduced here in simplified form):

prefixExpression.Production = variableOrExpression + identifierAndArguments.ZeroOrMore();

variableOrExpression.Production = @identifier + variableSuffix.ZeroOrMore();

variableSuffix.Production = identifierAndArguments.ZeroOrMore() + (@openBracket + expression + @closeBracket | @dotOperator + @identifier);

identifierAndArguments.Production = (@colon + @identifier).Optional() + arguments;

arguments.Production = literalString;

The code snippet foo"hello" causes the Language Designer's Parser Debugger to fail silently when it encounters it. The snippet should come back as a prefixExpression.

First, foo matches as an @identifier token in the variableOrExpression production. The parser then tries to match zero or more variableSuffix occurrences. "hello" is a literalString, which satisfies the zero-or-more identifierAndArguments at the start of a variableSuffix. However, no bracketed expression or dot-identifier follows, so everything stops.

What should be happening is that "hello" is an identifierAndArguments that goes at the end of the prefixExpression, not as a variableSuffix in the variableOrExpression match.

I suspect I need some sort of callback to help the parser discard the failed variableSuffix match, terminate the variableOrExpression match, and try again at the prefixExpression level. How do I do that?

Comments (4)

Answer - Posted 8 months ago by Actipro Software Support - Cleveland, OH, USA

Hi Will,

I suspect that you would need to add a CanMatchCallback to your variableSuffix non-terminal. Something like the following would be needed for your grammar above. It's complicated because you need to verify that one of two tokens is present, potentially far into the production.

variableSuffix.CanMatchCallback = CanMatchVariableSuffix;

// ...

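private bool CanMatchVariableSuffix(IParserState state) {
	var tokenReader = state.TokenReader;
	// Save the reader position so the look-ahead below consumes nothing
	tokenReader.Push();
	try {
		// Skip over zero or more identifierAndArguments: [':' identifier] literalString
		while (
			(tokenReader.LookAheadToken.Id == LuaTokenId.Colon) ||
			(tokenReader.LookAheadToken.Id == LuaTokenId.LiteralString)
			) {

			if (tokenReader.LookAheadToken.Id == LuaTokenId.Colon) {
				tokenReader.Advance();

				// A colon must be followed by an identifier
				if (tokenReader.LookAheadToken.Id != LuaTokenId.Identifier)
					return false;
				tokenReader.Advance();
			}

			// The arguments (a literalString in the simplified grammar) must come next
			if (tokenReader.LookAheadToken.Id != LuaTokenId.LiteralString)
				return false;
			tokenReader.Advance();
		}

		// A variableSuffix only matches if '[' or '.' follows
		switch (tokenReader.LookAheadToken.Id) {
			case LuaTokenId.OpenBracket:
			case LuaTokenId.DotOperator:
				return true;
		}
	}
	finally {
		// Always restore the saved reader position
		tokenReader.Pop();
	}

	return false;
}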
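
To see how the callback above decides on the two interesting inputs, the same logic can be traced outside the parser. The following is only a sketch: MiniTokenReader, Tok, and SuffixLookAhead are hypothetical stand-ins written for illustration and are not part of the SyntaxEditor API. They imitate only the Push, Pop, Advance, and look-ahead calls used in the callback.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token kinds mirroring the LuaTokenId values used in the callback.
public enum Tok { Identifier, Colon, LiteralString, OpenBracket, DotOperator, End }

// Minimal stand-in for a token reader with Push/Pop backtracking.
// This is NOT the SyntaxEditor token reader; it only imitates the calls used above.
public sealed class MiniTokenReader {
    private readonly IReadOnlyList<Tok> tokens;
    private readonly Stack<int> saved = new Stack<int>();
    private int position;

    public MiniTokenReader(IReadOnlyList<Tok> tokens) { this.tokens = tokens; }

    // The next unconsumed token, or End when the input is exhausted.
    public Tok LookAhead => position < tokens.Count ? tokens[position] : Tok.End;
    public int Position => position;

    public void Advance() { if (position < tokens.Count) position++; }
    public void Push() { saved.Push(position); }    // remember the current position
    public void Pop() { position = saved.Pop(); }   // rewind to the remembered position
}

public static class SuffixLookAhead {
    // Same decision logic as CanMatchVariableSuffix, run against the stand-in reader.
    public static bool CanMatchVariableSuffix(MiniTokenReader reader) {
        reader.Push();
        try {
            // Skip zero or more identifierAndArguments: [':' identifier] literalString
            while (reader.LookAhead == Tok.Colon || reader.LookAhead == Tok.LiteralString) {
                if (reader.LookAhead == Tok.Colon) {
                    reader.Advance();
                    if (reader.LookAhead != Tok.Identifier) return false;
                    reader.Advance();
                }
                if (reader.LookAhead != Tok.LiteralString) return false;
                reader.Advance();
            }
            // Only a following '[' or '.' makes this a variableSuffix.
            return reader.LookAhead == Tok.OpenBracket || reader.LookAhead == Tok.DotOperator;
        }
        finally {
            reader.Pop(); // the look-ahead must not consume any tokens
        }
    }
}
```

With the tokens that remain after foo has been consumed, foo"hello" leaves only a LiteralString, so the check returns false and the string is left for prefixExpression to claim. For foo"hello".bar the remaining tokens are LiteralString, DotOperator, Identifier, so the check returns true. In both cases the reader position is unchanged afterwards.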

Hope that helps!


Actipro Software Support

Posted 8 months ago by Will Gauthier

Thank you for the explanation and the sample code. A CanMatchCallback for variableSuffix looks like exactly what I need. However, I will need to extend the code you provided because my arguments non-terminal's production is more complicated than the simplified form I initially posted. It has three alternations:

arguments.Production = @openParenthesis + expressionList.Optional() + @closeParenthesis
                | tableConstructor
                | literalString;

Instead of comparing tokenReader.LookAheadToken.Id with LuaTokenId.LiteralString, can I call arguments.CanMatch(state) with the same effect? If so, do I need to additionally implement a CanMatchCallback for the arguments non-terminal, or does CanMatch have a default implementation that will do the job?

Posted 8 months ago by Actipro Software Support - Cleveland, OH, USA

Hi Will,

Regarding arguments, you should be able to call CanMatch(state) on any non-terminal. That method will do a token "first set" check to make sure the look-ahead token can start the non-terminal and, if a CanMatchCallback is specified, will also check that the callback passes. Note that calls to the CanMatch method won't advance the token reader at all, so you really should only call it at the end of your own CanMatchCallback, if at all.
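
Because CanMatch does not advance the reader, a callback that has to look past a parenthesized argument list or a table constructor still needs its own way of stepping over it. One possible approach is to skip to the matching closing delimiter. The sketch below shows that idea on a plain token list. ArgTok and ArgumentSkipper are hypothetical names, the token kinds are assumptions (including that a table constructor is delimited by curly braces), and the contents between the delimiters are not validated. In a real callback the same walk would be done with the token reader's Advance calls between Push and Pop.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token kinds; the real grammar's token IDs may be named differently.
public enum ArgTok {
    Identifier, LiteralString, Comma,
    OpenParenthesis, CloseParenthesis,
    OpenCurlyBrace, CloseCurlyBrace
}

public static class ArgumentSkipper {
    // Returns the index just past one 'arguments' occurrence that starts at 'start',
    // or -1 if the token there cannot begin one or its delimiters never balance.
    public static int SkipArguments(IReadOnlyList<ArgTok> tokens, int start) {
        if (start < 0 || start >= tokens.Count) return -1;
        switch (tokens[start]) {
            case ArgTok.LiteralString:
                return start + 1;                  // a single string literal
            case ArgTok.OpenParenthesis:           // '(' expressionList? ')'
            case ArgTok.OpenCurlyBrace:            // table constructor (assumed '{' ... '}')
                return SkipBalanced(tokens, start);
            default:
                return -1;
        }
    }

    // Walks forward from an opening delimiter to its matching close,
    // tracking nested parentheses and braces on a stack.
    private static int SkipBalanced(IReadOnlyList<ArgTok> tokens, int start) {
        var expectedClosers = new Stack<ArgTok>();
        for (int i = start; i < tokens.Count; i++) {
            switch (tokens[i]) {
                case ArgTok.OpenParenthesis:
                    expectedClosers.Push(ArgTok.CloseParenthesis);
                    break;
                case ArgTok.OpenCurlyBrace:
                    expectedClosers.Push(ArgTok.CloseCurlyBrace);
                    break;
                case ArgTok.CloseParenthesis:
                case ArgTok.CloseCurlyBrace:
                    // A closer must match the most recent unclosed opener.
                    if (expectedClosers.Count == 0 || expectedClosers.Pop() != tokens[i]) return -1;
                    if (expectedClosers.Count == 0) return i + 1;   // outermost delimiter closed
                    break;
            }
        }
        return -1; // ran out of tokens before the delimiters balanced
    }
}
```

This trades precision for simplicity: it only confirms that the delimiters balance, which is enough to find the token that follows the arguments but does not confirm that what sits between them is a valid expressionList or table constructor.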


Actipro Software Support

Posted 8 months ago by Will Gauthier

Thanks - knowing that CanMatch checks the first set tokens and any specified CanMatchCallback is what I needed. I also appreciate the tip about advancing the token reader.

I tried it out, and everything seems to work. Thanks again for all the help.
