
I'm writing a Lua grammar, but what I have so far suffers from multiple ambiguities. I'm trying to resolve them using can-match callbacks, but I have questions about how they work/should be implemented even after reading the documentation.
The first ambiguity I'm addressing is that both alternatives in the variableOrExpression non-terminal's production can start with an openParenthesis:
variableOrExpression.Production = variable
| @openParenthesis + expression + @closeParenthesis;
variable.Production = (@identifier | @openParenthesis + expression + @closeParenthesis + variableSuffix) + variableSuffix.ZeroOrMore();
This is my can-match callback attempt:
bool CanMatchVariable(IParserState state)
{
    if (state.TokenReader.LookAheadToken.Id == LuaTokenId.Identifier)
    {
        return true;
    }
    else if (state.TokenReader.LookAheadToken.Id == LuaTokenId.OpenParenthesis)
    {
        state.TokenReader.Push();
        try
        {
            state.TokenReader.Advance();
            if (expression.CanMatch(state) && state.TokenReader.LookAheadToken.Id == LuaTokenId.CloseParenthesis)
            {
                state.TokenReader.Advance();
                if (variableSuffix.CanMatch(state))
                {
                    return true;
                }
            }
        }
        finally
        {
            state.TokenReader.Pop();
        }
    }
    return false;
}
My questions are:
- Do I have to worry about the variableSuffix.ZeroOrMore(), or is it sufficient to match only as few terminals and non-terminals as necessary for uniqueness?
- Am I correct in assuming that calling CanMatch on a non-terminal like expression will consume tokens, so that I need to wrap the TokenReader calls in a Push/Pop block and LookAheadToken should then be the CloseParenthesis?
I'm also working on a second ambiguity, where the statement non-terminal includes two alternatives that start with the forKeyword and two that start with the localKeyword:
statement.Production = @semicolon
| variableList + @assignment + expressionList
| functionCall
| label
| @breakKeyword
| @gotoKeyword + @identifier
| @doKeyword + block + @endKeyword
| @whileKeyword + expression + @doKeyword + block + @endKeyword
| @repeatKeyword + block + @untilKeyword + expression
| @ifKeyword + expression + @thenKeyword + block + (@elseifKeyword + expression + @thenKeyword + block).ZeroOrMore() + (@elseKeyword + block).Optional() + @endKeyword
| @forKeyword + @identifier + @assignment + expression + @comma + expression + (@comma + expression).Optional() + @doKeyword + block + @endKeyword
| @forKeyword + identifierList + @inKeyword + expressionList + @doKeyword + block + @endKeyword
| @functionKeyword + functionName + functionBody
| @localKeyword + @functionKeyword + @identifier + functionBody
| @localKeyword + attributeIdentifierList + (@assignment + expressionList).Optional();
Since there are so many alternatives, will the can-match callback have to handle all of them? In other words, does it wholly replace the default handling of unambiguous alternatives like label, or do I only need to make it handle the specific alternatives whose starting terminals conflict?
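For reference, the narrow kind of lookahead this question has in mind can be sketched as follows. Going by the productions above, the numeric for is the only alternative in which forKeyword and identifier are followed by assignment, and the local function form is the only one in which localKeyword is followed by functionKeyword, so a short peek is enough to tell each pair apart. This is only a sketch under assumptions: Tok, PeekReader and StatementLookahead are hypothetical stand-ins written for this example, not the parser framework's own types, and the sketch says nothing about how the real TokenReader or CanMatch behave.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token ids for this sketch only (not the real LuaTokenId).
public enum Tok { For, Local, Function, Identifier, Assignment, Comma, In, Other, End }

// Hypothetical stand-in reader with Push/Pop position saving.
public sealed class PeekReader
{
    private readonly IReadOnlyList<Tok> tokens;
    private readonly Stack<int> saved = new Stack<int>();
    private int position;

    public PeekReader(IReadOnlyList<Tok> tokens) { this.tokens = tokens; }

    // The next unconsumed token, or Tok.End past the end of input.
    public Tok LookAhead => position < tokens.Count ? tokens[position] : Tok.End;
    public int Position => position;

    public void Advance() { if (position < tokens.Count) position++; }
    public void Push() { saved.Push(position); }   // remember the current position
    public void Pop() { position = saved.Pop(); }  // restore the remembered position
}

public static class StatementLookahead
{
    // True when the upcoming tokens are: for, identifier, assignment.
    public static bool CanMatchNumericFor(PeekReader reader)
    {
        if (reader.LookAhead != Tok.For) return false;
        reader.Push();
        try
        {
            reader.Advance();
            if (reader.LookAhead != Tok.Identifier) return false;
            reader.Advance();
            return reader.LookAhead == Tok.Assignment;
        }
        finally
        {
            reader.Pop(); // always rewind, whatever the outcome
        }
    }

    // True when the upcoming tokens are: local, function.
    public static bool CanMatchLocalFunction(PeekReader reader)
    {
        if (reader.LookAhead != Tok.Local) return false;
        reader.Push();
        try
        {
            reader.Advance();
            return reader.LookAhead == Tok.Function;
        }
        finally
        {
            reader.Pop();
        }
    }
}
```

In this sketch each check looks only as far as the first token that separates the two conflicting alternatives and leaves the reader where it started. Whether callbacks of this shape can be attached to just the ambiguous alternatives, rather than to statement as a whole, is exactly what the question above is asking.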