LLParser and Grammar help needed: Token at DocumentEnd

Posted 13 years ago by Christel

Version: 12.1.0561

I didn't dare to ask about such a special pattern, but I have been trying for days and couldn't find a solution. Since you always help so quickly and no question seems too stupid for you, I'll try with the following problem:

Token No. 1 has to be recognized: 12345*(DocumentEnd)

The same text can also be part of a longer input, not at the document end, e.g. 12345*,45678,55678
=> Token No. 2, same as Token No. 1, but not at the DocumentEnd.

For these tokens there are the following definitions in the grammar class (code snippets only):

// Terminals
var @documentEnd = new Terminal(MengeTokenId.DocumentEnd, "Definitionsende");
var @whitespace = new Terminal(MengeTokenId.Whitespace, "Whitespace");
var @newLine = new Terminal(MengeTokenId.NewLine, "NewLine");
var @komma = new Terminal(MengeTokenId.Komma, "Komma");
var @wildcard = new Terminal(MengeTokenId.Wildcard, "Wildcard") {ErrorAlias = "'*'"};
var @wildcardAtEnd = new Terminal(MengeTokenId.WildcardAmEnde, "Wildcard") { ErrorAlias = "'*'" };
var @gop = new Terminal(MengeTokenId.GopVorhanden, "Gop");
var @gopAtEnd = new Terminal(MengeTokenId.GopVorhandenAmEnde, "GopAmEnde");

//NonTerminals
this.Root = new NonTerminal("Mengendefinition");
var definitionsKlasse = new NonTerminal("DefinitionsKlasse") {ErrorAlias = "Mengendefinition"};
var gopItem = new NonTerminal("GopDefinitonsElement") { ErrorAlias = "GOP-Definition" };	
var gopAmEnde = new NonTerminal("GopAmEnde");
var endItem = new NonTerminal("ItemEnde");
var gopMitWildcardAtEnd = new NonTerminal("GopWithWildcardAtEnd");

//Callbacks
@gop.CanMatchCallback = CanMatchGop;

//Non-terminal productions
this.Root.Production = gopItem.OnError(GopItemError).ZeroOrMore().SetLabel("Definition")
				.OnInitialize(DefinitionInitialize).OnSuccess(GopItemSuccess).OnComplete(DefinitionComplete)
				> Ast("Mengendefinition", AstChildrenFrom("Definition"));


endItem.Production = @documentEnd | @komma | @newLine | @regexpOr["endItem"] > Ast("Oder", AstFrom("endItem"));

				
gopItem.Production =  
  gopAmEnde["GopItem"] > AstFrom("GopItem")						
| @gop["GopItem"] > Ast("GopDefinitionsElement", AstFrom("GopItem"))
| endItem
| gopMitWildcardAtEnd["GopItem"] > AstFrom("GopItem")
;

gopMitWildcardAtEnd.Production=
	@gop["gopItem"]
	+ @wildcard
	+ endItem.OnErrorContinue() 
	> AstFrom("gopItem");

gopAmEnde.Production =
	@gopAtEnd["gopItem"]
	+ endItem.OnErrorContinue() 
	> AstFrom("gopItem");



private bool CanMatchGop(IParserState state)
{
	state.TokenReader.Push();
	try
	{
		// The next token must be a Gop...
		if (state.TokenReader.GetLookAheadToken(1).Id != MengeTokenId.GopVorhanden)
			return false;

		// ...and the token after it must be one that may follow a Gop.
		int following = state.TokenReader.GetLookAheadToken(2).Id;
		return following == MengeTokenId.CloseKlammer
			|| following == MengeTokenId.Bis
			|| following == MengeTokenId.Plus
			|| following == MengeTokenId.RegexpOder
			|| following == MengeTokenId.DocumentEnd
			|| following == MengeTokenId.Komma
			|| following == MengeTokenId.Wildcard
			|| following == MengeTokenId.NewLine;
	}
	finally
	{
		// Put the old token reader state back into place.
		state.TokenReader.Pop();
	}
}

And the patterns in the lexer class (code snippets only):

// Create lexical macros
this.LexicalMacros.Add(new DynamicLexicalMacro("Gnr", "[\\d]{5}[A-Z*]?"));

lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Regex, "GopVorhanden", classificationTypeProvider.GOP);
lexicalPatternGroup.TokenId = MengeTokenId.GopVorhanden;
lexicalPatternGroup.LookAheadPattern = "[,]|[\\*]|[-]|[+]|[)]|[|]|[[]|{LineTerminator}";
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("{Gnr}"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Regex, "GopVorhandenAmEnde", classificationTypeProvider.GOP);
lexicalPatternGroup.TokenId = MengeTokenId.GopVorhandenAmEnde;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("{Gnr}"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Regex, "Whitespace", null);
lexicalPatternGroup.TokenId = MengeTokenId.Whitespace;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("{Whitespace}+"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Regex, "NewLine", null);
lexicalPatternGroup.TokenId = MengeTokenId.NewLine;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("[\\n\\r]+"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Explicit, "Negation", classificationTypeProvider.GOP);
lexicalPatternGroup.TokenId = MengeTokenId.Negation;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("!"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Explicit, "Komma", classificationTypeProvider.GOP);
lexicalPatternGroup.TokenId = MengeTokenId.Komma;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern(","));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Explicit, "Punkt", classificationTypeProvider.GOP);
lexicalPatternGroup.TokenId = MengeTokenId.Punkt;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("."));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Explicit, "RegexpOder", classificationTypeProvider.Wildcard);
lexicalPatternGroup.TokenId = MengeTokenId.RegexpOder;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("|"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Regex, "Wildcard", classificationTypeProvider.Wildcard);
lexicalPatternGroup.TokenId = MengeTokenId.Wildcard;
lexicalPatternGroup.LookAheadPattern = "[,|+)]|{LineTerminator}";
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("\\*"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Explicit, "Bis", classificationTypeProvider.GOP);
lexicalPatternGroup.TokenId = MengeTokenId.Bis;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("-"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);
lexicalPatternGroup = new DynamicLexicalPatternGroup(DynamicLexicalPatternType.Explicit, "Plus", classificationTypeProvider.GOP);
lexicalPatternGroup.TokenId = MengeTokenId.Plus;
lexicalPatternGroup.Patterns.Add(new DynamicLexicalPattern("+"));
lexicalState.LexicalPatternGroups.Add(lexicalPatternGroup);

Token No. 2 and the whole string are parsed correctly by the LLParser, and all expected AST nodes occur.

Token No. 1 is not recognized; only the last character '*' is tokenized as Wildcard.

I tried explicitly defining a NonTerminal for exactly this pattern, adding the NonTerminal "endItem" to the gop definitions, and various other things. Nothing helped.

Maybe I just missed a small detail or character.

I'm really looking forward to your help! I hope you have all the information you need to understand the problem.

Thanks!

Comments (2)

Answer - Posted 13 years ago by Christel

Hi all,

solved!!!

As I expected, it was only a small character I had overlooked.

In the macro definition of {Gnr} I changed:

this.LexicalMacros.Add(new DynamicLexicalMacro("Gnr", "[\\d]{5}[A-Z*]?"));

to:

this.LexicalMacros.Add(new DynamicLexicalMacro("Gnr", "[\\d]{5}[A-Z]?[*]?"));

Now it works.
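
For readers comparing the two macro bodies, here is a minimal sketch of which strings each one accepts as a whole. It uses .NET's System.Text.RegularExpressions as a stand-in, which is an assumption: the SyntaxEditor lexer has its own regex engine and also applies look-ahead patterns, neither of which is modelled here. The class name GnrMacroDemo and the ^...$ anchors are added for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GnrMacroDemo
{
    // Anchored copies of the two macro bodies from the thread.
    // Old: five digits, then optionally ONE character that is a letter or '*'.
    public static readonly Regex OldGnr = new Regex(@"^[\d]{5}[A-Z*]?$");
    // New: five digits, then an optional letter, then an optional '*'.
    public static readonly Regex NewGnr = new Regex(@"^[\d]{5}[A-Z]?[*]?$");

    public static void Main()
    {
        foreach (var input in new[] { "12345", "12345A", "12345*", "12345A*" })
        {
            Console.WriteLine(
                $"{input,-8} old={OldGnr.IsMatch(input)} new={NewGnr.IsMatch(input)}");
        }
    }
}
```

Of these four inputs, only "12345A*" is treated differently: the old body rejects it, the new one accepts it. Both bodies accept "12345*" on its own, so this sketch shows what changed in the macro but does not by itself explain how the lexer's look-ahead behaves at the document end.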

Forget my long question!

Sorry

Answer - Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Christel,

I'm glad you found the problem.

I was also going to say: if the same pattern should produce one token or the other depending on whether it's at the end of the document, then I'd think the best place to handle that is the lexer. You'd make two pattern groups with the same pattern. One would have a look-ahead that validates it's at the document end; the other would have a look-ahead that validates it's not at the document end.

The regex guide in the documentation describes the syntax for checking for the document end.
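
A rough sketch of that two-group setup, written as a fragment in the style of the lexer snippets from the question. It reuses only the calls and variables already shown there (lexicalState, classificationTypeProvider, MengeTokenId). The two look-ahead strings are hypothetical placeholders, not real syntax: the thread does not give the document-end syntax, so both must be filled in from the regex guide.

```csharp
// Hypothetical placeholders, NOT valid patterns: replace both with the
// document-end checks described in the regex guide.
string atDocumentEndLookAhead = "TODO: succeeds only at the document end";
string notAtDocumentEndLookAhead = "TODO: succeeds only when more text follows";

// Group 1: {Gnr} followed by more text -> GopVorhanden.
var gopGroup = new DynamicLexicalPatternGroup(
    DynamicLexicalPatternType.Regex, "GopVorhanden", classificationTypeProvider.GOP);
gopGroup.TokenId = MengeTokenId.GopVorhanden;
gopGroup.LookAheadPattern = notAtDocumentEndLookAhead;
gopGroup.Patterns.Add(new DynamicLexicalPattern("{Gnr}"));
lexicalState.LexicalPatternGroups.Add(gopGroup);

// Group 2: the same {Gnr} pattern at the document end -> GopVorhandenAmEnde.
var gopAtEndGroup = new DynamicLexicalPatternGroup(
    DynamicLexicalPatternType.Regex, "GopVorhandenAmEnde", classificationTypeProvider.GOP);
gopAtEndGroup.TokenId = MengeTokenId.GopVorhandenAmEnde;
gopAtEndGroup.LookAheadPattern = atDocumentEndLookAhead;
gopAtEndGroup.Patterns.Add(new DynamicLexicalPattern("{Gnr}"));
lexicalState.LexicalPatternGroups.Add(gopAtEndGroup);
```

With mutually exclusive look-aheads, the lexer alone decides which of the two token IDs is emitted, so the grammar can keep separate terminals for each case without extra callbacks.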

Actipro Software Support

The latest build of this product (v25.1.0) was released 2 months ago, which was after the last post in this thread.
