How to add a TokenKey programmatically? - SyntaxEditor for WPF Forum

Posted 13 years ago by Christel

Version: 12.1.0561

Hi,

Here is the situation:

At runtime I extend the XML langdef file for our custom language with a new ExplicitPatternGroup like this:

XElement extension = new XElement("{http://schemas.actiprosoftware.com/langdef/1.0}ExplicitPatternGroup",
	new XAttribute("Key", "GopMenge"),
	new XAttribute("TokenId", 200),
	new XAttribute("TokenKey", "GopMenge"),
	new XAttribute("ClassificationTypeKey", "GopMenge"),
	new XElement("{http://schemas.actiprosoftware.com/langdef/1.0}ExplicitPatterns",
		new XCData(formattedExtensionMengen)));

I can only do this at runtime, because the XCData string "formattedExtensionMengen" is not known until the application has started.

After that, I use a SyntaxLanguageDefinitionSerializer to initialize the language with the extended langdef file.

First question:
Is there another way to add or extend an explicit pattern group at runtime?

Besides that, I use the LL Parser Framework with a CustomGrammar class.

Now the problem, and the questions that follow from it:

After parsing the editor document, the string contained in the XCData string is recognized as a GopMenge token, but that token gets a red squiggle line, even though it is recognized as the correct token and the string itself is correct as well.

The reason is that the CustomGrammar class doesn't define a Terminal or NonTerminal object for 'GopMenge', so the GopMenge token cannot be matched as part of a valid AST.
The CustomGrammar class has no Terminal definition for GopMenge because the required TokenId is only known at runtime, not before.

My idea now is to add a TokenId.GopMenge to the MengeTokenId class programmatically, by inheriting from the generated CustomTokenId.g.cs class.

Then I can use this TokenId to define the appropriate Terminal and NonTerminal in the grammar class.

Is that a useful approach? Are there any other ideas? Perhaps a kind of placeholder somewhere in the lexer?

Do you understand my problem?

I need a TokenId before runtime to use in the CustomGrammar class, but it is known at runtime at the earliest.

Thanks for your much appreciated help!!

Comments (7)

Answer - Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Christel,

Instead of doing it via modification of the XML, I would just modify the actual lexer once it's loaded. So add your pattern group to the XML but don't define any patterns in it, then load up the language definition normally. Then look in this documentation topic:

Text/Parsing Framework - Lexing / Basic Concepts

It has a section on "Changing Mergable Lexers." You'd do that and inside of the call it lists, cast it to a DynamicLexer and get the appropriate containing DynamicLexicalState. DynamicLexicalState has a LexicalPatternGroups collection that is indexed by the DynamicLexicalPatternGroup.Key property (which would be "GopMenge"). Get your pattern group and programmatically update the patterns.

By doing the above, you'd have a placeholder for this token kind and that should fix your grammar issue.
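
As a rough illustration of the placeholder idea, here is a minimal, self-contained C# sketch. It does not use the Actipro assemblies: SketchLexer, PatternGroup and the constant MengeTokenId.GopMenge are hypothetical stand-ins for the DynamicLexer, the DynamicLexicalPatternGroup and the generated token ID class. What it tries to show is that the token ID is fixed at compile time, so a grammar can reference it, while the patterns are swapped in at runtime by looking the group up by its key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are NOT Actipro types.
static class MengeTokenId
{
    public const int GopMenge = 200; // known at compile time, so a grammar can use it
}

sealed class PatternGroup
{
    public string Key { get; }
    public int TokenId { get; }
    public List<string> Patterns { get; } = new List<string>();

    public PatternGroup(string key, int tokenId)
    {
        Key = key;
        TokenId = tokenId;
    }
}

sealed class SketchLexer
{
    private readonly Dictionary<string, PatternGroup> groups =
        new Dictionary<string, PatternGroup>();

    public void AddGroup(PatternGroup group) => groups[group.Key] = group;

    // Groups are looked up by key, like the collection described above.
    public PatternGroup this[string key] => groups[key];

    // Returns the token ID of the longest explicit pattern starting at offset,
    // or 0 (with length 1) when no pattern matches.
    public int Match(string text, int offset, out int length)
    {
        int id = 0;
        int bestLength = 0;
        foreach (PatternGroup group in groups.Values)
        {
            foreach (string pattern in group.Patterns)
            {
                if (pattern.Length > bestLength
                    && string.CompareOrdinal(text, offset, pattern, 0, pattern.Length) == 0)
                {
                    id = group.TokenId;
                    bestLength = pattern.Length;
                }
            }
        }
        length = bestLength == 0 ? 1 : bestLength;
        return id;
    }
}

static class Program
{
    static void Main()
    {
        // "Load" the language: the group exists, but only as a placeholder.
        var lexer = new SketchLexer();
        lexer.AddGroup(new PatternGroup("GopMenge", MengeTokenId.GopMenge));

        // Later, at runtime, fill in the real patterns (example values are made up).
        PatternGroup group = lexer["GopMenge"];
        group.Patterns.Clear();
        group.Patterns.AddRange(new[] { "01100", "01100AB" });

        int id = lexer.Match("01100AB,", 0, out int length);
        Console.WriteLine($"{id} {length}"); // prints "200 7"
    }
}
```

In the real lexer the pattern update would be made inside the change batch from the "Changing Mergable Lexers" section; the sketch leaves that part out.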

Actipro Software Support

Posted 13 years ago by Christel

That is exactly what I'm looking for!

But I didn't know how and where to access the DynamicLexicalPatternGroup (GopMenge).

With your answer, the documentation and the really helpful examples I'll try it now, and I feel very sure that this will work perfectly!

Thanks a million for always quick and detailed help from experts!

Posted 13 years ago by Christel

Hi again,

I have now followed your suggestions:

The langdef file now contains a placeholder for the DynamicLexicalPatternGroup that is updated later; its pattern is a single whitespace, as there is no way to define an empty pattern.
I load the langdef file, get the lexer, look for the placeholder pattern group, and then programmatically update the patterns...

...and I still have the same problem:

After parsing the editor document, the string contained in the patterns is recognized as a GopMenge token (I verified this by implementing the HitTest interface),

but the recognized GopMenge token gets a red squiggle line, even though it is recognized as the correct token and the string itself is correct as well.

I tried debugging it in the LL Parser Debugger, but there it worked correctly and no parse errors were shown. However, the LL Parser Debugger never shows red squiggle lines for errors, so it is clear that I can't see them there.

When debugging in my small test project, I can see that the TokenReader gets a tokenId = 0 when it is reading the letters of the string.
The grammar class doesn't define any Terminal for single letters, as this would not work together with all the other Terminals that need to be verified.

Could this be the reason that an error is raised in the OnError callback of my grammar class?

In the callback, for the moment, I do nothing but " return ParserErrorResults.Default; "
Later, I would like to report a custom error there.

Another thing I noticed: if the placeholder pattern contains only a whitespace (before it is modified programmatically), the red squiggle line appears only below the first character of the editor string.
If the placeholder pattern contains a placeholder string of 5 letters, the squiggle line appears below the first 5 letters.
This made me think that the TokenReader didn't get the changed lexical patterns.

I should add that the programmatic update of the patterns is not done in the constructor of my syntax language but in the setter of the property where the new patterns are set on the editor. Could that be a reason?
Do I have to do anything to refresh the editor's language after updating the patterns with CreateChangeBatch?

Posted 13 years ago by Christel

Sorry, sorry, sorry, forget the last comment!!!

I accidentally found out where the problem is: in the OnError callback of my grammar class.

But I need your help again!!

This is what happens in the OnError callback:

private IParserErrorResult GopItemError(IParserState state)
{
    // Report a custom error
    state.ReportError(ParseErrorLevel.Error, "GOP (5 Ziffen, opt. 2 Buchstaben) erwartet");

    state.TokenReader.Advance();

    // Return a value telling the parser to not report errors and continue on
    return ParserErrorResults.Ignore;
}

But without

state.TokenReader.Advance() I don't get any red squiggle line, even if there is an error.
I did exactly the same as in your example from the Sample Browser.

Changing the returned ParserErrorResults value gives:

- ParserErrorResults.Default: only the first two letters were underlined with a red squiggle line,
but no further squiggle lines are shown for any other errors that occur

- ParserErrorResults.Ignore: the whole editor string for the GopMenge pattern is underlined with a red squiggle line,
and any other errors are squiggled as well

Any ideas?

Thanks, and sorry for the complicated last comment.

Posted 13 years ago by Christel

Hi,

after sleeping on it and looking at your documentation again, this is what I have now:

OnError callback:

private IParserErrorResult GopItemError(IParserState state)
{
    // Report a custom error
    state.ReportError(ParseErrorLevel.Error, "GOP (5 Ziffen, opt. 2 Buchstaben) erwartet");

    state.TokenReader.AdvanceTo(new int[] { MengeTokenId.Komma, MengeTokenId.DocumentEnd });

    return ParserErrorResults.Continue;
}

After parsing the editor document, the strings contained in the patterns are recognized as "GopMenge" tokens (I verified this by implementing the HitTest interface).

First question: what recognizes this string as a "GopMenge" token, given that the TokenReader doesn't (it always provides TokenId '0', see below)?

Problem:

Every recognized GopMenge token gets a red squiggle line below its first letter, even though it is recognized as the correct token and the string itself is correct as well.

Could this be the reason that an error is raised in the OnError callback of my grammar class?

How can I avoid an error when reading the first letter of the strings, so that no red squiggle lines appear?
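
The difference between calling Advance() and AdvanceTo(...) in the two callbacks above can be sketched without the parser framework. The following is a minimal, self-contained C# illustration, not Actipro code: SketchTokenReader, the TokenIds values and CountErrors are hypothetical stand-ins. It only models the general idea of skipping to a synchronization token (here a comma or the document end) so that one bad region yields one error instead of one error per bad token.

```csharp
using System;

// Hypothetical token IDs for illustration; the real values come from the generated class.
static class TokenIds
{
    public const int Unknown = 0;
    public const int DocumentEnd = 1;
    public const int Komma = 2;
    public const int GopMenge = 200;
}

// Hypothetical stand-in for a token reader; NOT Actipro's token reader type.
sealed class SketchTokenReader
{
    private readonly int[] tokenIds;

    public SketchTokenReader(int[] tokenIds)
    {
        this.tokenIds = tokenIds;
    }

    public int Position { get; private set; }

    public bool IsAtEnd => Position >= tokenIds.Length;

    // The next token to examine, or DocumentEnd once everything is consumed.
    public int LookAhead => IsAtEnd ? TokenIds.DocumentEnd : tokenIds[Position];

    // Consumes exactly one token.
    public void Advance()
    {
        if (!IsAtEnd) Position++;
    }

    // Consumes tokens until the look-ahead is one of the given IDs or the end is reached.
    public void AdvanceTo(params int[] stopIds)
    {
        while (!IsAtEnd && Array.IndexOf(stopIds, LookAhead) < 0) Position++;
    }
}

static class Program
{
    // Counts how often the "error callback" would run for the given token stream.
    public static int CountErrors(int[] tokens, bool skipToSyncToken)
    {
        var reader = new SketchTokenReader(tokens);
        int errors = 0;
        while (!reader.IsAtEnd)
        {
            int next = reader.LookAhead;
            if (next == TokenIds.GopMenge || next == TokenIds.Komma)
            {
                reader.Advance(); // valid token, consume it
                continue;
            }

            errors++; // the callback would report an error here
            if (skipToSyncToken)
                reader.AdvanceTo(TokenIds.Komma, TokenIds.DocumentEnd);
            else
                reader.Advance();
        }
        return errors;
    }

    static void Main()
    {
        // Five unrecognized single-character tokens, then a comma and a valid token.
        int[] tokens = { 0, 0, 0, 0, 0, TokenIds.Komma, TokenIds.GopMenge };

        Console.WriteLine(CountErrors(tokens, skipToSyncToken: false)); // prints 5
        Console.WriteLine(CountErrors(tokens, skipToSyncToken: true));  // prints 1
    }
}
```

How the framework turns reported errors into squiggle lines, and what each ParserErrorResults value does, is not modeled here.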

Answer - Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Christel,

It's hard to say without debugging your grammar. But the lexer is always responsible for tokenizing text. You can assign different lexer types to the one used on the language for syntax highlighting, and one used for the grammar parser’s TokenReader. Normally both will be instances of the same lexer type. But perhaps you are assigning different lexer types if you are seeing different tokenization?

What you should do is load up your compiled parser assembly into our Language Designer's LL Parser Debugger and step through it for some simple sample code. You can debug it just like a normal language. It will do things like highlight the next token and you can see what errors occur. They will appear in an errors tool window (won’t have wavy lines). As you step through watch to see what the next allowable tokens are based on your grammar location and note the look-ahead token that is showing in the lower left tab as you step through the parsing. You'll be able to see where things match and where they start to go wrong. Then based on your error handling you can see if you are truly advancing past a bad token properly or repeatedly keep examining the same bad token. The documentation topics on error handling are very useful for assisting in design here too.

If your tokenId is 0 then it sounds like your lexer isn’t working correctly and isn’t tokenizing that portion of your code. So it might be feeding back single character tokens since it doesn’t know what to do with them.
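
To make the "different lexers" scenario concrete, here is a minimal, self-contained C# sketch. ToyLexer is a hypothetical stand-in, not an Actipro type, and the pattern values are made up. It assumes two separate lexer instances that both start with the placeholder pattern, of which only one receives the runtime update. The updated instance produces a single GopMenge token, while the other falls back to single-character tokens with ID 0.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a lexer; NOT an Actipro type.
sealed class ToyLexer
{
    public const int GopMenge = 200;

    // Starts with a single-whitespace placeholder pattern.
    public List<string> GopPatterns { get; } = new List<string> { " " };

    // Splits text into (Id, Text) tokens. Text that matches no pattern is
    // returned as one-character tokens with Id 0.
    public List<(int Id, string Text)> Tokenize(string text)
    {
        var tokens = new List<(int Id, string Text)>();
        int offset = 0;
        while (offset < text.Length)
        {
            string best = null;
            foreach (string pattern in GopPatterns)
            {
                bool longer = best == null || pattern.Length > best.Length;
                if (pattern.Length > 0 && longer
                    && string.CompareOrdinal(text, offset, pattern, 0, pattern.Length) == 0)
                {
                    best = pattern;
                }
            }

            if (best != null)
            {
                tokens.Add((GopMenge, best));
                offset += best.Length;
            }
            else
            {
                tokens.Add((0, text.Substring(offset, 1)));
                offset++;
            }
        }
        return tokens;
    }
}

static class Program
{
    static void Main()
    {
        var highlightingLexer = new ToyLexer();
        var parserLexer = new ToyLexer(); // a second, separate instance

        // The runtime update reaches only one of the two instances.
        highlightingLexer.GopPatterns.Clear();
        highlightingLexer.GopPatterns.Add("01100");

        Console.WriteLine(highlightingLexer.Tokenize("01100").Count); // prints 1
        Console.WriteLine(parserLexer.Tokenize("01100").Count);       // prints 5
    }
}
```

Sharing one lexer instance between highlighting and parsing, or applying the same update to both, removes the mismatch in this model.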

Actipro Software Support

Posted 13 years ago by Christel

Thanks for patiently reading all my comments.

In fact, you gave me the key in your last answer: I was using two different lexer instances.

Now I have changed this and everything works as I expect!

Thanks for your help!!
