Getting my tokens

SyntaxEditor for WPF Forum

Posted 7 years ago by Josh Luth - Software Developer, Esha Research
Version: 13.1.0580

This seems like it should have a fairly basic, straightforward answer, but it is eluding me.

I have defined my own language. It has three pattern groups, which we'll call Pattern A, Pattern B, and Whitespace. Grammatically, they appear as A + Whitespace + B.

The A pattern group contains some Regex patterns, and the B pattern group contains some Explicit patterns.

In my SyntaxEditor, I am listening to SyntaxEditor.DocumentTextChanged. At that point I want to parse my A into a property and parse my B into a property. So I have two parse methods, each of which takes the CurrentSnapshot.

I get a reader from that snapshot and then look at the tokens. The token for A matches and I get A. When it goes on to parse the rest of the snapshot for B, it reads each character as a token (a default token) rather than reading the entire token that matches one of the explicit patterns. Below is the code I'm using to parse the snapshot for Pattern B.

 

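private void parsePatternB(ITextSnapshot snapshot)
{
    _myStuff = null;

    // Start reading tokens at the beginning of the snapshot.
    var reader = snapshot.GetReader(0);
    var token = reader.Token;

    if (token.Id == TokenId.MyStuff)
    {
        // The very first token is already a Pattern B match.
        _myStuff = reader.TokenText;
    }
    else
    {
        // Otherwise advance token by token until the first Pattern B match.
        while (reader.GoToNextToken())
        {
            token = reader.Token;

            if (token.Id != TokenId.MyStuff)
            {
                continue;
            }

            _myStuff = reader.TokenText;
            break;
        }
    }

    updateThings();
}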

Is there an easier way to get my tokens out of the CurrentSnapshot?

Comments (5)

Posted 7 years ago by Josh Luth - Software Developer, Esha Research

OK, I did a bit more digging, and it appears that what is really going on is that my TokenTaggerProvider is created before my Explicit patterns are added to my lexer. The order of operations during creation seems extremely rigid here. In my generated SyntaxLanguage class, the lexer is registered first, then the token tagger. I subclass my SyntaxLanguage and register my parser, passing in a collection of patterns to add to the lexer. So by the time "CreateTokenReader" is called, all of the patterns are there, but it seems that this is too late. Is there a better way to do it? Code samples below.

 

[System.CodeDom.Compiler.GeneratedCodeAttribute("LanguageDesigner", "12.2.573.0")]
    public partial class MySyntaxLanguage : SyntaxLanguage {
        
        /// <summary>
        /// Initializes a new instance of the <c>MySyntaxLanguage</c> class.
        /// </summary>
        public MySyntaxLanguage() : 
                base("My") {

            // Create a classification type provider and register its classification types
            MyClassificationTypeProvider classificationTypeProvider = new MyClassificationTypeProvider();
            classificationTypeProvider.RegisterAll();

            // Register an ILexer service that can tokenize text
            this.RegisterService<ILexer>(new MyLexer(classificationTypeProvider));

            // Register an ICodeDocumentTaggerProvider service that creates a token tagger for
            //   each document using the language
            this.RegisterService(new MyTokenTaggerProvider(classificationTypeProvider));

            // Register an IExampleTextProvider service that provides example text
            this.RegisterService<IExampleTextProvider>(new MyExampleTextProvider());
        }
    }

 

public class MyLanguage : MySyntaxLanguage
    {
        public MyLanguage(ICollection<Pattern> patterns)
        {
            this.RegisterParser(new MyLanguageParser(patterns));
            RegisterService(new CodeDocumentTaggerProvider<ParseErrorTagger>(typeof(ParseErrorTagger)));
            RegisterService(new SquiggleTagQuickInfoProvider());
            RegisterService(new AdornmentManagerProvider<MyAdormentManager>(typeof(MyAdormentManager)));
        }
    }

 

public class MyLanguageParser : LLParserBase
    {
        private readonly ICollection<Pattern> _patterns;

        public MyLanguageParser(ICollection<Pattern> patterns)
            : this(new AmountLanguageGrammer())
        {
            _patterns = patterns;
        }

        /// <summary>
        /// Initializes a new instance of the <c>MyLanguageParser</c> class.
        /// </summary>
        /// <param name="grammar">The <see cref="P:ActiproSoftware.Text.Parsing.LLParser.Implementation.LLParserBase.Grammar"/> to use.</param>
        public MyLanguageParser(Grammar grammar)
            : base(grammar) { }

        /// <summary>
        /// Creates an <see cref="T:ActiproSoftware.Text.Parsing.LLParser.ITokenReader"/> that is used by the parser to read through tokens.
        /// </summary>
        /// <param name="reader">The <see cref="T:ActiproSoftware.Text.ITextBufferReader"/> that provides access to the text buffer.</param>
        /// <returns>
        /// An <see cref="T:ActiproSoftware.Text.Parsing.LLParser.ITokenReader"/> that is used by the parser to read through tokens.
        /// </returns>
        public override ITokenReader CreateTokenReader(ITextBufferReader reader)
        {
            var myLexer = new MyLexer(new MyClassificationTypeProvider());

            using (myLexer.CreateChangeBatch())
            {
                var patternGroups = myLexer.DefaultLexicalState.LexicalPatternGroups;
                var thePatternGroup = patternGroups.FirstOrDefault(g => g.TokenId.Equals(MyTokenId.Xxx));

                if (thePatternGroup != null)
                {
                    thePatternGroup.Patterns.Clear();

                    foreach (var pattern in _patterns)
                    {
                        thePatternGroup.Patterns.Add(new DynamicLexicalPattern(pattern.ToString()));
                    }
                }
            }

            return new MyTokenReader(reader, myLexer);
        }
    }

Posted 7 years ago by Josh Luth - Software Developer, Esha Research

So I figured out my issue. I had two instances of my lexer. One was created in the Language and one was being created in my Parser. So now I just pass in the already created one from the Language into the Parser and everything is working as expected.

Posted 7 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Josh,

Based on your findings, I think the problem isn't the order that you do things but rather that your parsing code isn't effectively using the modification you've made for the lexer used in the token reader.

In your CreateTokenReader code you create a token reader and pass it the updated lexer to use.  That's all good and should work.

But back in your first post, you have code that uses a call to snapshot.GetReader.  That will end up using the ILexer that is registered on your language, and not the one from the parser's token reader.  If you are using our LL(*) Parser Framework, I'm not really sure how the parsePatternB code comes into play.  But if you were reading tokens using the ITokenReader instead, it would probably give you correct results.
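
[Editor's note] The symptom described in this thread (one default token per character) is what you would expect when the text is tokenized by a lexer instance that never received the extra patterns. Here is a minimal, self-contained sketch of that situation. ToyLexer and TwoLexerDemo are hypothetical stand-ins written for illustration only; they are not part of the Actipro API and do not reflect how its lexers are implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a lexer. This is NOT the Actipro ILexer API.
public sealed class ToyLexer
{
    // Explicit patterns known to this particular instance only.
    public List<string> ExplicitPatterns { get; } = new List<string>();

    // Emits a "MyStuff" token for the longest explicit pattern found at the
    // current offset, otherwise a one-character "Default" token.
    public List<KeyValuePair<string, string>> Tokenize(string text)
    {
        var tokens = new List<KeyValuePair<string, string>>();
        int offset = 0;
        while (offset < text.Length)
        {
            string best = null;
            foreach (string pattern in ExplicitPatterns)
            {
                bool fits = pattern.Length > 0 && offset + pattern.Length <= text.Length;
                if (fits
                    && string.CompareOrdinal(text, offset, pattern, 0, pattern.Length) == 0
                    && (best == null || pattern.Length > best.Length))
                {
                    best = pattern;
                }
            }

            if (best != null)
            {
                tokens.Add(new KeyValuePair<string, string>("MyStuff", best));
                offset += best.Length;
            }
            else
            {
                tokens.Add(new KeyValuePair<string, string>("Default", text.Substring(offset, 1)));
                offset += 1;
            }
        }
        return tokens;
    }
}

public static class TwoLexerDemo
{
    // Returns the token counts produced by the unconfigured and configured lexers.
    public static int[] Run()
    {
        var languageLexer = new ToyLexer(); // plays the lexer registered on the language
        var parserLexer = new ToyLexer();   // plays the lexer built in CreateTokenReader
        parserLexer.ExplicitPatterns.Add("gram");

        const string text = "gram";
        return new[] { languageLexer.Tokenize(text).Count, parserLexer.Tokenize(text).Count };
    }
}
```

In this toy model, TwoLexerDemo.Run() gives 4 tokens for the unconfigured instance and 1 for the configured one, which mirrors the difference between reading through the language's lexer and reading through the parser's token reader.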

Your solution of using the same lexer might not be a good idea though because the parser will generally operate on a separate thread from the UI, where the normal language lexer is used.  So you could run into cross thread issues if you do that.
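
[Editor's note] One way to keep the two lexers consistent without sharing a single mutable instance across threads is to build each one from the same pattern list. The sketch below assumes that approach; PatternLexer and PatternLexerFactory are hypothetical names invented for illustration, not Actipro types, and whether this fits a given project depends on how its language and parser are constructed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a configurable lexer. This is NOT the Actipro API.
public sealed class PatternLexer
{
    private readonly string[] _patterns;

    public PatternLexer(IEnumerable<string> patterns)
    {
        // Copy the patterns so this instance owns its own configuration.
        _patterns = patterns.ToArray();
    }

    public IReadOnlyList<string> Patterns => _patterns;
}

// Hypothetical factory that hands out independently configured lexers.
public sealed class PatternLexerFactory
{
    private readonly string[] _patterns;

    public PatternLexerFactory(IEnumerable<string> patterns)
    {
        // Take a snapshot of the pattern list once, up front.
        _patterns = patterns.ToArray();
    }

    // Each caller (UI side, parser side) gets its own instance with the same patterns.
    public PatternLexer Create()
    {
        return new PatternLexer(_patterns);
    }
}
```

For example, creating a factory with new PatternLexerFactory(new[] { "gram", "cup" }) and calling Create() twice yields two distinct objects that list the same patterns, so neither side mutates state that the other is reading.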


Actipro Software Support

Posted 7 years ago by Josh Luth - Software Developer, Esha Research

So are you saying that, rather than passing my one lexer around, I should continue to create the lexer in my parser and then do something like this in my "parsePatternB" code?

 

var reader =
    ((MyLanguageParser) SyntaxEditor.Document.Language.GetParser()).CreateTokenReader(
        snapshot.GetMergedBufferReader());

Posted 7 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Josh,

I'm not quite sure how you are using the parsing capabilities of our product right now, such as whether you are using our LL(*) Parser Framework or are doing the parsePatternB work completely out of band of an IParser, etc.  So I couldn't really say what the best approach is without knowing more or seeing a simple example.  You could explain more about what's going on in terms of architecture, or write to our support address with a sample project showing everything.  Thanks!


Actipro Software Support
