Token Reader and performance on large files

SyntaxEditor for WPF Forum

Posted 5 years ago by Simon Sprott
Version: 14.2.0610

Hi

We are trying to optimise our application to work with large files, and on the whole the main bottleneck we are hitting is calls to GetReader(offset) where we are specifying large offsets (the call that actually takes the time is ITextSnapshotReader.Token, but I guess it's lazy loading it).

I'm guessing it's lexing the whole file up to the offset (which is time consuming for large files)?

Is it possible to cache the lexer tokens, or get a token reader in the middle of a file without incurring such a hit?

 

The kinds of scenarios we are looking at are auto indent, where we need the token stream, and spell checking, where we need the tokens to break out specific types of text to check.

Thanks

Simon

Comments (2)

Posted 5 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Simon,

SyntaxEditor has optimizations where it leaves little breadcrumbs for each line in the current snapshot, so that lexing can pick up at the closest line when determining tokens.  Under normal circumstances, when you get a reader and start reading tokens, it will get the closest incremental lexing state and begin there.  It shouldn't have to go to the top of the document in that case.

If you are using another snapshot that isn't in sync with that lexer incremental state data, then it would need to start lexing from the top of the document.
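
To illustrate the breadcrumb idea in general terms, here is a toy sketch in Python. It is not Actipro's implementation or API: the class `CheckpointedLexer`, the helper `_step`, and the single "inside a /* */ comment" state are all hypothetical, chosen only to show how storing the lexer state at each line start lets a scan resume near a large offset instead of at the top of the document.

```python
import bisect


def _step(text, i, in_comment):
    """Advance one lexing step; return (next_index, next_state)."""
    if in_comment and text.startswith("*/", i):
        return i + 2, False
    if not in_comment and text.startswith("/*", i):
        return i + 2, True
    return i + 1, in_comment


class CheckpointedLexer:
    """Toy lexer that records, for each line start, whether we are inside a comment."""

    def __init__(self, text):
        self.text = text
        self.line_starts = [0]      # offset where each line begins
        self.line_states = [False]  # lexer state at that offset (the "breadcrumb")
        i, state = 0, False
        while i < len(text):
            at_newline = text[i] == "\n"
            i, state = _step(text, i, state)
            if at_newline:
                self.line_starts.append(i)
                self.line_states.append(state)

    def in_comment_at(self, offset):
        """Return (state at offset, characters scanned to get there)."""
        # Find the closest line start at or before the offset.
        line = bisect.bisect_right(self.line_starts, offset) - 1
        start = self.line_starts[line]
        i, state = start, self.line_states[line]
        # Resume lexing from the breadcrumb rather than from offset 0.
        while i < offset:
            i, state = _step(self.text, i, state)
        return state, offset - start


text = "a = 1\n/* start\nstill comment\nend */\nb = 2\n"
lexer = CheckpointedLexer(text)
state, scanned = lexer.in_comment_at(text.index("comment"))
print(state, scanned)
```

The cost of a query is bounded by the length of one line rather than by the offset. If the stored states belonged to a different version of the text, they could not be trusted and the scan would have to restart from offset 0, which matches the out-of-sync snapshot case described above.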

In scenarios like delimiter highlighting and auto-completion where we also have to scan tokens, I believe it's been pulling from the incremental lexing data fine though, allowing it to perform fast.  I'm not sure what is different in your scenario.


Actipro Software Support

Posted 5 years ago by Simon Sprott

Thanks, I re-worked the code and it works as you described.

Basically I'm trying to replace the WinForms SyntaxEditor with the WPF SyntaxEditor, as it seems to cope much better with large files. There is a lot of our code which uses it, so it's difficult spotting the culprit when we see performance issues.

The latest build of this product (v2019.1 build 0683) was released 1 month ago, which was after the last post in this thread.
