Posted 15 years ago by liggett78
Version: 4.0.0280
I'm having 2 problems with SE:
1) It seems like resizing a form with an anchored SyntaxEditor in it triggers some kind of really expensive process every time, resulting in a very sluggish resize (I have a 1MB read-only SQL document with semantic parsing turned off; only colorizing is enabled). My profiler indicates some kind of reparsing. Why? I don't do anything with the text, I just want it to be repainted. Is there any way to disable this reparsing?

2) Basically my SQL document is read-only and has around 4,000 lines (not a few, but not that many either). Now if I enable document changes and start typing something like "INSERT INTO table VALUES('" at the beginning of the document, as soon as I hit the opening "'" the editor falls into a deep thinking/parsing phase for 30 seconds to a minute. The same happens with the closing "'". Basically it renders the control useless. Any suggestions?

Comments (5)

Posted 15 years ago by Actipro Software Support - Cleveland, OH, USA
1) Unless you have word wrap on, it shouldn't be recalculating display lines. Maybe build a simple sample project that shows this and email it to our support address so we can take a look.

2) If your language is designed such that ' can span multiple lines, then what is most likely happening is that the editor is lexically parsing all the way to the end of the document every time you press "'", because the keystroke "toggles" all string portions with non-string portions.
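
The "toggle" effect can be sketched with a toy classifier (illustrative only; none of this is SyntaxEditor code). In a language where ' delimits strings that may span lines, one inserted quote inverts the string/non-string classification of nearly every character after it, so highlighting cannot stop early:

```python
# Toy classifier for a language where ' delimits multi-line strings.
# Each character is either "inside a string" or not; one inserted quote
# flips that classification for every character after it.
def classify(text):
    """Return one bool per character: True while inside a '-string."""
    states, in_string = [], False
    for ch in text:
        if ch == "'":
            states.append(True)          # quote chars are string-colored
            in_string = not in_string
        else:
            states.append(in_string)
    return states

doc = "SELECT 'a' FROM t; SELECT 'b' FROM t;"
before = classify(doc)
after = classify("'" + doc)              # user types ' at offset 0

# Every non-quote character downstream changes classification, so the
# lexer must rehighlight from the edit to the end of the document.
changed = sum(b != a for b, a in zip(before, after[1:]))
print(changed, "of", len(doc), "characters reclassified")
```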

In the next-generation design we're doing for the WPF version of SyntaxEditor (which we want to eventually port back to WinForms), it only highlights on demand, for what is needed to support the display. So this would be more of a non-issue in that scenario, since it wouldn't need to update through to the end of the document.

However, for now there's probably not much you can do if your language supports multi-line strings, other than writing a programmatic lexer, since those can generally parse up to several times faster than a dynamic-language one. It would still be doing the same work, but at least it would be doing it faster.


Actipro Software Support

Posted 15 years ago by liggett78
Regarding #2 above:
Now that I've implemented my own lexer (similar to how you did it in the tutorial for Simple), the problem has not gone away. The profiler indicates that with my document, 80 seconds are spent in u.a(Document, TextRange, int, int, ILexicalParseTarget) > TokenCollection.OnTokenParsed > TokenCollection.a(Int32, Int32) > System.Collections.ArrayList.Insert(Int32, Object). The lexer itself covers only a subset of SQL and is fast - it takes around 1 second for the 75k calls needed to parse my document.
Posted 15 years ago by Actipro Software Support - Cleveland, OH, USA
The problem there is probably that the Document.Tokens collection is being almost entirely rebuilt when you make those changes. In other words, your change removes thousands of tokens and adds thousands of new, different tokens.
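
A quick cost model shows why that rebuild dominates the profile (assumption: array-backed list semantics, as the System.Collections.ArrayList frames in the trace suggest; this is not Actipro's actual code). Replacing the token tail one token at a time shifts the untouched tail on every RemoveAt and Insert, which is quadratic in the token count; a single bulk splice is linear:

```python
# Count element moves inside an array-backed list (ArrayList-style) when
# the token tail is replaced, comparing per-token edits to a bulk splice.
def per_token_moves(n, start):
    """Element moves when tokens[start:] are swapped out one at a time."""
    moves = 0
    for i in range(start, n):
        moves += n - i - 1               # RemoveAt(i) shifts the tail left
        moves += n - i - 1               # Insert(i, ...) shifts it back
    return moves

def bulk_splice_moves(n, start):
    """Element moves for one RemoveRange + AddRange replacement."""
    return n - start

n, start = 100_000, 10                   # ~100k tokens, edit near the top
print(per_token_moves(n, start))         # ~1e10 moves: quadratic blow-up
print(bulk_splice_moves(n, start))       # ~1e5 moves: linear
```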

I loaded our dynamic SQL language and opened a file over 1MB. I see a similar issue due to the "toggle" effect, though with ours it takes 15-30 seconds to reparse. This is with outlining on, too.

I think the problem really requires a fundamental redesign of how we do syntax highlighting. The good news is that we are prototyping such a fundamental change in our design for the WPF SyntaxEditor, as mentioned in my previous post. The bad news is that until that design is done and well tested, it won't be able to be ported over to the WinForms version.

Some tips for you, though. A programmatic language should help, since it is faster than a dynamic one. Also, be sure you make tokens as long as possible; that reduces the number of tokens you are adding/removing with these sorts of changes. For things like whitespace, combine all line feeds, spaces, etc. into a single token. Maybe make string and comment content a single token per line instead of one for each word in the content. Those sorts of optimizations will help for now.
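
The whitespace tip above can be sketched with two illustrative tokenizers (these are generic regex sketches, not the SyntaxEditor lexer API): emitting one token per whitespace run instead of one per space or line feed shrinks the token collection noticeably.

```python
import re

# Two toy tokenizers: one token per whitespace character vs. one token
# per whitespace run. Both cover every character of the input.
def naive_tokens(text):
    return re.findall(r"\s|\w+|\S", text)        # each space is a token

def merged_tokens(text):
    return re.findall(r"\s+|\w+|[^\w\s]+", text) # runs become one token

sql = "SELECT  a ,  b\n  FROM  t\n  WHERE  a = 1 ;\n" * 1000
print(len(naive_tokens(sql)), "vs", len(merged_tokens(sql)))
```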


Actipro Software Support

Posted 15 years ago by liggett78
Well, yeah, this is what I figured out the day before yesterday - that it might help to have longer tokens. So I do combine spaces and line feeds, and I'm trying to combine table names (schema + actual name) and such, since they are not colorized. This reduces the number of tokens by 10-20 percent, but I still cannot skip semicolons and parentheses and need to report them as tokens, because otherwise they get the same color as the next keyword.

But I'm just wondering: if the document is reparsed all the way to the cold dead end every time I "toggle", why is SE fiddling with/removing/adding individual tokens at all? Why not just discard all tokens from the current position on and add all the newly parsed ones? That would be much more efficient - after all, this is what is done when I assign 1MB of text to the document.
Posted 15 years ago by Actipro Software Support - Cleveland, OH, USA
Yes, due to the nature of your language, the "toggle" means it does have to reparse from the change location to the end of the document. Loading a full document will go slightly faster than the toggle, since on a fresh load we just add tokens instead of removing and adding them. However, because we parse incrementally, the incremental parser doesn't "know" that the toggle will go all the way through to the end of the document until it actually gets that far - at which point it's too late to apply that optimization.
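
The early-out that incremental lexing relies on can be sketched like this (an assumed generic re-sync scheme, not Actipro's implementation): relexing restarts at the edit and stops as soon as the lexer state at an offset matches the pre-edit state at the corresponding old offset. An ordinary character re-syncs almost immediately; a quote inverts the in-string state for the rest of the file, so the re-sync point never arrives.

```python
# Generic incremental re-lex sketch: stop relexing once the lexer state
# matches the pre-edit state at the same (shifted) offset.
def states(text):
    """Lexer state (inside a '-delimited string?) entering each offset."""
    out, in_string = {}, False
    for i, ch in enumerate(text):
        out[i] = in_string
        if ch == "'":
            in_string = not in_string
    return out

def relex_after_insert(old_text, pos, ch):
    """Characters relexed after inserting `ch` at `pos`, with early out."""
    old = states(old_text)
    new_text = old_text[:pos] + ch + old_text[pos:]
    in_string = old.get(pos, False)      # state entering the edit point
    relexed = 0
    for i in range(pos, len(new_text)):
        # new offset i corresponds to old offset i - 1 once past the insert
        if i > pos and old.get(i - 1) == in_string:
            return relexed               # states agree again: stop early
        if new_text[i] == "'":
            in_string = not in_string
        relexed += 1
    return relexed

doc = "INSERT INTO t VALUES('a');\n" * 500
print(relex_after_insert(doc, 0, "x"))   # re-syncs after 1 character
print(relex_after_insert(doc, 0, "'"))   # must relex the whole document
```

The parser only discovers that the quote case never re-syncs by running all the way to the end, which is why the "just discard the tail up front" optimization can't be chosen in advance.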


Actipro Software Support

The latest build of this product (v24.1.0) was released 1 month ago, which was after the last post in this thread.
