Performance Tokentext

Posted 20 years ago by tobias weltner

I am sorry to fill this forum up these days ;-) so many questions, and so little time to eval your control.
Anyway, I noticed a considerable performance hit using this:
Editor.Document.GetTokenText(token)

In a 3000-line file with 700 tokens to find, it takes a couple of *seconds* to get the token texts.
Is there a faster way to do this?
The background is that I need to find the token texts of all tokens identified as variables to put them into a ListView.
I first suspected the ListView to be the bottleneck, but it really is the way SyntaxEditor extracts token texts.

Would it, for example, be possible to store the token text inside the SemanticParseData once the token is created?
I suspect I would need to do that inside the semantic parser. But how would I be notified if the token text changes later while the token type remains the same?

What I think I could do is use the token.modified property and, if the token is modified, add its text as SemanticParseData.

[Modified at 06/01/2005 04:39 AM]

Comments (6)

Posted 20 years ago by tobias weltner

While I have solved the problem now by using semanticParseData from within the semanticparser, I still would love to know if there is a faster way to access the tokentext...

When I implement a semanticParseData class that exposes a property like "Tag", why can't I write:
token.semanticparsedata.tag
?
Instead, I need to do it this way:
dim data as myparsedata
data = token.semanticparsedata
data.tag...

Is there a more straightforward way to do this?

[Modified at 06/01/2005 05:08 AM]

Posted 20 years ago by Actipro Software Support - Cleveland, OH, USA

In the semantic parser's PostParse method you always know the range of offsets that were lexically parsed.

For your semantic parse data question, remember that .NET is strongly typed: the compiler only knows that SemanticParseData is an object implementing an interface that has no members, so it cannot see your Tag property. Instead of assigning to a typed variable first, you can cast the data to your myparsedata type in place by using CType().
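To make the typing point concrete, here is a minimal C# sketch of the same situation. The types ISemanticParseData, MyParseData and Token below are hypothetical stand-ins written for illustration, not the actual SyntaxEditor classes; only the idea of a member-less interface and a cast comes from the reply above.

```csharp
using System;

// Hypothetical stand-in for the member-less interface described above.
interface ISemanticParseData { }

// Hypothetical user-defined parse data that adds a Tag property.
class MyParseData : ISemanticParseData
{
    public string Tag { get; set; }
}

// Hypothetical token type; only the statically typed property matters here.
class Token
{
    public ISemanticParseData SemanticParseData { get; set; }
}

static class CastDemo
{
    // Casts in place, so no temporary typed variable is needed.
    public static string ReadTag(Token token)
    {
        // token.SemanticParseData.Tag would not compile:
        // the static type of the property has no Tag member.
        return ((MyParseData)token.SemanticParseData).Tag;
    }

    static void Main()
    {
        var token = new Token { SemanticParseData = new MyParseData { Tag = "variable" } };
        Console.WriteLine(ReadTag(token));
    }
}
```

In VB the same in-place cast is written CType(token.SemanticParseData, MyParseData).Tag.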

As for getting text, we use a StringBuilder in the background to store document text. So calls to our GetSubstring method actually have to do this:

return textBuffer.ToString().Substring(offset, length);

Note it's building a large string of the text buffer before returning the text substring. That's because StringBuilder has no Substring method on it. The indexer of the Document class calls the indexer of the StringBuilder directly.

One optimization we could try is seeing if it speeds things up to build the substring ourselves for GetSubstring, using the indexer of the StringBuilder as long as the length requested was under a certain amount. We had compared the method we use now with that strategy a long time back and ended up choosing what we did because overall it gave better performance. However maybe a hybrid approach would be best. We'll look into that.
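The hybrid idea described above could look roughly like the following C# sketch. It is an illustration under stated assumptions, not the library's implementation: the class name, the method signature and the cutoff value of 256 are all made up here, since the thread gives no number.

```csharp
using System;
using System.Text;

static class HybridSubstring
{
    // Assumed cutoff; the thread does not say what "a certain amount" is.
    const int SmallLengthThreshold = 256;

    public static string GetSubstring(StringBuilder textBuffer, int offset, int length)
    {
        if (length <= SmallLengthThreshold)
        {
            // Short range: copy characters one by one through the indexer,
            // without building a string of the whole buffer.
            var chars = new char[length];
            for (int i = 0; i < length; i++)
                chars[i] = textBuffer[offset + i];
            return new string(chars);
        }

        // Long range: materialize the whole buffer, as in the line quoted above.
        return textBuffer.ToString().Substring(offset, length);
    }

    static void Main()
    {
        var buffer = new StringBuilder("Dim total As Integer = 0");
        Console.WriteLine(GetSubstring(buffer, 4, 5)); // prints "total"
    }
}
```

The framework's StringBuilder.ToString(startIndex, length) overload also extracts a range directly. Whether either variant would be faster inside the control is something only measurement can settle.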

But back on your topic... it might currently be fastest if you just grab Document.Text once before all your parsing and use the Substring method on that string. That way you aren't constructing the string version of the Document's StringBuilder on every call to GetSubstring.
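A minimal C# sketch of that workaround follows. A plain StringBuilder stands in for the document, and TokenRange is a hypothetical type that assumes nothing more than an offset and a length per token; with the real control the snapshot would come from Document.Text.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class SnapshotDemo
{
    // Hypothetical token shape: only an offset and a length are assumed.
    public readonly struct TokenRange
    {
        public TokenRange(int offset, int length) { Offset = offset; Length = length; }
        public int Offset { get; }
        public int Length { get; }
    }

    public static List<string> GetTokenTexts(StringBuilder textBuffer, IEnumerable<TokenRange> tokens)
    {
        // Build the full string once (the equivalent of reading Document.Text once)...
        string snapshot = textBuffer.ToString();

        // ...then take every token text from that one string.
        var texts = new List<string>();
        foreach (var token in tokens)
            texts.Add(snapshot.Substring(token.Offset, token.Length));
        return texts;
    }

    static void Main()
    {
        var buffer = new StringBuilder("x = y + z");
        var tokens = new[] { new TokenRange(0, 1), new TokenRange(4, 1), new TokenRange(8, 1) };
        Console.WriteLine(string.Join(",", GetTokenTexts(buffer, tokens))); // prints "x,y,z"
    }
}
```

The snapshot is only valid as long as the document text does not change, so it should be taken again after any edit.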

[Modified at 06/01/2005 05:48 AM]

Actipro Software Support

Posted 20 years ago by tobias weltner

Very cool suggestion... I just tried it out, and it works 1000x faster... thanks! I don't even need SemanticParseData as a means of caching any more...
By the way: your support here in the forum is excellent, it is a pleasure working with you guys!

Posted 20 years ago by Actipro Software Support - Cleveland, OH, USA

We did some experimenting with various ways of storing/getting text substrings, and the way we have it now is best. However, if you are going to be extracting a lot of substrings from the same text, it is fastest to get Document.Text, store it in a string, and call the Substring method on that string like Tobias did. That will run faster than repeated calls to Document.GetSubstring.

Actipro Software Support

Posted 20 years ago by Actipro Software Support - Cleveland, OH, USA

Just FYI, we made a code change that will prevent you from needing to do anything special and should return substrings very fast now. It will appear in the next maintenance release. One user reported that on a huge document with lots of outlining parsing and token parsing, he noticed a visible speed increase.

Actipro Software Support

Posted 20 years ago by tobias weltner

Thanks for the info.
I am pretty happy already with the workaround and will check whether your new approach parallels the speed or beats it... ;-)

The latest build of this product (v25.1.0) was released 21 days ago, which was after the last post in this thread.
