Text input is 11MB...Syntax editor nonperformant

Posted 19 years ago by Bob Rundle - Director, Dynamic Workflow, JOA Oil & Gas BV

I am trying to use the SyntaxEditor on some very big files.

The example file I am working with is 11MB. This is actually a small file for what I am expecting the user to throw at it.

I put together a test problem, and this is what I found:

CPU (m:ss)   Memory   Notes
4:27         747 MB   11 MB file, outlining + language definition
2:25         767 MB   11 MB file, language definition only
0:44         623 MB   11 MB file, no language definition
0:06         110 MB   11 MB file, RichTextBox

I am quite astonished at the inefficiency of the SyntaxEditor. 4 1/2 minutes if I want outlining! 2 1/2 minutes if I want a language! And why when there is no language definition is the memory consumption 6 times that of the RichTextBox?

What am I missing here? Why is this thing so inefficient?

Regards,
Bob Rundle

Comments (3)

Posted 19 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Bob,

This is all being addressed in v4.0. We have gone through the entire product and reduced memory usage. For instance, document line memory has been reduced by probably a factor of 6 for normal usage. Undo/redo uses a lot less. We rewrote some core structures to cut down memory usage. When certain features aren't in use, we no longer take up the memory needed for them. For example, if you turn off word wrap and outlining, the display lines are no longer tracked and simply base themselves on document lines.

Another big thing is that languages are very abstracted. This allows you to get really low level and make a completely optimized language definition. Dynamic languages (the name given to v3.1 style languages) have to store a lot of information in each token to support language merging and other features. In v4.0 you can create your own token classes instead of using the DynamicToken, since tokens are now an IToken interface. The only minimum requirements for them to display are that they provide a StartOffset/Length and an ID. Actually, SyntaxEditor will display highlighting without a length too. So at a minimum you can use an int for the StartOffset and a byte for the ID. That's 5 bytes as opposed to 20-some for each token. That is a huge reduction when you consider how many tokens are in a large document.
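To make the 5-bytes-per-token arithmetic concrete, here is a minimal sketch in Python rather than .NET. It does not use the SyntaxEditor API: `CompactTokenStore` and its methods are hypothetical names, and deriving a token's length from the next token's start offset is an assumption that tokens are stored contiguously and in order.

```python
import struct

# Hypothetical packed layout: 4-byte signed start offset + 1-byte token ID,
# little-endian with no padding, so each record is exactly 5 bytes.
TOKEN = struct.Struct("<iB")


class CompactTokenStore:
    """Stores tokens back to back in one bytearray (illustrative only)."""

    def __init__(self):
        self._buf = bytearray()

    def add(self, start_offset, token_id):
        # Append one 5-byte record.
        self._buf += TOKEN.pack(start_offset, token_id)

    def __len__(self):
        return len(self._buf) // TOKEN.size

    def get(self, index):
        # Returns (start_offset, token_id) for the token at this index.
        return TOKEN.unpack_from(self._buf, index * TOKEN.size)

    def length_of(self, index, document_length):
        # Assumption: tokens are contiguous, so a token ends where the
        # next one starts (or at the end of the document).
        start, _ = self.get(index)
        if index + 1 < len(self):
            end = self.get(index + 1)[0]
        else:
            end = document_length
        return end - start

    def payload_bytes(self):
        return len(self._buf)
```

With this layout, a million tokens occupy about 5 MB of payload, versus roughly 20 MB or more at 20-some bytes per token.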

Performance-wise, you can write a programmatic lexical parser for your language that doesn't take part in our mergeable language features. If you do that, you can really increase the speed of your lexical parser, possibly many times.
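As a rough illustration of what "programmatic" means here, the sketch below is a hand-written, single-pass scanner for a toy language, in Python. It is not the SyntaxEditor lexer interface: the token IDs and the `lex` function are hypothetical, and it emits only a start offset and an ID per token.

```python
# Hypothetical token IDs for a toy language.
WHITESPACE, IDENTIFIER, NUMBER, SYMBOL = range(4)


def lex(text):
    """Hand-written scanner: yields (start_offset, token_id) in one pass."""
    i, n = 0, len(text)
    while i < n:
        start = i
        ch = text[i]
        if ch.isspace():
            # Consume a run of whitespace as one token.
            while i < n and text[i].isspace():
                i += 1
            yield start, WHITESPACE
        elif ch.isalpha() or ch == "_":
            # Identifier: letter or underscore, then letters/digits/underscores.
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            yield start, IDENTIFIER
        elif ch.isdigit():
            # Number: a run of digits.
            while i < n and text[i].isdigit():
                i += 1
            yield start, NUMBER
        else:
            # Anything else is a one-character symbol.
            i += 1
            yield start, SYMBOL
```

Each character is examined a small, fixed number of times and no pattern tables are consulted, which is the kind of shortcut a language-specific lexer can take.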

Other things... v4.0 lets you disable various parsing features. So you can turn off outlining, semantic parsing, and lexical parsing all independently. With all those off, a 10MB file loads almost instantly. In our C# add-on for v4.0, we use a programmatic lexical parser and can lexically parse and highlight a 10MB doc in 6 seconds. Turning on semantic parsing and outlining will slow it down a little more though. And that language is designed to be mergeable too, so if we took out the merging features, it could possibly speed up by another good chunk.

As you can see, a lot of improvements in these areas are in v4.0. We're anxious to get it out.

Actipro Software Support

Posted 19 years ago by Bob Rundle - Director, Dynamic Workflow, JOA Oil & Gas BV

This is indeed good news. We would like to be Beta testers.

However we believe that this will not be enough for us. We would really like to have a tool that does not require reading in everything at the start but faults in text from disk as it is needed.
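A minimal sketch of the kind of on-demand loading we have in mind, in Python for brevity. `LazyLineFile` is a hypothetical name, UTF-8 text is assumed, and the file is still scanned once to index line starts, but only the offsets stay in memory and the text itself is read from disk when requested.

```python
import os
import tempfile


class LazyLineFile:
    """Indexes line start offsets once, then reads lines from disk on demand."""

    def __init__(self, path):
        self._path = path
        self._offsets = [0]  # byte offset where each line starts, plus EOF
        with open(path, "rb") as f:
            pos = 0
            for line in f:  # streams the file; lines are not retained
                pos += len(line)
                self._offsets.append(pos)

    def line_count(self):
        return len(self._offsets) - 1

    def read_lines(self, first, count):
        """Returns up to `count` lines starting at line index `first`."""
        first = max(first, 0)
        last = min(first + count, self.line_count())
        if first >= last:
            return []
        base = self._offsets[first]
        with open(self._path, "rb") as f:
            f.seek(base)
            data = f.read(self._offsets[last] - base)
        # Slice the block by the indexed offsets and strip line terminators.
        return [
            data[self._offsets[i] - base:self._offsets[i + 1] - base]
            .decode("utf-8")
            .rstrip("\r\n")
            for i in range(first, last)
        ]


if __name__ == "__main__":
    # Small demonstration with a temporary file.
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
        tmp.write(b"first\nsecond\nthird")
    try:
        doc = LazyLineFile(tmp.name)
        print(doc.line_count())
        print(doc.read_lines(1, 2))
    finally:
        os.remove(tmp.name)
```

An editor built this way would fetch only the lines near the viewport. Edits would need a separate overlay structure, which this sketch does not cover.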

We realize that there is probably not a universal need for this kind of control, which is why we have been asking for a source code license for the control so we can implement this feature. We have not gotten any response from our inquiries into availability of source code for this control. Is there any hope for us?

We have done a comprehensive survey of all editing controls on the market and the Actipro control is the best there is. But it is still not good enough for us. Perhaps 4.0 will change our minds, but I doubt it. If we cannot get source code we will have to write our own control from scratch.

Regards
Bob Rundle

Posted 19 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Bob,

We make it a point to always reply to every e-mail we receive. Rarely (but it has happened) our spam filter catches valid emails from customers. We try to monitor the spam folders for false positives but perhaps we missed yours.

Try sending your email again and maybe reply to this post right after you do so we can be sure to look in our spam filter in case it got caught there.

We're always happy to reply to customers.

Actipro Software Support

The latest build of this product (v25.1.0) was released 1 month ago, which was after the last post in this thread.
