Question about encoding

Posted 19 years ago by Karl Grambow

I was hoping someone might be able to help with a question I have about encodings and how SyntaxEditor's handling of them differs from that of a standard RichTextBox (rtb).

I'm retrieving some data from a database and populating a rtb with that data, some of which is unicode data and therefore has the appearance of square characters (symbols) and such.

If I put that unicode data into a SyntaxEditor document (with UnicodeEnabled=true) it's perfect. However, if I put that data into a standard rtb it gets transformed into Chinese/Japanese characters (and subsequently the font has changed).

I could use SyntaxEditor in place of the rtb but I'm not using any color-coding and I need to be able to load lots of data so I need it to be fast.

To be honest, I can live with the way that the standard rtb transforms the text. After all, it makes no difference that either character (square symbols or Chinese character) is illegible.

But, what really is driving me nuts is that it IS different! And SyntaxEditor handles this perfectly so I'm just curious as to how SyntaxEditor does this and if I can easily apply the same method to a standard rtb.

Appreciate any advice I can get.

Thanks,

Karl

Comments (3)

Posted 19 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Karl,

In the improvements we're doing for version 4.0, we're eliminating the UnicodeEnabled property and just making it work by default. We also are making a ton of optimizations so that large file loading is much faster, and you can REALLY speed things up and use less memory by turning off various features.

But anyhow, back to your question... when UnicodeEnabled is set, we get data from the clipboard differently, as in this code. That's about it:

// Prefer the Unicode clipboard format when Unicode support is enabled
if ((context.SyntaxEditor.UnicodeEnabled) && (clipboardData.GetDataPresent(DataFormats.UnicodeText)))
    text = clipboardData.GetData(DataFormats.UnicodeText) as string;
// Otherwise fall back to the plain (ANSI) text format
else if (clipboardData.GetDataPresent(DataFormats.Text))
    text = clipboardData.GetData(DataFormats.Text) as string;

.NET stores characters in Unicode by default so that's all you really need to do.
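
As a rough sketch of the same idea outside SyntaxEditor, assuming a Windows Forms application: the helper below prefers the Unicode clipboard format and falls back to plain text. The class and method names (ClipboardText, ReadText) are hypothetical and are not part of SyntaxEditor or the .NET Framework. Only IDataObject, DataFormats and Clipboard are real Windows Forms types.

```csharp
using System.Windows.Forms;

// Hypothetical helper for illustration only; not SyntaxEditor's actual code.
public static class ClipboardText
{
    // Returns the text held by a data object, preferring the Unicode format.
    // Returns null when the data object holds no text at all.
    public static string ReadText(IDataObject data)
    {
        if (data == null)
            return null;

        // UnicodeText keeps characters outside the ANSI code page intact.
        if (data.GetDataPresent(DataFormats.UnicodeText))
            return data.GetData(DataFormats.UnicodeText) as string;

        // Fall back to the ANSI text format if that is all that is available.
        if (data.GetDataPresent(DataFormats.Text))
            return data.GetData(DataFormats.Text) as string;

        return null;
    }
}
```

It could be called with something like ClipboardText.ReadText(Clipboard.GetDataObject()) and the result assigned to a control's text. This only covers how the text is read. It does not change which font a standard rtb picks when it displays the characters.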

Actipro Software Support

Posted 19 years ago by Karl Grambow

Thanks for the reply. As I suspected, it's simple enough. But I still cannot get it working with the standard rtb. It could be something to do with the fonts: even though I'm using Courier New in both SyntaxEditor and the rtb, it seems that the rtb needs to change the font to Arial Unicode MS when displaying unicode characters.

Anyway, I'm not going to stress over this anymore.

About version 4.0 :). I'll be banging your door down to get it! Any idea when you might start beta testing it (sorry if I sound impatient)?

I'm particularly interested in using a SyntaxEditor with all the features turned off for maximum efficiency so that it is fast and can handle lots of data, potentially hundreds of thousands of rows (or even millions, theoretically).

In the meantime, is there anything I can do with the current SyntaxEditor to make it faster (even if it's just a bit faster)? If not, I guess I'll just have to wait.

Thanks,

Karl

Posted 19 years ago by Actipro Software Support - Cleveland, OH, USA

We're working hard on SE 4.0 but not sure of a due date yet, even for beta. What we have is looking really nice so far though. It's going to be a huge update.

With 3.1, there isn't much you can do to speed things up. Maybe make a language that marks the entire file in one token. In 4.0, we're reworking every bit of code so that if a feature isn't used, it doesn't run unnecessary code that supports that feature. This has really helped speed things up. Also we have options in 4.0 for turning off lexical and semantic parsing, which speeds things up as well.

Actipro Software Support

The latest build of this product (v25.1.0) was released 1 month ago, which was after the last post in this thread.
