Get real offset depending on line terminator

SyntaxEditor for Windows Forms Forum

Posted 13 years ago by Tobias Lingemann - Software Development Engineer, Vector Informatik GmbH
Version: 4.0.0289
Hi,

I need to pass the current offset to my compiler.
Since SyntaxEditor stores only a newline (\n) as the line terminator internally, the offset I get does not account for carriage returns, so my compiler receives the wrong offset.
For example, the offset after "Foo\r\nBar" should be 8, but because of SyntaxEditor's internal representation I get 7.
Is there a way to translate the internal offset to the real offset?

Also, DocumentPosition contains only the line number, not the offset from the beginning of the line. If I could get that, I could work around the issue.


[Modified at 09/07/2011 08:02 AM]


Best regards, Tobias Lingemann.

Comments (5)

Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA
Hi Tobias,

Actually, DocumentPosition also indicates the Character, which is the offset relative to the start of the line.
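
To illustrate how a line/character pair relates to a single offset in newline-only text, here is a minimal Python sketch. It is not the SyntaxEditor API: the function names are hypothetical, and it assumes zero-based line and character indexes and text whose only line terminator is \n.

```python
def position_to_offset(text: str, line: int, character: int) -> int:
    """Convert a zero-based (line, character) pair to an offset in LF-only text."""
    lines = text.split("\n")
    if not 0 <= line < len(lines):
        raise ValueError("line index out of range")
    if not 0 <= character <= len(lines[line]):
        raise ValueError("character index out of range")
    # Every preceding line contributes its length plus one "\n".
    return sum(len(l) + 1 for l in lines[:line]) + character


def offset_to_position(text: str, offset: int) -> tuple[int, int]:
    """Convert an offset in LF-only text back to a zero-based (line, character) pair."""
    if not 0 <= offset <= len(text):
        raise ValueError("offset out of range")
    line = text.count("\n", 0, offset)            # terminators before the offset
    line_start = text.rfind("\n", 0, offset) + 1  # 0 when no "\n" precedes the offset
    return line, offset - line_start
```

For the text "Foo\nBar", the end of the document is offset 7, which corresponds to line 1, character 3.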


Actipro Software Support

Posted 13 years ago by Tobias Lingemann - Software Development Engineer, Vector Informatik GmbH
Okay, this helps. I might have a problem with Unicode characters; I have to check that.
So again, the other way (translating the internal offset to the "real" offset) is not possible?

However, I think the documentation and the name are misleading. The documentation says "The character of this DocumentPosition", so I assumed it was the actual character as an integer code (like 0x41 for A) and not the offset.


Best regards, Tobias Lingemann.

Posted 13 years ago by Tobias Lingemann - Software Development Engineer, Vector Informatik GmbH
The same thing applies to Unicode characters that take more than one byte too, right?
Simply put, your offset counter counts characters and not the number of bytes in the text file.
So I need a second counter in my compiler that counts characters the way you do (at the moment it only knows the number of bytes).


Best regards, Tobias Lingemann.

Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA
Hi Tobias,

Well, if you know all your real line terminators are \r\n, you could take our offset and add the line index to it to get the "real" offset in the source file.

For instance with this file, where the | is the caret and the target position...
A
B
C|

It's our offset 5 and line index 2. So the "real" offset if this text had \r\n line ends would be 5 + 2 = 7.
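
The "offset plus line index" rule can be sketched in a few lines of Python. This is only an illustration under the assumption stated above, namely that every terminator in the file on disk is \r\n. The function name is hypothetical, and the line index is derived by counting the \n characters that precede the offset.

```python
def to_crlf_offset(lf_text: str, lf_offset: int) -> int:
    """Translate an offset in LF-only text to the offset in the same text with CRLF terminators.

    Assumes every line terminator in the original file is "\r\n".
    """
    if not 0 <= lf_offset <= len(lf_text):
        raise ValueError("offset out of range")
    # The zero-based line index equals the number of "\n" before the offset,
    # and each of those lines hides exactly one "\r".
    line_index = lf_text.count("\n", 0, lf_offset)
    return lf_offset + line_index


# Example from the original question: internal offset 7 in "Foo\nBar".
internal_text = "Foo\nBar"
real_offset = to_crlf_offset(internal_text, 7)
```

With the example from the original question, the internal offset 7 lies on line index 1, so `real_offset` is 8, which matches the length of "Foo\r\nBar".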

The DocumentPosition is a line/character index pair, similar to the information you see in the Visual Studio status bar where it indicates Ln (Line) and Ch (Character).

Once any file is loaded into a .NET string, you can no longer track bytes since strings are Unicode. So all our offsets and text positions are relative to Unicode character counts. It's up to the encoding you load and save with to determine how to translate that from/to bytes.
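
As a rough illustration of why the byte position depends on the encoding, the Python sketch below encodes the text in front of a character offset and measures the result. The function name is hypothetical. Note two assumptions: Python counts Unicode code points while .NET strings count UTF-16 code units (the two agree for characters in the Basic Multilingual Plane), and a byte order mark at the start of the file is not included.

```python
def char_offset_to_byte_offset(text: str, char_offset: int, encoding: str = "utf-8") -> int:
    """Return how many bytes the first `char_offset` characters occupy in `encoding`."""
    if not 0 <= char_offset <= len(text):
        raise ValueError("offset out of range")
    # Encode only the prefix; its byte length is the byte offset.
    return len(text[:char_offset].encode(encoding))


sample = "日本語\nBar"
utf8_bytes = char_offset_to_byte_offset(sample, 4)                # three 3-byte characters + "\n"
utf16_bytes = char_offset_to_byte_offset(sample, 4, "utf-16-le")  # four 2-byte code units
```

The same character offset 4 maps to byte offset 10 in UTF-8 and 8 in UTF-16 (little-endian, without a byte order mark).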


Actipro Software Support

Posted 13 years ago by Tobias Lingemann - Software Development Engineer, Vector Informatik GmbH
Thanks for your answer.
Unfortunately, we cannot guarantee the line terminators are \r\n unless we save the file with the correct encoding before we load it.
Our compiler currently works with byte positions, and we don't want to parse the current line looking for multi-byte characters (mainly Asian characters) to get the byte offset. So we will implement a second counter in our compiler that returns the same offset as SyntaxEditor does.
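
The idea of a second counter that counts the way the editor does could look roughly like the Python sketch below. It is not the actual compiler code: the function name is hypothetical, and it assumes the file's encoding is known and that the editor collapses each \r\n pair into a single \n. A real compiler would more likely maintain a running character count while scanning instead of re-decoding the prefix on every call.

```python
def byte_offset_to_editor_offset(data: bytes, byte_offset: int, encoding: str = "utf-8") -> int:
    """Map a byte offset in the raw file to a character offset in LF-normalized text."""
    if not 0 <= byte_offset <= len(data):
        raise ValueError("offset out of range")
    # Decoding raises UnicodeDecodeError if the offset splits a multi-byte character.
    prefix = data[:byte_offset].decode(encoding)
    # Collapse only real "\r\n" pairs, so files with mixed terminators still work.
    return len(prefix.replace("\r\n", "\n"))


raw = "日本語\r\nBar".encode("utf-8")  # 9 + 2 + 3 = 14 bytes
editor_offset = byte_offset_to_editor_offset(raw, len(raw))
```

Here the byte offset 14 at the end of the file corresponds to `editor_offset` 7: three characters, one newline, and three more characters.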

[Modified at 09/09/2011 12:43 AM]


Best regards, Tobias Lingemann.

