End of line character problem

SyntaxEditor for WPF Forum

Posted 2 years ago by SPF - Appeon
Version: 22.1.1

SyntaxEditor counts \r\n as a single character, but our replacement data counts \r\n as two characters. This mismatch causes errors in text replacement and other offset-based operations.

How can we solve this?

[Modified 2 years ago]

Comments (8)

Posted 2 years ago by Actipro Software Support - Cleveland, OH, USA

Hello,

Yes, for performance reasons we internally normalize line terminators to a single \n character, but text methods like GetSubstring or GetText can output them again as any kind of line terminator (\r\n, \r, or \n).
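As a rough illustration of why offsets drift once terminators are normalized, here is a minimal Python sketch. It does not use the SyntaxEditor API; the helper names and the sample text are assumptions made only for this example.

```python
def normalize_line_terminators(text):
    # Collapse CRLF and lone CR into a single LF, as described above.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def restore_line_terminators(lf_text, terminator):
    # Re-expand LF into whichever terminator the caller wants on output.
    return lf_text.replace("\n", terminator)


# Hypothetical sample: three lines with CRLF terminators.
original = "ab\r\ncd\r\nef"
normalized = normalize_line_terminators(original)

# The same character ("e") sits at a different offset in each form,
# and the gap equals the number of line terminators before it.
crlf_offset = original.index("e")
lf_offset = normalized.index("e")
lines_before = normalized.count("\n", 0, lf_offset)
print(crlf_offset, lf_offset, lines_before)
```

The difference between the two offsets grows by one for every CRLF terminator that precedes the character.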

Can you provide some more detail on your scenario?  Then we can provide some guidance.


Actipro Software Support

Posted 2 years ago by SPF - Appeon

1. I need to get offsets and positions with "\r\n" counted as two characters. How should I do that?

2. I have a text replacement range that was computed with "\r\n" counted as two characters. How do I apply that replacement?

Posted 2 years ago by Actipro Software Support - Cleveland, OH, USA

If you have line/character positions, those will still be the same regardless of line terminator kind.  Therefore, getting to a position-relative way of tracking characters or ranges is the best thing to do.

Translating positions to original source-relative offsets would need a minor delta value.  If you know the line/character position, and you know that your original source always has \r\n line terminators, you could use our normal PositionToOffset method calls, but that offset will be relative to a single \n line terminator.  You know the difference in offsets would be 1 for each line, so to get an original source-relative offset from a position, you could do:

PositionToOffset(position) + position.Line
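The arithmetic can be checked with a small Python sketch. Here `position_to_lf_offset` is a hypothetical stand-in for the editor's PositionToOffset over \n-normalized text, not the real API, and the sample text is an assumption; the sketch also assumes every terminator in the original source is \r\n.

```python
def position_to_lf_offset(lf_text, line, character):
    # Hypothetical stand-in for PositionToOffset: walks the LF-normalized
    # text to the start of the requested line, then adds the character index.
    offset = 0
    for _ in range(line):
        offset = lf_text.index("\n", offset) + 1
    return offset + character


def position_to_crlf_offset(lf_text, line, character):
    # Each preceding line contributed one extra "\r" in the original
    # source, so the delta is simply the zero-based line index.
    return position_to_lf_offset(lf_text, line, character) + line


# Hypothetical sample: the same three lines in both forms.
crlf_source = "ab\r\ncd\r\nef"
lf_text = "ab\ncd\nef"

offset = position_to_crlf_offset(lf_text, 2, 0)
print(offset, repr(crlf_source[offset]))
```

If the original source mixes terminator kinds, the constant delta of one per line no longer holds and a per-line table would be needed instead.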

The hardest part is going from an original source-relative offset to a position.  For that you might need to scan the text yourself and store the original source-relative offsets of where each line starts in a list.  Then use a binary search mechanism to take in an original source-relative offset, find the list index at which it starts (which is the position's Line) and subtract the list item value from the original source-relative offset being searched to find the position's Character.
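The scan-and-binary-search idea described above could look roughly like the following Python sketch. The function names and sample text are assumptions made for illustration, and the scan treats only \n as ending a line (so it covers \r\n and \n sources but not lone \r terminators).

```python
from bisect import bisect_right


def build_line_starts(source):
    # One pass over the original (un-normalized) source, recording the
    # source-relative offset at which each line starts.
    starts = [0]
    for index, ch in enumerate(source):
        if ch == "\n":
            starts.append(index + 1)
    return starts


def offset_to_position(line_starts, offset):
    # Binary search for the last line whose start is <= offset.
    line = bisect_right(line_starts, offset) - 1
    character = offset - line_starts[line]
    return line, character


# Hypothetical sample with CRLF terminators.
source = "ab\r\ncd\r\nef"
line_starts = build_line_starts(source)
print(line_starts)
print(offset_to_position(line_starts, 8))
print(offset_to_position(line_starts, 5))
```

The list is built once per document version in linear time, after which each lookup costs a logarithmic number of comparisons in the line count. An offset that points at the \n of a \r\n pair maps to a character index one past the \r, so callers may want to clamp it to the line length.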


Actipro Software Support

Posted 2 years ago by Sunshine - Appeon

We have run into a similar problem. From your explanation, we understand that the editor replaces all line-ending characters with \n as an internal performance optimization.

However, the last conversion you mentioned, from an offset to a position, would carry a huge performance penalty for large text documents.

One of our features is full C# code editing support, and we use a third-party analysis engine for syntax analysis. Some interfaces in the .NET parser (Roslyn) take offsets as parameters, while others take positions.

So we inevitably need to call the editor's conversion methods, but the current behavior is hard for us to accept. Handling line endings should be the editor's responsibility, yet we have to do additional processing ourselves, and our biggest concern is the performance cost of converting offsets to positions.

As a user, I should only need to care about loading, saving, and manipulating the document. However, because SyntaxEditor does not preserve the differences in line endings, data obtained directly through its API may be inconsistent with the real data and has to be processed again by a decorator.

This is so unreasonable!

Posted 2 years ago by Sunshine - Appeon

We tried to resolve this by unifying the line endings of all dummy documents to \n, including the parser's dummy document sync. However, a C# project can contain an .editorconfig file that controls code style, including the line-ending setting. In that case, the line endings of the analyzer's internal document no longer match those of the actual file on disk, which may cause unexpected results such as code style mismatches in the analyzer.

Posted 2 years ago by Actipro Software Support - Cleveland, OH, USA

Hello,

We agree that properly tracking original line terminators is a good enhancement idea for the product, and would make things a lot easier for scenarios like yours. We are adding a TODO item for tracking line terminator kinds per line, with some initial thoughts. Any changes in this area will require a lot more research on impact because there are many areas in the product where we assume a single '\n' line terminator for lexing speed.


Actipro Software Support

Posted 2 years ago by SPF - Appeon

Is there any progress on this matter? We are currently greatly affected by this problem.

Posted 2 years ago by Actipro Software Support - Cleveland, OH, USA

Hello,

I'm sorry, but we are working in other areas and haven't started on it. The WPF SyntaxEditor has been this way for ~15 years. Changing it isn't something that can be easily done, and it would probably have to wait for a new major version due to possible breaking changes.


Actipro Software Support

The latest build of this product (v24.1.1) was released 1 month ago, which was after the last post in this thread.
