Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA
What is everyone's opinion on line terminators? Right now, for ease and speed of parsing, SyntaxEditor converts all line terminators internally to \n when a file is loaded. You can export the text back out using any standard line terminator combination via Document.GetText, etc.

Do you like how this is done, or do you think that line terminators should be preserved as they were loaded? VS preserves them; however, the downside is that it makes parsing a lot more complex, because in many cases we have to look for \r, \n, or \r\n. Just the fact that a terminator could be one or two chars makes it much more complex.
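
To illustrate the difference in complexity, here is a minimal Python sketch (not SyntaxEditor code; both function names are hypothetical) that splits text into lines. The first version has to handle \r, \n, and the two-character \r\n, while the second assumes the text was already normalized to \n.

```python
def split_lines_mixed(text):
    # Hypothetical helper: split on \r\n, \r, or \n,
    # treating \r\n as a single two-character terminator.
    lines = []
    start = 0
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\r":
            lines.append(text[start:i])
            # Look ahead one character to detect the \r\n pair.
            i += 2 if i + 1 < n and text[i + 1] == "\n" else 1
            start = i
        elif ch == "\n":
            lines.append(text[start:i])
            i += 1
            start = i
        else:
            i += 1
    lines.append(text[start:])
    return lines


def split_lines_normalized(text):
    # Hypothetical helper: valid only when every terminator is a single \n.
    return text.split("\n")
```

The look-ahead in the first function is the kind of special case that every scanner touching line boundaries would need if terminators were preserved as loaded.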

I've generally been of the opinion that converting to \n only makes everything much more straightforward, and when you get Document.Text, it converts the line terminators back to \r\n for you, so it's no big deal. Knowing that line terminators are a single \n when parsing does help speed up code.
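
The normalize-on-load, convert-on-export idea can be sketched in a few lines of Python. This is an illustration only, not how SyntaxEditor implements it; `normalize_newlines` and `export_text` are hypothetical names, and the \r\n default simply mirrors the Document.Text behavior described above.

```python
import re

# Matches any of the three standard terminators; \r\n is listed first so
# the pair is consumed as one terminator rather than as \r followed by \n.
_TERMINATOR = re.compile(r"\r\n|\r|\n")


def normalize_newlines(text):
    # On load: collapse every terminator to a single \n.
    return _TERMINATOR.sub("\n", text)


def export_text(normalized, terminator="\r\n"):
    # On export: expand each \n to the terminator the caller asks for.
    return normalized.replace("\n", terminator)
```

For example, `export_text(normalize_newlines(raw), "\r\n")` would produce consistent \r\n endings regardless of what mixture `raw` contained.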


Actipro Software Support

Comments (9)

Posted 16 years ago by Kelly Leahy - Software Architect, Milliman
I personally like the way this is currently handled. I was forced to deal with the parsing difficulties of line terminators when building my lexer, since my lexer is used both 'within' and 'outside of' the syntax editor framework (I have a shared lexer that I use for internal purposes as well as providing the syntax editor access to in order to do UI work).

As for how it works in the editor itself, I'd say that it's nice being able to ensure that all line termination is normalized to one of the three forms. I wouldn't mind if it weren't, so long as there were a good way to normalize quickly with the help of the syntax editor - I hate writing that code myself :)

I do think that the current design of the syntax editor is not clearly documented on this particular question - it took me a while to figure out why all my adornments kept showing up in the wrong places - since I was using my "external" parse offsets instead of those in SemanticParseData on the syntax editor for some things and they didn't match :)
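
For anyone hitting the same mismatch, here is a rough Python sketch of translating an offset computed against the original text into an offset in the \n-normalized text. It assumes the only length-changing conversion is \r\n to \n (a lone \r becomes \n without shifting anything); the function names are hypothetical and not part of SyntaxEditor.

```python
import bisect


def build_crlf_index(original):
    # Hypothetical helper: sorted offsets of the \n in each \r\n pair.
    # Each such \n disappears when the pair collapses to a single \n.
    return [
        i + 1
        for i in range(len(original) - 1)
        if original[i] == "\r" and original[i + 1] == "\n"
    ]


def to_normalized_offset(original_offset, crlf_index):
    # Subtract one for every \r\n pair whose \n sits at or before the offset.
    removed = bisect.bisect_right(crlf_index, original_offset)
    return original_offset - removed
```

The index is built once per document, after which each lookup is a binary search, so translating many adornment offsets stays cheap.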

So, I guess what I'm saying is I don't really care, but I'd say do whatever is best for you to deliver it quickly. I can't think of any reason why the files need to preserve exactly the same line endings that they currently have - it seems like the application can define the line endings easily enough when saving and let SyntaxEditor do what it wants to with them (just as it does now). To me, it seems like a bad design if you need to care what type of line endings were in the file originally and preserve nonsensical mixtures of them on output - since in any case the editor can only 'create' one kind.

Kelly Leahy Software Architect Milliman, USA

Posted 16 years ago by Gareth - Director, Slyce Software Limited
There are many advantages to using the \n approach, and no disadvantages as far as I can see. I do a lot of text processing in my application as well, and I've also taken this approach. Speed is of absolute prime importance for a parser, so unless someone has a VERY good reason not to go this route I'd say this is a no-brainer - stick with the \n linebreaks.
Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA
Good points Kelly. I should also mention that I don't believe we've ever had a customer tell us they wanted line terminator preservation so I think it's about as low as can be on the desired feature radar.


Actipro Software Support

Posted 16 years ago by Joseph Albahari
The only reason to preserve \r\n characters literally might be if a document was inconsistent in its use of end-of-line markers. I can't see any valid reason why a document would want to be deliberately inconsistent - so there's no loss in normalizing them.
Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA
It seems like everyone is in agreement on this issue: just keep it like it is in SyntaxEditor today... normalized and fast.


Actipro Software Support

Posted 16 years ago by tobias weltner
agree too
-Tobias
Posted 16 years ago by Eric J. Smith - CodeSmith Tools, LLC
I guess my only question would be... is converting line endings when you open a file causing you to scan the entire file before working with it? Is this an issue in large files? Personally, I like it the way it is, but just wondering on the one thing.
Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA
No, I don't believe the upfront hit for this conversion is a factor in performance at all.


Actipro Software Support

Posted 16 years ago by Matt Whitfield
\n is the one - for definite.