
I'm very close to having my generic lexer/filter/parser/AST combination working. It will be used for the syntax-language-derived implementation of my language (and for my command-line compiler).
However, I'm puzzled about where the DocumentEnd token should come from. Is it normally generated by my lexer/filter, or is it somehow "magically" generated by the syntax editor? Is there any harm in my returning a DocumentEnd token when the lexer's GetNextToken is called past the last token in the string? If I remember correctly, your example implementation didn't return a DocumentEnd token.
The thing is, I'd like to use my lexer/filter/parser both inside and outside of the syntax editor. To do so, I either need to return a DocumentEnd token myself or need to NOT rely on it in my parser. Of course, if I don't rely on it in my parser, it's hard to avoid the "subset" problem, where my parser recognizes a subprogram as valid without noticing that there is invalid text after the end of the program text. Some languages consider this acceptable (Pascal comes to mind), but in my case it would be wholly unacceptable.
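To make the question concrete, here is a minimal sketch of the behavior I have in mind. It is written in Python rather than my actual code, and every name in it (`Lexer`, `get_next_token`, `parse_sum`, the toy "sum of numbers" grammar) is hypothetical and not part of the syntax editor's API. The lexer keeps returning a DocumentEnd token once the input is exhausted, and the parser only accepts the text if the token following the program is DocumentEnd.

```python
import re
from typing import NamedTuple

DOCUMENT_END = "DocumentEnd"  # hypothetical token kind, named after the one in my question


class Token(NamedTuple):
    kind: str
    text: str


class Lexer:
    """Toy lexer: integers, '+', and anything else as an 'Other' token."""

    def __init__(self, source: str) -> None:
        # Split into runs of digits or single non-whitespace characters.
        self._lexemes = re.findall(r"\d+|\S", source)
        self._pos = 0

    def get_next_token(self) -> Token:
        # Past the last token, keep returning DocumentEnd instead of failing.
        if self._pos >= len(self._lexemes):
            return Token(DOCUMENT_END, "")
        text = self._lexemes[self._pos]
        self._pos += 1
        if text.isdigit():
            return Token("Number", text)
        if text == "+":
            return Token("Plus", text)
        return Token("Other", text)


def parse_sum(source: str, require_end: bool = True) -> bool:
    """Recognize: Number ('+' Number)* followed by DocumentEnd.

    With require_end=False the parser stops after the longest valid
    prefix, which is exactly the "subset" problem described above.
    """
    lexer = Lexer(source)
    token = lexer.get_next_token()
    if token.kind != "Number":
        return False
    token = lexer.get_next_token()
    while token.kind == "Plus":
        token = lexer.get_next_token()
        if token.kind != "Number":
            return False
        token = lexer.get_next_token()
    # Only here does the DocumentEnd token matter.
    return token.kind == DOCUMENT_END or not require_end
```

With this sketch, `parse_sum("1 + 2 junk")` is rejected, while `parse_sum("1 + 2 junk", require_end=False)` is accepted even though there is trailing text, which is the behavior I want to rule out.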
Thanks,
Kelly Leahy Software Architect Milliman, USA