Posted 19 years ago
by Actipro Software Support
- Cleveland, OH, USA
One of the major new features in SyntaxEditor 4.0 is support for programmatic lexical parsers, aimed at advanced developers who want to optimize parsing speed. We've mocked up a simple programmatic lexical parser that is optimized for C# text and parses it in a fraction of the time taken by the SyntaxEditor 3.1 C# language definition.
Let's talk about the implementation. SyntaxLanguage is now an abstract class that defines the base requirements for any language. Our mock programmatic C# language definition is a class named CSharpSyntaxLanguage, which inherits from SyntaxLanguage. SyntaxLanguage defines a new lexical parsing method, and CSharpSyntaxLanguage overrides that method to use our programmatic lexical parser for C#. That lexical parser creates CSharpToken objects, which are inserted into the Document's Tokens collection. A CSharpTokenID enumeration defines each keyword, operator, and other language element, and each CSharpToken has an ID property that corresponds to a CSharpTokenID value.
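To make that structure concrete, here is a minimal, language-neutral sketch in Python. It is not the SyntaxEditor API: `LanguageSketch`, `MiniCSharpLanguage`, `TokenId`, `Token`, and `lex` are hypothetical names standing in for the roles of SyntaxLanguage, CSharpSyntaxLanguage, CSharpTokenID, CSharpToken, and the overridden lexical parsing method, and the scanner only handles a handful of keywords, identifiers, integers, and punctuation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class TokenId(Enum):
    """Hypothetical stand-in for an enumeration like CSharpTokenID."""
    CLASS = auto()
    PUBLIC = auto()
    RETURN = auto()
    IDENTIFIER = auto()
    NUMBER = auto()
    OPEN_BRACE = auto()
    CLOSE_BRACE = auto()
    SEMICOLON = auto()
    UNKNOWN = auto()


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a token object that carries an ID."""
    id: TokenId
    start: int
    length: int


class LanguageSketch(ABC):
    """Abstract base: every language must supply its own lexical parser."""

    @abstractmethod
    def lex(self, text: str) -> List[Token]:
        ...


# Each keyword and punctuation mark maps to its own ID.
KEYWORDS = {
    "class": TokenId.CLASS,
    "public": TokenId.PUBLIC,
    "return": TokenId.RETURN,
}
PUNCTUATION = {
    "{": TokenId.OPEN_BRACE,
    "}": TokenId.CLOSE_BRACE,
    ";": TokenId.SEMICOLON,
}


class MiniCSharpLanguage(LanguageSketch):
    """Hand-written scanner for a tiny, assumed subset of C#."""

    def lex(self, text: str) -> List[Token]:
        tokens: List[Token] = []
        i, n = 0, len(text)
        while i < n:
            ch = text[i]
            if ch.isspace():
                i += 1  # whitespace produces no token in this sketch
                continue
            start = i
            if ch.isalpha() or ch == "_":
                # Scan a whole word, then decide keyword vs. identifier.
                while i < n and (text[i].isalnum() or text[i] == "_"):
                    i += 1
                token_id = KEYWORDS.get(text[start:i], TokenId.IDENTIFIER)
            elif ch.isdigit():
                while i < n and text[i].isdigit():
                    i += 1
                token_id = TokenId.NUMBER
            else:
                # Single-character punctuation, or UNKNOWN for anything else.
                i += 1
                token_id = PUNCTUATION.get(ch, TokenId.UNKNOWN)
            tokens.append(Token(token_id, start, i - start))
        return tokens
```

Calling `MiniCSharpLanguage().lex("public class Foo { }")` returns a list of `Token` objects, which plays the role of the Tokens collection described above.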
So whereas in the SyntaxEditor 3.1 C# language definition all keyword tokens shared the key KeywordToken, in SyntaxEditor 4.0 with this specialized CSharpSyntaxLanguage each keyword has its own unique ID. This will make it easier to write semantic parsers that build an object model and then support IntelliSense features.
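As a rough illustration of why per-keyword IDs help, the Python sketch below walks a token list and collects the names that follow a given declaring keyword. Everything in it is hypothetical (`TokenId`, `Token`, `declared_names`, and the hand-built sample tokens); it only shows that a semantic pass can branch on an ID directly instead of re-comparing the text of every token that shares one keyword key.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class TokenId(Enum):
    """Hypothetical per-element IDs, one per keyword or symbol."""
    NAMESPACE = auto()
    CLASS = auto()
    IDENTIFIER = auto()
    OPEN_BRACE = auto()
    CLOSE_BRACE = auto()


@dataclass(frozen=True)
class Token:
    id: TokenId
    text: str


def declared_names(tokens: List[Token], declarator: TokenId) -> List[str]:
    """Return the identifiers that directly follow the given keyword ID."""
    names = []
    for current, following in zip(tokens, tokens[1:]):
        if current.id is declarator and following.id is TokenId.IDENTIFIER:
            names.append(following.text)
    return names


# Hand-built tokens for: namespace Demo { class Foo { } class Bar { } }
SAMPLE = [
    Token(TokenId.NAMESPACE, "namespace"), Token(TokenId.IDENTIFIER, "Demo"),
    Token(TokenId.OPEN_BRACE, "{"),
    Token(TokenId.CLASS, "class"), Token(TokenId.IDENTIFIER, "Foo"),
    Token(TokenId.OPEN_BRACE, "{"), Token(TokenId.CLOSE_BRACE, "}"),
    Token(TokenId.CLASS, "class"), Token(TokenId.IDENTIFIER, "Bar"),
    Token(TokenId.OPEN_BRACE, "{"), Token(TokenId.CLOSE_BRACE, "}"),
    Token(TokenId.CLOSE_BRACE, "}"),
]
```

With these sample tokens, `declared_names(SAMPLE, TokenId.CLASS)` picks out the class names and `declared_names(SAMPLE, TokenId.NAMESPACE)` picks out the namespace name, which is the kind of first step an object-model builder would take.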
For those of you who like the 3.1-style languages, we will still support them via an implementation of SyntaxLanguage. We aren't sure yet what to call that implementation class, so please vote for the name you prefer by replying here, or suggest another name you like.
Some names that we came up with for the 3.1-style language implementation are:
DynamicSyntaxLanguage
XmlDefinitionSyntaxLanguage
GenericSyntaxLanguage
ClassicSyntaxLanguage
Which do you like best? We like the first two best.
3.1-style languages are nice because they are easily customizable, whereas languages with programmatic lexical parsers are hardcoded and rigid, but very fast.
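The customizable side of that trade-off can be sketched as a lexer driven entirely by a rule table. This Python example is an assumption-laden illustration, not how SyntaxEditor 3.1 definitions actually work: `make_lexer`, the rule keys other than KeywordToken, and the regex patterns are all made up for the sketch. The point is only that changing the language means editing data rather than rewriting scanner code.

```python
import re
from typing import Callable, List, Tuple

Rule = Tuple[str, str]  # (token key, regular expression)


def make_lexer(rules: List[Rule]) -> Callable[[str], List[Tuple[str, str]]]:
    """Build a lexer from a rule table; earlier rules win over later ones."""
    # Keys become named groups, so they must be valid Python identifiers.
    combined = re.compile(
        "|".join(f"(?P<{key}>{pattern})" for key, pattern in rules)
    )

    def lex(text: str) -> List[Tuple[str, str]]:
        # Characters that match no rule are silently skipped by finditer.
        return [
            (match.lastgroup, match.group())
            for match in combined.finditer(text)
            if match.lastgroup != "Whitespace"
        ]

    return lex


def rules_for(keywords: List[str]) -> List[Rule]:
    """Hypothetical rule table; every keyword shares the key KeywordToken."""
    return [
        ("KeywordToken", r"\b(?:" + "|".join(keywords) + r")\b"),
        ("IdentifierToken", r"[A-Za-z_]\w*"),
        ("NumberToken", r"\d+"),
        ("Whitespace", r"\s+"),
    ]


basic_lex = make_lexer(rules_for(["class", "public", "return"]))
extended_lex = make_lexer(rules_for(["class", "public", "return", "virtual"]))
```

Here `extended_lex` treats `virtual` as a keyword purely because the rule table changed, while a hand-written scanner would need a code change for the same result. The price is that every token goes through a general-purpose pattern matcher instead of code tuned for one language.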
If you have any other questions about the changes and enhancements for languages and tokens, post them here.