How to set the grammar for the syntax editor control?

Comments (11)

Posted 16 years ago by Light

OK, I sort of found this, but how do I create a grammar file now? I looked at Simple.Grammar.xml, but I'm not sure whether it contains code or just the specs that define the grammar.

Any info on this?

Posted 16 years ago by Light

I was expecting something like what the SciTE editor has, where there are fields that define keywords, identifiers, etc. and what font to use for each.

Where can I define things like these, so I can describe keywords to be highlighted, such as for, foreach, while, do, then, etc.?

If I can highlight the code correctly that would be a very good start.

Thanks,
Light

Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA

The syntax highlighting is driven by a lexical parser, which is separate from the grammar. Grammar is used to create a semantic parser, which is something that can be added to your language to generate an AST, etc. later on.

To get started with lexical parsing, please take a look at our C# sample project's Languages\Dynamic\Lexers folder. Those XML files are our dynamic language XML definitions that are free samples of how to achieve highlighting. Some of them have code-behind files that are included in the parent folder. The XML definitions include numerous samples of how to make syntax highlighting for various languages and there is documentation on the format in the help file.

We're currently working on our next generation framework for SyntaxEditor (which is being prototyped with our WPF version) and in that we're doing a lot of work to make it very easy for new customers to get going building a language. More info should be coming this week to our blog on that, but again it will only apply to our WPF version until that new framework is complete and we can port it back to WinForms.

Actipro Software Support

Posted 16 years ago by Light

Thanks for your reply.

Ok I will use the samples in that folder to create my version.

After this, do you know how I can plug this into the editor so it knows that file is used to color code the contents?

I have seen a method called LoadLanguageFromXml. Is that the way?

Is there a way to assign it to a property so that the lexer is used for color coding everything that's in that editor? LoadLanguageFromXml seems like it's for one time only?

Also, do you know when the next version of SE will be available for WinForms?

Thanks again,
Light

Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA

Document.LoadLanguageFromXml is how you can load a language specifically for that document from an XML definition. Alternatively you could do the static DynamicSyntaxLanguage.LoadFromXml method. That will return a DynamicSyntaxLanguage.

You can reuse language instances among multiple documents too where the same language is being edited, allowing you to save on memory. So once you have a SyntaxLanguage instance for your language, you'd set all your SyntaxEditor.Document.Language properties to be that language instance.
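As a rough sketch of the two options described above (not taken from the sample project): the helper below loads a dynamic language once and assigns the same instance to several editors. The method names come from this thread. The second argument (an encryption key, with 0 meaning an unencrypted definition), the namespaces, and the LanguageLoader/ApplySharedLanguage names are assumptions for illustration, so please check them against the help file for your version.

```csharp
using ActiproSoftware.SyntaxEditor;
using ActiproSoftware.SyntaxEditor.Addons.Dynamic;

// Hypothetical helper, for illustration only.
public static class LanguageLoader {

    // Loads the lexer definition once and shares the resulting language
    // instance among all the given editors.
    public static void ApplySharedLanguage(string xmlPath, params SyntaxEditor[] editors) {
        // Assumption: the int argument is an encryption key; 0 = not encrypted.
        DynamicSyntaxLanguage language = DynamicSyntaxLanguage.LoadFromXml(xmlPath, 0);

        foreach (SyntaxEditor editor in editors) {
            // The language stays assigned to the document until it is replaced,
            // so highlighting keeps working as the text is edited.
            editor.Document.Language = language;
        }
    }

    // Alternative for a single document: load the language directly into it.
    public static void ApplyToSingleDocument(string xmlPath, SyntaxEditor editor) {
        editor.Document.LoadLanguageFromXml(xmlPath, 0);
    }
}
```

Either way the language is set once (for example when the form loads) and remains in effect for that document; it does not need to be reloaded on each edit.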

We are doing periodic maintenance releases for WinForms, but cannot update it to the next gen framework until the next gen framework has all the feature areas implemented that are found in WinForms right now. That will be months away yet.

Actipro Software Support

Posted 16 years ago by Light

Thanks again for replying.

I have another question regarding the semantic parser. When you provide an XML file for this, does SyntaxEditor automatically parse it and provide an AST?

I am just wondering if it will return things in a way where I can inspect their members. As in:

object Geometry
(
VertexCount,
FaceCount,

fn Render =
(
...
)
)

will become a node where I can use .NET Reflection on it?

I am still on the highlighting but this would take me further when I am done.

Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA

The Grammar Designer can be used to edit the files like Simple.Grammar.xml that you found. I believe Simple.Grammar.xml will generate specific AST node classes based on the ones declared in that grammar file. Note that your non-terminal production code must also define where the AST nodes are instantiated and how they are added to parent nodes.

Actipro Software Support

Posted 16 years ago by Light

Ok I set up the lexer file, and it works great.

Now I'm working on the grammar file, so I'm looking at Simple.Grammar.xml.

Does the parser parse the right info based on the identifiers you use? As in:

            <AstNode Name="Identifier" Description="An identifier.">
                <AstNodeProperty PropertyType="Simple" Name="Text" Type="System.String" Description="The text of the qualified identifier." />
                <AstNodeDeclarations Type="Constructor"><![CDATA[    

                    /// <summary>
                    /// Initializes a new instance of the <c>Identifier</c> class. 
                    /// </summary>
                    /// <param name="text">The text of the qualified identifier.</param>
                    /// <param name="textRange">The <see cref="TextRange"/> of the AST node.</param>
                    public Identifier(string text, TextRange textRange) : this(textRange) {
                        // Initialize parameters
                        this.text = text;
                    }

                ]]></AstNodeDeclarations>

Here it just looks for a similar format to parse Identifiers? It doesn't look the same as the lexer xml where you had rules to use regex or exact match, etc. Surely this requires more functionality, just wondering where I can get more info on how to set up mine?

Also when I have a grammar file, how should I load it? In your grammar C# example (2nd app), you have 2 editors. But for my case, there is going to be a single editor.

I don't think the idea is to replace the editor's content with the parsed data, but set a property so there is continuous real-time parsing, right?

I used this code from your example, for testing:

            RecursiveDescentSemanticParserGenerator parserGenerator = new RecursiveDescentSemanticParserGenerator ( );

            string file2 = @"C:\Program Files (x86)\Actipro Software\WindowsForms\SyntaxEditor\v4.0.0282\TestApplication-CSharp.VS2008\Languages\SimpleAddon\ActiproSoftware.Simple.Grammar.xml";

            RecursiveDescentSemanticParserGeneratorOutput output = parserGenerator.Generate ( file2 );

            this.editor1.Document.Text = output.ToString ( );

This crashes.

Even with an xml file for another language, it shouldn't crash, right?

[Modified at 10/06/2009 08:38 PM]

Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA

Keep in mind that lexical parsing is required for semantic parsing. The lexical parser is what you can use a dynamic language XML file to load, based on info previously discussed in this thread. The lexical parser is able to scan text using the patterns you define and it tokenizes text.
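To illustrate what pattern-driven tokenizing means in general, here is a minimal, self-contained sketch that uses plain .NET regular expressions rather than SyntaxEditor itself. The TinyLexer class, the token kinds, and the keyword list are hypothetical; a dynamic language XML definition expresses the same idea declaratively.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for a lexical parser; not part of SyntaxEditor.
public static class TinyLexer {

    // One alternative per token kind; earlier alternatives win.
    private static readonly Regex TokenPattern = new Regex(
        @"(?<Keyword>\b(?:for|foreach|while|do|then)\b)" +
        @"|(?<Identifier>[A-Za-z_]\w*)" +
        @"|(?<Number>\d+)" +
        @"|(?<Whitespace>\s+)" +
        @"|(?<Other>.)");

    private static readonly string[] Kinds =
        { "Keyword", "Identifier", "Number", "Whitespace", "Other" };

    // Scans the text and returns (kind, text) pairs, skipping whitespace.
    public static List<KeyValuePair<string, string>> Tokenize(string text) {
        var tokens = new List<KeyValuePair<string, string>>();
        foreach (Match match in TokenPattern.Matches(text)) {
            foreach (string kind in Kinds) {
                if (match.Groups[kind].Success) {
                    if (kind != "Whitespace") {
                        tokens.Add(new KeyValuePair<string, string>(kind, match.Value));
                    }
                    break;
                }
            }
        }
        return tokens;
    }

    public static void Main() {
        foreach (var token in Tokenize("while count do step")) {
            Console.WriteLine(token.Key + ": " + token.Value);
        }
    }
}
```

A highlighter only needs this token stream: each token kind maps to a style. A semantic parser consumes the same tokens to build an AST.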

Now if you wish to layer semantic parsing on top of that, that is where you use the grammar file. You build your grammar file, then open it in our Grammar Designer, click Run Parser Generator, and then save the generated C#/VB code to disk. Then add it to your project. The generated code will consist of a semantic parser class, token/lexical state ID classes, and possibly AST classes.

Note that the grammar XML is ONLY used to feed into the Grammar Designer. The Grammar Designer is a code generator and you include the output from that in your project.

The other class you need is something like the SimpleRecursiveDescentLexicalParser class that provides a bridge between your language's lexical parser and the generated semantic parser. Then finally if you scan through SimpleSyntaxLanguage for the text "Semantic" you'll see the other couple of code snippets you need to bring it all together in the language.

Actipro Software Support

Posted 16 years ago by Light

Thanks for replying. I am adding a few more things to the lexer XML, but I have a few questions:

1. How to color strings for each side separately that are in this form:

argument:value

Is this better handled in the semantic parser and not the lexer xml?

2. Another one that has the same issue is, for event handlers that follow this format:

on ControlName EventName do ...

on and do are already colored because they are language keywords. But I don't want to use straight string matching for the EventName, as it can be a variable or something else in code. Can you please give me an example of how this would be achieved?

Also is this better handled in the semantic parser and not the lexer xml? From my readings of the docs, it seems the semantic parser might be better?

3. For the lexer XML, what's the maximum number of words before the performance of the highlighter takes a significant hit? I want to add some more tokens that might be useful and that are specific to the application the language belongs to. It could be useful to highlight everything of a given type, for example all types that share a common parent class. So if it was Photoshop, this would be: GaussianBlur, Sharpen, Soften, Outline, Liquify, ... that are of type Filter.

Am I better off adding these to the lexer XML, or should I handle it differently? They will not necessarily be highlighted by default. Only if the user turns on the appropriate settings will the items in my predefined list be highlighted.

Also for additional information, the number of these words would be a little under 1000 words in case that matters for either scenario.

Thanks again,
Light

Posted 16 years ago by Actipro Software Support - Cleveland, OH, USA

Light,

1) Well you could do a regular identifier-like pattern scan and then also add a look-ahead pattern that looks for a colon. That way you could color your argument a different color than the value. Similarly the value could have a look-behind pattern for colon.

2) Contextually-colored keywords are pretty tricky to do. The easiest way is via creative use of look-behind and look-ahead patterns so that you keep it all in the lexer definition. If you are using a dynamic language, each DynamicToken does have a CustomHighlightingStyle property you can set. However, that can get tricky since it would require another phase of updating tokens. Thus it's better to stick with the lexer for these if at all possible.

3) You are really best off experimenting to find what you feel is acceptable performance. Dynamic languages do try their best to optimize themselves for lots of patterns, but you may also find that you achieve better performance with a hand-written programmatic lexer like our Simple language sample. If you are using a dynamic language, you can add/remove patterns on the fly by setting language.IsUpdating = true, modifying the language, and then setting language.IsUpdating = false.
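As a sketch of the look-ahead/look-behind idea from points 1 and 2, here it is expressed with plain .NET regular expressions. This is not the dynamic language XML syntax, which has its own pattern format described in the help file; the LookAroundDemo class and the exact patterns are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; shows the matching idea, not SyntaxEditor API.
public static class LookAroundDemo {

    // Identifier that is immediately followed by a colon (the "argument" side).
    public static readonly Regex Argument = new Regex(@"[A-Za-z_]\w*(?=:)");

    // Identifier that is immediately preceded by a colon (the "value" side).
    public static readonly Regex Value = new Regex(@"(?<=:)[A-Za-z_]\w*");

    // Word preceded by "on <ControlName> " and followed by " do",
    // so the event name is matched by context, not by a fixed word list.
    public static readonly Regex EventName =
        new Regex(@"(?<=\bon\s+\w+\s+)\w+(?=\s+do\b)");

    public static void Main() {
        Console.WriteLine(Argument.Match("argument:value").Value);   // argument
        Console.WriteLine(Value.Match("argument:value").Value);      // value
        Console.WriteLine(EventName.Match("on ControlName EventName do ...").Value); // EventName
    }
}
```

Because the colon and the surrounding keywords are only checked, not consumed, each piece can be its own token with its own highlighting style.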

Actipro Software Support

The latest build of this product (v25.1.0) was released 1 month ago, which was after the last post in this thread.
