dynamic language problem

Posted 17 years ago by ori

Version: 4.0.0258

Hello,

I'm using a dynamic language (a combination of two languages with start and end delimiters). I had a problem in my application, so I tried the same thing in the TestApplication and found the same problem.

I'm using the Python language definition as a sub language in the dynamic language combination. There is a problem with the syntax highlighting of single-line comments.

To see the problem change the following function in the test application:


        private void CreateDirectiveXmlToCSharpLanguage() {
            // Load the two languages
            language = DynamicSyntaxLanguage.LoadFromXml(Program.DynamicLexersPath + "ActiproSoftware.XML.xml", 0);
            DynamicSyntaxLanguage cSharpLanguage = DynamicSyntaxLanguage.LoadFromXml(Program.DynamicLexersPath + "ActiproSoftware.CSharp.xml", 0);

Replace the string "CSharp" with "Python" in the second line, and write a comment. Example: <%# this is a python comment %>

The end delimiter will not be found by the syntax highlighter.

Thanks,
Ori

Comments (14)

Posted 17 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Ori,

Actually that is correct and is working by design. While matching a token pattern, the lexical parser doesn't stop to look for other patterns starting. To do so would really slow things down. Therefore the end %> is not found, because the single-line comment token in Python goes from the # all the way through the line feed. To enable %> to be picked up, you need to break the comment token up at % characters as well. This will allow the lexical parser to look for a %> before it continues matching the comment at that point. So probably change the Python comment's main pattern group to be something like this:
<RegexPatternGroup TokenKey="CommentDefaultToken" PatternValue="[^\n%]" />

Actipro Software Support
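
The behavior described in the reply above can be illustrated with a small toy scanner. This is only a sketch in Python, not the actual SyntaxEditor lexical parser: the function name `tokenize_comment`, the `EndDelimiter` token name, and the single-character fallback are assumptions made for illustration. It checks for the end delimiter only at the start of each token, so a comment pattern that runs to the end of the line swallows the %>, while a pattern that stops at % gives the scanner a chance to see it.

```python
import re

END_DELIMITER = "%>"


def tokenize_comment(text, comment_pattern):
    """Toy scanner: the end delimiter is only tested at token starts."""
    pattern = re.compile(comment_pattern)
    tokens = []
    pos = 0
    while pos < len(text):
        # The delimiter check happens here, never in the middle of a token.
        if text.startswith(END_DELIMITER, pos):
            tokens.append(("EndDelimiter", END_DELIMITER))
            break
        match = pattern.match(text, pos)
        if match is None or match.end() == pos:
            # Assumed fallback: a lone character (e.g. a '%' not followed
            # by '>') becomes its own comment token.
            tokens.append(("CommentDefaultToken", text[pos]))
            pos += 1
            continue
        tokens.append(("CommentDefaultToken", match.group()))
        pos = match.end()
    return tokens


line = "# this is a python comment %>"

# Pattern that runs to the end of the line: one token, delimiter swallowed.
greedy_tokens = tokenize_comment(line, r"[^\n]+")

# Pattern that also stops at '%': the delimiter is seen at the next token start.
split_tokens = tokenize_comment(line, r"[^\n%]+")

print(greedy_tokens)
print(split_tokens)
```

In this sketch a % that is not followed by > is handled by the fallback branch. In a real language definition, a separate pattern for a lone % would presumably be needed so the comment can continue past it.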

Posted 17 years ago by ori

So why does the same code, in the TestApplication, work for C# single-line comments?
I really don't want to add elements to the Python language definition that are not in the language.

Posted 17 years ago by Actipro Software Support - Cleveland, OH, USA

The C# dynamic language's single line comment state breaks the comment up into word tokens, etc. Thus it has breaks such as what I was describing.

As a hint, you can use the SDI Editor sample to see where tokens start. When you move the caret to the start of a token, an asterisk appears next to its name. This is very useful in debugging how the lexical parser is parsing tokens.

Actipro Software Support

Posted 17 years ago by ori

OK, I'm not sure what I have done, but it is working :) I looked at the C# language definition file and replaced the following lines in the Python definition:


<PatternGroups><RegexPatternGroup TokenKey="CommentDefaultToken" PatternValue="{NonLineTerminatorMacro}+" /></PatternGroups>

with the following lines:


<PatternGroups><RegexPatternGroup TokenKey="CommentDelimiterToken" Style="CommentDelimiterStyle" PatternValue="#" /></PatternGroups>

Is it good? Should it be in the original version?

Thanks,
Ori

Posted 17 years ago by Actipro Software Support - Cleveland, OH, USA

No, that isn't the optimal way, because then every character will be its own token. It is better to group words as tokens like the C# definition does. This way longer character runs are tokens.

Actipro Software Support
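
The difference in token granularity can be seen with a rough comparison. This is a Python sketch using plain regular expressions, not the patterns from the actual C# or Python definition files; the two patterns below are assumptions chosen only to contrast "one token per character" with "one token per word run".

```python
import re

comment = "# this is a python comment "

# One token per character, as happens when no pattern matches longer runs.
per_char_tokens = re.findall(r"[^\n]", comment)

# Word runs and whitespace runs grouped, everything else one character each.
word_run_tokens = re.findall(r"\w+|\s+|[^\w\s]", comment)

print(len(per_char_tokens), per_char_tokens)
print(len(word_run_tokens), word_run_tokens)
```

Both tokenizations still break at every non-word character, so a delimiter such as %> would get a chance to match at a token start in either case. The word-run version just produces far fewer tokens for the same text.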

Posted 17 years ago by ori

If possible, please tell me what the optimal definition is.
Thanks.

Posted 17 years ago by Actipro Software Support - Cleveland, OH, USA

I believe it was already included in our original reply.

Actipro Software Support

Posted 17 years ago by ori

OK, sorry. I wasn't sure about it.

Posted 17 years ago by ori

Hello,

I'm looking for a different solution. I have the same problem with strings.
In this example:
"this is a <% some c# code %> string"
the syntax highlighter cannot find the start and end delimiters (<% and %>)

I want to declare the language, the sub language, and the delimiters in a way that the different lexical states will not interfere with each other.

Thanks,
Ori

Posted 17 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Ori,

Please refer to our original replies in this thread as it describes how the parser works and how you need to break up long runs into smaller tokens for the delimiters to get picked up. You need to do the same thing in this case to your string state.

Actipro Software Support

Posted 17 years ago by ori

You don't understand me - I want to use the language definition files as they are. That's why I'm asking for a different solution (if there is one).

Posted 17 years ago by Actipro Software Support - Cleveland, OH, USA

Sorry but as described above, once a pattern is starting to be matched, it doesn't break for anything until it's done. So the only way to do this is to do what is mentioned in the first reply and break the pattern group up on the characters that may cause a language transition.

Actipro Software Support

Posted 17 years ago by ori

I don't really want to break the pattern. I want to add a child element - if for example I work with python and html and the string is

<A href="<%some code in python%>"/>

I would like the starting '"' to be matched with the ending '"'.

Posted 17 years ago by Actipro Software Support - Cleveland, OH, USA

But you have to break the pattern in order for language transition to occur. If you use a state for your attribute string then it will remain in the state at least.

Actipro Software Support
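
The idea of using a state for the attribute string, with text patterns that break at the transition characters, can be sketched as a toy state machine. This is a Python illustration only, not the SyntaxEditor engine or its XML definition format: the state names (`Markup`, `String`, `Script`), the token names, and the patterns in `TEXT_PATTERNS` are all assumptions. The string state stops its text run at <, so <% can start the sub language, and after %> the scanner returns to the string state, where the closing quote pairs with the opening one.

```python
import re

# Hypothetical per-state text patterns. Each run stops at the characters
# that may start a transition, so delimiters are seen at a token start.
TEXT_PATTERNS = {
    "Markup": re.compile(r'[^"]+'),
    "String": re.compile(r'[^"<]+|<'),
    "Script": re.compile(r"[^%]+|%"),
}


def scan(text):
    """Toy scanner with a state stack; returns (token kind, token text) pairs."""
    tokens = []
    states = ["Markup"]
    pos = 0
    while pos < len(text):
        state = states[-1]
        if state == "String" and text.startswith("<%", pos):
            tokens.append(("StartDelimiter", "<%"))
            states.append("Script")  # enter the sub language
            pos += 2
        elif state == "Script" and text.startswith("%>", pos):
            tokens.append(("EndDelimiter", "%>"))
            states.pop()  # back to the string state
            pos += 2
        elif state != "Script" and text[pos] == '"':
            tokens.append(("Quote", '"'))
            if state == "Markup":
                states.append("String")  # opening quote
            else:
                states.pop()  # closing quote
            pos += 1
        else:
            match = TEXT_PATTERNS[state].match(text, pos)
            tokens.append((state, match.group()))
            pos = match.end()
    return tokens


attribute_tokens = scan('<A href="<%some code in python%>"/>')
for token in attribute_tokens:
    print(token)
```

The string pattern is still broken at <, which is the point made in the replies: without that break the string run would consume <% before the delimiter could be tested.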

The latest build of this product (v24.1.0) was released 2 months ago, which was after the last post in this thread.
