Merged Language Bracket problem

Posted 18 years ago by Dave Sparks

Version: 4.0.0244

First, great control! I am diving into this and amazed at how easy it is!

Now onto my question:

I am working on two language definitions for some special scripting languages.

One is multi-line code with a ; delimiter, similar to C.
The other is single-line (EOL delimiter) and has "code blocks" between curly braces, which are actually the first language condensed into a single line.

Example.

_val = {_a = 1; _b = 2; while {_a < _b} do {_a = _a + _x;};} foreach [1,2,3]

Note that the foreach command is part of the single-line language and the {} block is literally a code block of the first, C-like language. The { } could come anywhere on the line (beginning, middle, or end). Inside it there could be many nested { } pairs, which belong to the C-like language only and not to the single-line language.

The problem is that a closing brace written inside a code block signals an EndToken prematurely. In the example above, that happens after the second _b. Somehow, I need to make sure the start and end braces balance within the single-line language.

How do I do that when everything inside the block is defined in another XML file?
Which is the best course of action to take?

My thoughts so far:
Use Regular Expression instead of Explicit Expression (of "}") for EndToken??... but not sure what...

Thanks for any advice

Comments (4)

Posted 18 years ago by Actipro Software Support - Cleveland, OH, USA

Hmm, unfortunately I think this is one of those rare scenarios where dynamic languages won't work for you. The reason is that dynamic languages don't "balance" block delimiters, and it sounds like you need that, since your } character means one thing or another depending on whether the {} blocks are balanced.

You would probably have to create a programmatic lexer for your languages and handle this sort of thing in your code.
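
To illustrate what "balancing" means here, a minimal sketch in Python. This is not SyntaxEditor's lexer API; the function names naive_block and balanced_block are made up for the example. It contrasts stopping at the first } with counting nesting depth, using the sample line from the question:

```python
# Illustrative only: plain Python, not the SyntaxEditor API.
# naive_block and balanced_block are hypothetical helper names.

def naive_block(line):
    """Stop at the first '}' after the first '{' (no balancing)."""
    start = line.index("{")
    end = line.index("}", start)
    return line[start:end + 1]

def balanced_block(line):
    """Stop at the '}' that brings the nesting depth back to zero."""
    start = line.index("{")
    depth = 0
    for i in range(start, len(line)):
        ch = line[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return line[start:i + 1]
    raise ValueError("unbalanced braces")

LINE = "_val = {_a = 1; _b = 2; while {_a < _b} do {_a = _a + _x;};} foreach [1,2,3]"

print(naive_block(LINE))     # ends right after the second _b
print(balanced_block(LINE))  # ends just before " foreach"
```

The depth counter is the piece a programmatic lexer would have to carry around that a purely pattern-based definition does not.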

But some more info on mergeable languages... this is the order in which they match patterns:
1) Parent scopes (outside of the current language) to see if one ends the current state.
2) Child state start scopes to see if one starts a child state.
3) Pattern groups within the current state.
4) Parent scopes (inside of the current language) to see if one ends the current state.

Therefore if you keep both your "languages" in the same language file you can maybe try and exploit the fact that #4 happens last. In that case you would have to match the } in your child language when it was appropriate and if not, leave it unmatched so that #4 would pick it up as the parent "language" exit scope.
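
As a rough picture of why the order matters, here is a toy Python model of "first step that matches wins". It is a simplification for illustration, not how SyntaxEditor is implemented: steps 1 and 2 are stubbed out to never match, and first_match and build_steps are hypothetical names.

```python
# Toy model of the four-step match order; not the real engine.

def first_match(ch, steps):
    """Return the name of the first step whose predicate accepts ch."""
    for name, accepts in steps:
        if accepts(ch):
            return name
    return None

def build_steps(child_consumes_close_brace):
    """Build the ordered steps; the flag says whether the child
    language has a pattern group that matches '}'."""
    return [
        ("1: parent scope end (outside current language)", lambda ch: False),
        ("2: child state start scope", lambda ch: False),
        ("3: pattern group in current state",
         lambda ch: child_consumes_close_brace and ch == "}"),
        ("4: parent scope end (inside current language)", lambda ch: ch == "}"),
    ]

# If the child language always matches '}', step 4 never sees it.
print(first_match("}", build_steps(True)))
# If the child language leaves '}' unmatched, step 4 picks it up as the exit.
print(first_match("}", build_steps(False)))
```

In other words, a } that the child language's own pattern groups consume can never reach the exit scope in step 4.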

Actipro Software Support

Posted 18 years ago by Dave Sparks

Oh man... I really wasn't hoping to hear that! After all the work I spent putting into the dynamic language!

I like the idea of putting them both into one file. If I understand you correctly, it might work.

Just a thought... would StartBracket and EndBracket do anything here? Could they do anything in the future?

---EDIT---

OK, I just merged the languages into one file... or I should say I copied the relevant portions of the first one into the second one. Now, I have the problem where a '}' will not close the sub-language scope. I assume it is because it is a "CloseCurlyBracePatternGroup" in the sub-language and that is getting recognized first.

What is the best way to handle this?

[Modified at 03/02/2007 11:11 AM]

Posted 18 years ago by Dave Sparks

Solved! Making the sub-language state a child of itself makes it recursive.

The final '}' exits out and returns to the original language default state.


    <!-- Code -->
    <State Key="DefaultState">
...
      <ChildStates>
        <ChildState Key="SubBlockState"/>
...
      </ChildStates>
    </State>

    <State Key="SubBlockState">
      <!--Scopes-->
      <Scopes>
        <Scope>
          <ExplicitPatternGroup Type="StartScope" TokenKey="BlockStartToken" Style="SubDefaultStyle" PatternValue="{" />
          <ExplicitPatternGroup Type="EndScope" TokenKey="BlockEndToken" Style="SubDefaultStyle" PatternValue="}"/>
        </Scope>
      </Scopes>
...
      <ChildStates>
        <ChildState Key="SubBlockState"/>
...
      </ChildStates>
    </State>
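
For anyone trying to picture why the self-referencing child state balances the braces, here is a small Python sketch of the idea as a stack of states. It assumes each { enters a new SubBlockState and each } leaves the innermost one, which is how I understand the definition above to behave; walk_states is just a made-up helper, not anything from the control.

```python
# Sketch of the recursive-state idea as a stack; an assumption about
# the behaviour, not SyntaxEditor code. walk_states is hypothetical.

def walk_states(line):
    """Return the state after the last character, plus the nesting
    depth at which each brace was seen."""
    stack = ["DefaultState"]
    depths = []
    for ch in line:
        if ch == "{":
            # StartScope: enter SubBlockState (possibly from inside itself)
            stack.append("SubBlockState")
            depths.append(len(stack) - 1)
        elif ch == "}" and len(stack) > 1:
            # EndScope: leave the innermost SubBlockState
            depths.append(len(stack) - 1)
            stack.pop()
    return stack[-1], depths

LINE = "_val = {_a = 1; _b = 2; while {_a < _b} do {_a = _a + _x;};} foreach [1,2,3]"

final_state, depths = walk_states(LINE)
print(final_state, depths)  # DefaultState [1, 2, 2, 2, 2, 1]
```

Only the last } is seen at depth 1, so only that one drops back to DefaultState.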

Thanks for the ideas. Your description of what goes on in terms of priority helped me think of this. I hope there aren't any side effects to doing it this way.

Also, you inadvertently helped me in another way. Had I used the reference to the original XML file as I initially intended, and not copied it directly into the second language, my comment state would have broken the whole thing. I forgot that the "//" comments are not supported in the single-line instance of the language inside the { } block. Therefore, by maintaining two separate files, I can tweak the second one to account for the differences by removing that comment state (and anything else I want to tweak).

Long-and-short, I now have the ability to open two types of files and everything is working great so far!

Posted 18 years ago by Actipro Software Support - Cleveland, OH, USA

Ahhh, I see... the recursion is a good idea and an interesting way of doing it. Glad it worked out!

Actipro Software Support

