language merging problem

SyntaxEditor for Windows Forms Forum

Posted 15 years ago by David T. Boutilier
Version: 4.0.0280
I have a working dynamic language definition for an existing lisp-like language that is in use here. The language allows the definition of rules, one per line, in the source file. Certain rules can be followed by additional, multi-line active content, which is distinguishable only by its being neither a comment nor a rule definition. While the end of this content is clearly marked by a token, the start is not, so the only token available to mark the start is the end-of-line mark.

I have been trying to extract the portion of the language that deals with the structure of the additional content, using a direct lexical state transition. When I parse a file that exercises the state transition, I get unexpected recursion in the PerformSemanticParse function, which seems to be triggered by the PatternValue matching the newline. This quickly faults with a stack overflow.

I have tried to keep the newline match from resulting in recognition of the StartScope token, but I have not found anything that works.

This all works fine until I try to extract the language for the additional content into a separate file.

Any suggestions?

Comments (3)

Posted 15 years ago by Actipro Software Support - Cleveland, OH, USA
Hi David,

It's tough to say without seeing it happen in person. If you'd like, maybe put together a simple sample project that shows it and email it over.

Trim all the parts of your language definition down so that only the smallest amount of language definition needed to show the issue is included.

Thanks.


Actipro Software Support

Posted 15 years ago by David T. Boutilier
I think that would be a lot of work, and I am out of time for this exploratory project.

I have learned some more about the problem, which I will share. I have been able to reproduce the problem with a new, distinct token in place of newline. Since this rules out special behavior on the part of newline, it makes it increasingly likely that the problem lies with either my dynamic lexer definition for the embedded language, or with the language.cs file that implements PerformSemanticParse(MergableLexicalParserManager manager).

I have also learned that the recursion occurs in AdvanceToNext(), at the point where the start token whose match triggers the transition to the embedded language would become the next token. Since that routine is called within PerformSemanticParse, the pathological nature of the recursion is obvious.

If the opportunity arises to resume this task, I will have to start with a simpler approach, defining some minimalist pair of languages and trying to build up a working installation that can be enhanced for re-integration into the larger project.

It would be very helpful if a better, more complete tutorial were available to explain the relationships between the mergable lexers and parsers.

Posted 15 years ago by Actipro Software Support - Cleveland, OH, USA
David,

Not sure if it will help, but AdvanceToNext indirectly calls this code, which gets the next token:

public IToken GetNextToken() {
    reader.PopAll();

    currentToken = lookAheadToken;
    currentTokenLineIndex = reader.LineIndex;
    lookAheadToken = this.GetNextTokenCore();

    // If starting a direct lexical state transition, skip over the token (like <% delimiter)
    if ((lookAheadToken.LexicalState != null) && (lookAheadToken.LexicalState.LexicalStateTransitionLexicalState != null))
        lookAheadToken = this.GetNextTokenCore();

    // Skip over child languages if the look-ahead is the start of one...
    if (lookAheadToken.HasFlag(LexicalParseFlags.LanguageStart)) {
        lookAheadToken = this.ProcessChildLanguage();
        
        // If ending a direct lexical state transition, skip over the token (like %> delimiter)
        if ((lookAheadToken.LexicalState != null) && (lookAheadToken.LexicalState.LexicalStateTransitionLexicalState != null))
            lookAheadToken = this.GetNextTokenCore();
    }

    return currentToken;
}

That code is in RecursiveDescentLexicalParser.


Actipro Software Support

