nested multiline comments in SQL?

SyntaxEditor for Windows Forms Forum

Posted 18 years ago by Karl Grambow
Version: 4.0.0245
Hi,

I'm using SyntaxEditor with the supplied SQL language and I'm trying to allow for nested multiline comments. I'll demonstrate with an example.

/* --opening tag number 1

this is a comment

/* --opening tag number 2


this is a nested comment


*/ --problem is that this Closing tag actually closes the very first opening tag

*/ -- as a result this closing tag doesn't have a corresponding opening tag.


Effectively, what I want is to have everything between opening tag number 1 and its corresponding closing tag commented out, regardless of how many nested opening/closing tags exist in between.

Is this possible?

Thanks,

Karl

Comments (6)

Posted 18 years ago by Actipro Software Support - Cleveland, OH, USA
You might be able to do that if you give the comment state a child state of itself so that recursion occurs.


Actipro Software Support

Posted 18 years ago by Kelly Leahy - Software Architect, Milliman
Karl,

You really sure you want to do that? Reason I ask is that I've never seen another language that supports nested comments - I don't think most SQL implementations do either.

Are you using a dynamic language? If not, it's easy... If you are, I can't help you (but somebody else will).

To handle this in a non-dynamic language, you keep a stack or a depth "counter" in the multiline comment lexer routine.

For instance, let's say your lexer looks like this (from the SimpleLanguage example):

public MatchType GetNextTokenLexicalParseData(ITextBufferReader reader, ILexicalState lexicalState, ref ITokenLexicalParseData lexicalParseData) {
    // Initialize
    int tokenID = SimpleTokenID.Invalid;

    // Get the next character
    char ch = reader.Read();

    // If the character is a letter or an underscore...
    if ((Char.IsLetter(ch) || (ch == '_'))) {
        // Parse the identifier
        tokenID = this.ParseIdentifier(reader, ch);
    }
    else if ((ch != '\n') && (Char.IsWhiteSpace(ch))) {
        while ((reader.Peek() != '\n') && (Char.IsWhiteSpace(reader.Peek()))) 
            reader.Read();
        tokenID = SimpleTokenID.Whitespace;
    }
    else {
        tokenID = SimpleTokenID.Invalid;
        switch (ch) {
            case ',':
                tokenID = SimpleTokenID.Comma;
                break;
            case '(':
                tokenID = SimpleTokenID.OpenParenthesis;
                break;
            case ')':
                tokenID = SimpleTokenID.CloseParenthesis;
                break;
            case ';':
                tokenID = SimpleTokenID.SemiColon;
                break;
            case '\n':
                // Line terminator
                tokenID = SimpleTokenID.LineTerminator;
                break;
            case '{':
                tokenID = SimpleTokenID.OpenCurlyBrace;
                break;
            case '}':
                tokenID = SimpleTokenID.CloseCurlyBrace;
                break;
            case '/':                        
                tokenID = SimpleTokenID.Division;
                switch (reader.Peek()) {
                    case '/':
                        // Parse a single-line comment
                        tokenID = this.ParseSingleLineComment(reader);
                        break;
                    case '*':
                        // Parse a multi-line comment
                        tokenID = this.ParseMultiLineComment(reader);
                        break;
                }
                break;
            case '=':
                if (reader.Peek() == '=') {
                    reader.Read();
                    tokenID = SimpleTokenID.Equality;
                }
                else
                    tokenID = SimpleTokenID.Assignment;
                break;
            case '!':
                if (reader.Peek() == '=') {
                    reader.Read();
                    tokenID = SimpleTokenID.Inequality;
                }
                break;
            case '+':
                tokenID = SimpleTokenID.Addition;
                break;
            case '-':
                tokenID = SimpleTokenID.Subtraction;
                break;
            case '*':
                tokenID = SimpleTokenID.Multiplication;
                break;
            default:
                if ((ch >= '0') && (ch <= '9')) {
                    // Parse the number
                    tokenID = this.ParseNumber(reader, ch);
                }
                break;
        }
    }

    if (tokenID != SimpleTokenID.Invalid) {
        lexicalParseData = new LexicalStateAndIDTokenLexicalParseData(lexicalState, (byte)tokenID);
        return MatchType.ExactMatch;
    }
    else {
        reader.ReadReverse();
        return MatchType.NoMatch;
    }
}
with the "this.ParseMultiLineComment(ITextBufferReader reader)" implemented as

protected virtual int ParseMultiLineComment(ITextBufferReader reader) {
    reader.Read();
    while (reader.Offset < reader.Length) {
        if (reader.Peek() == '*') {
            if (reader.Offset + 1 < reader.Length) {
                if (reader.Peek(2) == '/') {
                    reader.Read();
                    reader.Read();
                    break;
                }
            }
            else {
                reader.Read();
                break;
            }
        }
        reader.Read();
    }
    return SimpleTokenID.MultiLineComment;
}
You can change your lexer to handle it by making ParseMultiLineComment a bit smarter...

protected virtual int ParseMultiLineComment(ITextBufferReader reader) {
    // keep track of depth...
    int depth = 1;

    // consume the opening *
    reader.Read();
    while (!reader.IsAtEnd)
    {
        char ch = reader.Peek();
        if (ch == '/')
        {
            // always consume the char (we need progress in any case)
            reader.Read();
            // don't read past EOF (assume they haven't finished the comment yet)
            if (reader.IsAtEnd)
                return SimpleTokenID.MultiLineComment;
            // look for another nested comment
            if (reader.Peek() == '*')
            {
                // consume the *
                reader.Read();
                // we're one deeper now.
                depth++;
            }
        }
        else if (ch == '*')
        {
            // always consume the char (we need progress in any case)
            reader.Read();
            // don't read past EOF (assume they haven't finished the comment yet)
            if (reader.IsAtEnd)
                return SimpleTokenID.MultiLineComment;
            // look for a close comment
            if (reader.Peek() == '/')
            {
                // consume the '/'
                reader.Read();
                // we're one shallower now.
                depth--;
                // if we are back to zero, we've read the entire multiline nested comment.
                if (depth == 0)
                    return SimpleTokenID.MultiLineComment;
            }
        }
        else
            reader.Read();
    }
    return SimpleTokenID.MultiLineComment;
}
This code is untested, but I'm pretty sure it'll work properly... It may need a few simple changes though - feel free to ask if you run into trouble with it.
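
For anyone who wants to try the depth-counter idea outside SyntaxEditor, here is a minimal standalone sketch. It works on a plain string instead of an ITextBufferReader, and the names NestedCommentSketch and FindCommentEnd are made up for this illustration; they are not part of the SyntaxEditor API. It assumes the text at the given start offset begins with "/*".

```csharp
using System;

public static class NestedCommentSketch {
    // Hypothetical helper (not an Actipro API): returns the offset just past the
    // "*/" that closes the comment opened at 'start', or text.Length if the
    // comment is never closed. Assumes text[start] and text[start + 1] are "/*".
    public static int FindCommentEnd(string text, int start) {
        int depth = 1;      // we are inside the outermost comment
        int i = start + 2;  // skip the opening "/*"
        while (i < text.Length) {
            bool hasNext = i + 1 < text.Length;
            if (hasNext && text[i] == '/' && text[i + 1] == '*') {
                // nested opening delimiter: one level deeper
                depth++;
                i += 2;
            }
            else if (hasNext && text[i] == '*' && text[i + 1] == '/') {
                // closing delimiter: one level shallower
                depth--;
                i += 2;
                if (depth == 0)
                    return i;
            }
            else {
                i++;
            }
        }
        // unterminated comment: treat the rest of the text as comment
        return text.Length;
    }

    public static void Main() {
        string sql = "/* outer /* inner */ still outer */ SELECT 1";
        int end = FindCommentEnd(sql, 0);
        Console.WriteLine("comment: " + sql.Substring(0, end));
        Console.WriteLine("code:    " + sql.Substring(end));
    }
}
```

With this input the first inner "*/" only brings the depth back to 1, so the comment runs through the second "*/" and only " SELECT 1" is left as code.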

[Modified at 03/09/2007 03:36 PM]

Kelly Leahy Software Architect Milliman, USA

Posted 18 years ago by Kelly Leahy - Software Architect, Milliman
Ok... I just tested that code in the demo SDI editor application.

It works.

Kelly Leahy Software Architect Milliman, USA

Posted 18 years ago by Karl Grambow
Hi Kelly,

Thank you so much for the effort spent in providing the example and in replying to my post. Unfortunately I'm not using the lexer. Effectively I'm just loading the language from XML and that's about it.

You are right, it is a bit of a weird implementation and the only place I've seen it used is in SQL Server Management Studio for SQL Server 2005. It's not essential but I was just curious if it was possible (easily so) in SyntaxEditor.

Regarding Actipro Support's initial response: I think the comment state already has a child state of itself, as this is how it comes in the supplied SQL language definition file.

Here's part of the definition file.



<!-- Code -->
<State Key="DefaultState">

............

    <ChildStates>
        <ChildState Key="MultiLineCommentState" />
    </ChildStates>
</State>

<!-- MultiLine Comments -->
<State Key="MultiLineCommentState" TokenKey="MultiLineCommentDefaultToken" Style="CommentDefaultStyle">
<!-- Scopes -->
<Scopes>
    <Scope BracketHighlight="True">
        <ExplicitPatternGroup Type="StartScope" TokenKey="MultiLineCommentStartToken" Style="CommentDelimiterStyle" PatternValue="/*" />
        <ExplicitPatternGroup Type="EndScope" TokenKey="MultiLineCommentEndToken" Style="CommentDelimiterStyle" PatternValue="*/" />    
    </Scope>
</Scopes>



Posted 18 years ago by Actipro Software Support - Cleveland, OH, USA
Hi Karl,

I meant that you should add MultiLineCommentState as a child state of MultiLineCommentState itself. You also need to configure MultiLineCommentDefaultToken to break on / (the pattern below stops at both * and /). If you do that, it works:

<State Key="MultiLineCommentState" TokenKey="MultiLineCommentDefaultToken" Style="CommentDefaultStyle">
    <!-- Scopes -->
    <Scopes>
        <Scope BracketHighlight="True">
            <ExplicitPatternGroup Type="StartScope" TokenKey="MultiLineCommentStartToken" Style="CommentDelimiterStyle" PatternValue="/*" />
            <ExplicitPatternGroup Type="EndScope" TokenKey="MultiLineCommentEndToken" Style="CommentDelimiterStyle" PatternValue="*/" />    
        </Scope>
    </Scopes>
    <!-- Patterns Groups -->
    <PatternGroups>
        <RegexPatternGroup TokenKey="MultiLineCommentDefaultToken" PatternValue="[^\*\/]+" />
    </PatternGroups>
    <ChildStates>
        <ChildState Key="MultiLineCommentState" />
    </ChildStates>
</State>
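
To see what "break on /" means for the default token pattern, here is a small sketch using .NET's System.Text.RegularExpressions. This is only an illustration of the character class; SyntaxEditor uses its own pattern engine, so treat the .NET behavior as an assumption about how the definition above behaves. The class name DefaultTokenPatternSketch and the method FirstRun are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DefaultTokenPatternSketch {
    // Hypothetical helper: returns the first run of characters that the
    // pattern from the definition above would consume as ordinary comment text.
    public static string FirstRun(string text) {
        // [^\*\/]+ matches one or more characters that are neither '*' nor '/'
        return Regex.Match(text, @"[^\*\/]+").Value;
    }

    public static void Main() {
        // The run stops right before the nested "/*", which leaves the
        // delimiter available to be matched as a new start scope.
        Console.WriteLine("[" + FirstRun("outer text /* inner */") + "]");
    }
}
```

Because the run stops before every * and /, the nested /* and the closing */ are left for the scope patterns to match rather than being swallowed as ordinary comment text.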


Actipro Software Support

Posted 18 years ago by Karl Grambow
That's perfect!

I had tried adding the child state to the MultiLineCommentState but it didn't work - until I configured MultiLineCommentDefaultToken to break on /, as you suggested.

Thanks a lot,

Karl
