How to implement or support nested comments

SyntaxEditor for WPF Forum

The latest build of this product (v25.1.0) was released 1 month ago, which was before this thread was created.
Posted 6 days ago by Jermaine
Version: 24.1.4
Platform: .NET 8
Environment: Windows 11 (64-bit)

Hi guys,

I'm trying to implement the ST language using a dynamic lexer, but I've run into a problem: nested comments do not get the correct syntax highlighting.

The situation is a bit complex because I've mixed multi-line comments into documentation comments. Here is the sample code:

(**Comment
*
* (* sduhodhosdhc*)
* FUNCTION Main
* IF NOT WITH USING THEN
* END_FUNCTION
*
*
*
*
*//(* 
    *)
*)
(**)
CLASS AverageTests
END_CLASS
  • A (* *) block is a multi-line comment; it starts with (*
  • A // block is a single-line comment
  • A (** *) block is a documentation comment; it starts with (**

I configured my langdef file as follows:

  • The default state contains the comment state refs, like below:
  <StateRef Key="SingleLineComment" />
  <StateRef Key="MultiLineComment" />
  <StateRef Key="DocumentationComment" />

  • The multi-line and documentation comment states are defined like below:
   <!-- MultiLineComment state -->
   <State Id="3" Key="MultiLineComment" TokenKey="MultiLineCommentDefaultToken" DefaultClassificationTypeKey="Comment">
     <State.ChildStates>
       <StateRef Key="MultiLineComment" />
     </State.ChildStates>
     <State.Scopes>
       <Scope>
         <Scope.StartPatternGroup>
           <ExplicitPatternGroup TokenKey="MultiLineCommentStartToken" Pattern="(*"/>
         </Scope.StartPatternGroup>
         <Scope.EndPatternGroup>
           <ExplicitPatternGroup TokenKey="MultiLineCommentEndToken" Pattern="*)" />
         </Scope.EndPatternGroup>
       </Scope>
       <Scope>
         <Scope.StartPatternGroup>
           <ExplicitPatternGroup TokenKey="MultiLineCommentStartToken" Pattern="/*"/>
         </Scope.StartPatternGroup>
         <Scope.EndPatternGroup>
           <ExplicitPatternGroup TokenKey="MultiLineCommentEndToken" Pattern="*/" />
         </Scope.EndPatternGroup>
       </Scope>
     </State.Scopes>
     <!--<RegexPatternGroup Key="MultiLineText" Pattern="[^*\n]+" />-->
     <ExplicitPatternGroup Key="MultiLineCommentDefaultToken" Pattern="[^*)]+" />
   </State>
   <!-- DocumentationComment state -->
   <State Id="4" Key="DocumentationComment" DefaultTokenId="123" DefaultTokenKey="DocumentationComment" DefaultClassificationTypeKey="DocTag">
     <State.Scopes>
       <Scope>
         <Scope.StartPatternGroup>
           <RegexPatternGroup Pattern="\(\*\*"  />
         </Scope.StartPatternGroup>
         <Scope.EndPatternGroup>
           <RegexPatternGroup Pattern="\*\)" />
         </Scope.EndPatternGroup>
       </Scope>
       <Scope>
         <Scope.StartPatternGroup>
           <RegexPatternGroup Pattern="\/\*\*" />
         </Scope.StartPatternGroup>
         <Scope.EndPatternGroup>
           <RegexPatternGroup Pattern="\*\/" />
         </Scope.EndPatternGroup>
       </Scope>
     </State.Scopes>
     <RegexPatternGroup TokenKey="DocumentationLineTerminator" Pattern="{LineTerminator} {LineTerminatorWhitespace}* \**" LookAheadPattern="[^\)\/]" />
     <RegexPatternGroup TokenKey="DocumentationTag" Pattern="@\w+" />
     <RegexPatternGroup TokenKey="DocumentationText" ClassificationTypeKey="DocComment" Pattern="[^@\n\*]+" />
   </State>

The result is that the documentation block seems to be treated as a multi-line comment block.

How can I fix my language definition?

Thanks in advance.

Comments (2)

Answer - Posted 6 days ago by Actipro Software Support - Cleveland, OH, USA

What you are trying to accomplish is complicated for two primary reasons...

  1. Both multi-line and documentation comments start with (* ... even though one has an extra *
  2. Both multi-line and documentation comments end with the same *)

It is important to make sure that multi-line comment (* does not accidentally match on documentation comment (**.  This can primarily be achieved by making sure that any child scopes look for documentation comment first, so move that state higher than multi-line comment in the default state.

  <StateRef Key="SingleLineComment" />
  <StateRef Key="DocumentationComment" />
  <StateRef Key="MultiLineComment" />

To make sure that (**) is recognized as a multi-line comment and not a documentation comment, update the start delimiter for documentation comment to include a look ahead pattern of "[^\)]"... i.e., not close parenthesis.
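
As a rough illustration of why the ordering and the look-ahead matter, here is a small Python sketch. It is only a toy model of "try patterns in order, first match wins" and is not how SyntaxEditor's dynamic lexer is implemented; the function name `first_match` and the regexes are assumptions made for this example.

```python
import re

# Toy model (an assumption for illustration, not Actipro's lexer):
# start delimiters are tried in list order and the first match wins.
# The documentation start requires a following character that is not ")".
DOC_START = ("DocumentationComment", re.compile(r"\(\*\*(?=[^)])"))
ML_START = ("MultiLineComment", re.compile(r"\(\*"))


def first_match(text, ordered_patterns):
    """Return the name of the first pattern that matches at the start of text."""
    for name, pattern in ordered_patterns:
        if pattern.match(text):
            return name
    return None


# Multi-line listed first: "(*" also matches the beginning of "(**".
wrong_order = first_match("(** doc", [ML_START, DOC_START])

# Documentation listed first: "(**" is recognized as a documentation comment.
right_order = first_match("(** doc", [DOC_START, ML_START])

# The look-ahead rejects "(**)", so it falls through to the multi-line pattern.
empty_comment = first_match("(**)", [DOC_START, ML_START])
```

With the multi-line pattern listed first, `(** doc` is classified as a multi-line comment, which matches the symptom described in the question. With the documentation pattern first and the look-ahead in place, `(** doc` is a documentation comment and `(**)` is an empty multi-line comment.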

Since both comment blocks end with *) you were correct to look at using nested scopes.  This allows a nested scope to "consume" the end delimiter and prevent it from closing out the parent scope.

A multi-line comment can have nested documentation comments or other nested multi-line comments.  Likewise, a documentation comment can also have other nested documentation comments or nested multi-line comments.  You can handle that through child scopes.  The following are what I used during my testing:

<!-- DocumentationComment state -->
<State Id="3" Key="DocumentationComment" DefaultTokenId="12" DefaultTokenKey="DocumentationCommentText" DefaultClassificationTypeKey="DocumentationComment">
    <State.Scopes>
        <Scope>
            <Scope.StartPatternGroup>
                <ExplicitPatternGroup TokenId="13" TokenKey="DocumentationCommentStartDelimiter" Pattern="(**" LookAheadPattern="[^\)]" />
            </Scope.StartPatternGroup>
            <Scope.EndPatternGroup>
                <ExplicitPatternGroup Key="DocumentationCommentEndDelimiter" Pattern="*)" />
            </Scope.EndPatternGroup>
        </Scope>
    </State.Scopes>
    <State.ChildStates>
        <StateRef Key="DocumentationComment" />
        <StateRef Key="MultiLineComment" />
    </State.ChildStates>
    <RegexPatternGroup TokenId="14" TokenKey="DocumentationCommentLineTerminator" Pattern="\n" />
</State>
<!-- MultiLineComment state -->
<State Id="4" Key="MultiLineComment" DefaultTokenId="15" DefaultTokenKey="MultiLineCommentText" DefaultClassificationTypeKey="Comment">
    <State.Scopes>
        <Scope>
            <Scope.StartPatternGroup>
                <ExplicitPatternGroup TokenId="16" TokenKey="MultiLineCommentStartDelimiter" Pattern="(*" />
            </Scope.StartPatternGroup>
            <Scope.EndPatternGroup>
                <ExplicitPatternGroup TokenId="17" TokenKey="MultiLineCommentEndDelimiter" Pattern="*)" />
            </Scope.EndPatternGroup>
        </Scope>
    </State.Scopes>
    <State.ChildStates>
        <StateRef Key="DocumentationComment" />
        <StateRef Key="MultiLineComment" />
    </State.ChildStates>
    <RegexPatternGroup TokenId="18" TokenKey="MultiLineCommentLineTerminator" Pattern="\n" />
</State>

Note that when defining the child states, you still have to list DocumentationComment before MultiLineComment to ensure it is matched first.
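
To see how a nested scope "consumes" its own *) so the parent stays open, the following Python sketch models the two states above with a stack. It is a simplified model under stated assumptions, not the actual lexer: the name `comment_spans` is hypothetical, only the (* and (** forms are handled, and // single-line comments are ignored.

```python
import re

DOC, ML = "DocumentationComment", "MultiLineComment"

# Start delimiters in priority order: documentation comment is tried first,
# mirroring the order of the child state refs.
STARTS = [
    (DOC, re.compile(r"\(\*\*(?=[^)])")),
    (ML, re.compile(r"\(\*")),
]
END = "*)"  # both comment kinds share the same end delimiter


def comment_spans(text):
    """Return (state, start, end, depth) for every comment, innermost first."""
    stack = []  # currently open comments as (state name, start index)
    spans = []  # closed comments
    i = 0
    while i < len(text):
        if stack and text.startswith(END, i):
            # The end delimiter closes only the innermost open comment.
            name, start = stack.pop()
            i += len(END)
            spans.append((name, start, i, len(stack)))
            continue
        for name, pattern in STARTS:
            match = pattern.match(text, i)
            if match:
                stack.append((name, i))
                i = match.end()
                break
        else:
            i += 1  # ordinary character inside or outside a comment
    return spans


sample = "(** doc (* inner *) still doc *) (**)"
spans = comment_spans(sample)
```

For the sample string, the inner (* inner *) closes at depth 1 without ending the documentation comment, the documentation comment closes at its own *) at depth 0, and the trailing (**) is reported as a multi-line comment.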

Please try those adjustments and reach back out if you need further assistance.  If the solution gets more complicated, it might be best to create a simple sample project you can send to our support email address so we can easily load exactly what you're trying and help identify specific issues.


Actipro Software Support

Posted 2 days ago by Jermaine

Thanks for the detailed explanation. Now I understand.
