Semantic parsing of strings in dynamic language

SyntaxEditor for Windows Forms Forum

Posted 13 years ago by David Chang - Software Engineer, a.i. solutions, Inc.
Version: 4.0.0237
I have defined strings in my dynamic language as follows:

<States>
    <State Key="StringState" TokenKey="StringDefaultToken" Style="StringDefaultStyle">
        <Scopes>
            <Scope>
                <ExplicitPatternGroup Type="StartScope" TokenKey="StringStartToken" Style="StringDelimiterStyle" PatternValue="&quot;" />
                <RegexPatternGroup Type="EndScope" TokenKey="StringEndToken" Style="StringDelimiterStyle" PatternValue="\&quot;" />
            </Scope>
        </Scopes>
        <PatternGroups>
            <RegexPatternGroup TokenKey="StringEscapedCharacterToken" PatternValue="\\x {HexDigitMacro}{1,4}" />
            <RegexPatternGroup TokenKey="StringEscapedCharacterToken" PatternValue="\\u {HexDigitMacro}{4,4}" />
            <RegexPatternGroup TokenKey="StringEscapedCharacterToken" PatternValue="\\U {HexDigitMacro}{8,8}" />

            <RegexPatternGroup TokenKey="StringWhitespaceToken" PatternValue="{WhitespaceMacro}+" IsWhitespace="True" />
            <RegexPatternGroup TokenKey="StringWordToken" PatternValue="\w+" />
            <RegexPatternGroup TokenKey="StringDefaultToken" PatternValue="[^\&quot;]" />
        </PatternGroups>
    </State>
    <State Key="DefaultState">
        ...
        <ChildStates>
            <ChildState Key="StringState" />
        </ChildStates>
    </State>
</States>

In my parser, I would like to recognize a string in the DefaultState and parse the string's token sequence into an AstNode. I assume I would do this as follows:

Match(StringStartToken);
//Slurp in the string content
Match(StringEndToken);
AstNode node = new StringNode();

Am I on the right track?

[Modified at 01/02/2007 01:09 PM]

Comments (1)

Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA
Yes, that is probably the right track, although instead of calling the Match methods directly, you may wish to use our parser generator's token matching syntax (like 'StringStartToken').
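
For readers who want to see the start/slurp/end idea from the question spelled out, here is a minimal, language-neutral sketch in Python. It does not use the SyntaxEditor API: the `(token key, text)` tuples, `StringNode`, and `parse_string` are hypothetical stand-ins, and only the token keys are taken from the language definition above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical token shape: (token key, matched text).
Token = Tuple[str, str]


@dataclass
class StringNode:
    """Hypothetical AST node holding the text between the delimiters."""
    value: str


def parse_string(tokens: List[Token], pos: int = 0) -> Tuple[StringNode, int]:
    """Consume StringStartToken ... StringEndToken starting at pos.

    Returns the node and the index of the first token after the string.
    """
    def match(expected_key: str) -> str:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != expected_key:
            raise SyntaxError(f"expected {expected_key} at token {pos}")
        text = tokens[pos][1]
        pos += 1
        return text

    match("StringStartToken")
    parts = []
    # Slurp every token up to (but not including) the closing delimiter.
    while pos < len(tokens) and tokens[pos][0] != "StringEndToken":
        parts.append(tokens[pos][1])
        pos += 1
    # Raises SyntaxError if the input ended before the closing delimiter.
    match("StringEndToken")
    return StringNode("".join(parts)), pos
```

The node is built only after the end token has matched, so an unterminated string surfaces as a syntax error rather than a partial node.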

Also, it may be easier (and faster) to make each string a single token that is expressed by one regular expression. That way, the parser can treat that single token as a string literal.
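
As a rough illustration of the single-token idea, the sketch below uses Python's `re` module rather than the SyntaxEditor pattern syntax, so the pattern would need adapting to the language definition format. `STRING_LITERAL` and `match_string_literal` are hypothetical names, and the pattern assumes the same rule as the definition in the question: a string runs from one double quote to the next.

```python
import re
from typing import Optional

# One pattern for the whole literal: opening quote, any run of
# non-quote characters, closing quote.
STRING_LITERAL = re.compile(r'"[^"]*"')


def match_string_literal(text: str, pos: int = 0) -> Optional[str]:
    """Return the string literal that starts exactly at pos, or None."""
    m = STRING_LITERAL.match(text, pos)
    return m.group(0) if m else None
```

If the language should also allow backslash-escaped quotes inside a string, which the definition in the question does not, a pattern such as `"(?:\\.|[^"\\])*"` is the usual variant.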


Actipro Software Support
