Parser error when using a lexical transition - SyntaxEditor Web Languages Add-on for WPF Forum

Posted 13 years ago by Craig - Varigence, Inc.

Version: 12.2.0570

Say I have a file like this:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <OleDbConnection Name="OLEDB_CON_EDW_STG2">
        </OleDbConnection>
        <FlatFileConnection FilePath="C:\EDW\Data\<#=cycleTypeCd#>\<#=table.Name#>.dat">
            <Expressions>
                <Expression PropertyName="ConnectionString">
                </Expression>
            </Expressions>
        </FlatFileConnection>
    </Connections>
</Biml>

The <# and #> delimiters are lexical transitions between the custom XML language and C#, using SyntaxEditor's lexical transition logic. We have IntelliSense and Quick Info working, but I'm hitting a problem.

The Problems:

1. When I hover over the <Expressions> tag, directly under the <FlatFileConnection> tag, my Quick Info logic believes that <Expressions> is an invalid child element. Further, the returned element hierarchy is: Biml / Connections, although it should be Biml / Connections / FlatFileConnection.

2. A related issue: when I hover over 'C:\EDW\Data\' in the FlatFileConnection's FilePath attribute, I see the expected Quick Info content, describing the FilePath attribute. However, if I hover over 'dat' at the end of the attribute value, the Quick Info pop-up displays information about the FlatFileConnection element instead of the FilePath attribute.

Possible Cause:

The XML parser is reporting an error on the line with the start FlatFileConnection tag. The error is: "Attribute end value delimiter expected". The error's position is at the backslash character that's between the two C# delimited code 'nuggets'. I suspect this error is contributing to the incorrect element hierarchy when hovering over the Expressions tag.

As mentioned near the bottom of this post, I'm using a custom XmlTokenReader to filter out non-XML tokens via an overridden GetNextToken() method. As GetNextToken() is called during a parse, I see that it returns three attribute-value tokens in succession:

token text = C:\EDW\Data\, id = 11
token text = \, id = 11
token text = dat, id = 11

My theory, regarding the parser error, is that the parser expects an attribute-end-delimiter token to immediately follow the first attribute-value token, and gets confused when two more attribute-value tokens arrive instead. While I do have the source code for the XML and Dot Net add-ons, that isn't sufficient to confirm if my theory is correct.

Questions:

1. Does my theory, as to the cause of the parse error, seem reasonable?
2. If so, what's the right way for me to solve this? I speculate that I could work around this by somehow combining consecutive attribute-value tokens, but that seems kludgy.
3. If my theory doesn't sound right, do you have any notion of what the problem might be, or thoughts on how I can debug this?

Thanks,

-Craig

P.S.

I looked at this some more and believe the parse error stems from the XmlGrammar in the add-on. Specifically, in the element.Production assignment, the grammar has:

@startTagAttributeValueText["attrValue"].Optional().OnSuccess(AttributeValueSuccess) +

For the above scenario to work, it needs to be something like:

@startTagAttributeValueText.ZeroOrMore().SetLabel("attrValue").OnSuccess(AttributeValueSuccess) +

By allowing zero or more attribute-value text terminals, my scenario works. Additionally, there are no parse errors in the AST and the Quick Info pop-up over the Expressions tag displays the correct information.
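
To illustrate why the repetition matters, here is a minimal hand-rolled sketch of the two behaviors. It does not use the Actipro grammar API at all: the class name, method, and the token IDs for the delimiters are assumptions made up for this example (only ID 11 for attribute-value text comes from the token log above), and the start-delimiter message is invented.

```csharp
using System;
using System.Collections.Generic;

public static class AttributeValueMatcher
{
    // Hypothetical token IDs. 11 mirrors the attribute-value text ID logged above;
    // the delimiter IDs are invented for this sketch.
    public const int ValueStart = 10;
    public const int ValueText = 11;
    public const int ValueEnd = 12;

    // Matches: start delimiter, attribute-value text, end delimiter.
    // allowRepeats = false accepts at most one text token (like Optional()).
    // allowRepeats = true accepts any number of text tokens (like ZeroOrMore()).
    // Returns null on success, or an error message on failure.
    public static string Match(IReadOnlyList<int> ids, bool allowRepeats)
    {
        int pos = 0;
        if (pos >= ids.Count || ids[pos] != ValueStart)
        {
            return "Attribute start value delimiter expected"; // invented message
        }
        pos++;

        if (allowRepeats)
        {
            while (pos < ids.Count && ids[pos] == ValueText)
            {
                pos++;
            }
        }
        else if (pos < ids.Count && ids[pos] == ValueText)
        {
            pos++;
        }

        if (pos >= ids.Count || ids[pos] != ValueEnd)
        {
            return "Attribute end value delimiter expected";
        }
        return null;
    }
}
```

With the token sequence start, text, text, text, end, the optional form stops after the first text token and reports the missing end delimiter, while the repeating form consumes all three text tokens and succeeds.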

The only catch is that problem #2 above remains. I don't know if that's due to an issue in the built-in Quick Info provider or an issue on our end.

[Modified 13 years ago]

Comments (7)

Answer - Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Craig,

I think you're exactly right. It's expecting one attribute value token but is getting multiple ones. So your #2 idea of combining the tokens into a single token might be the best way to go. Or if you change the XML grammar as you described later, that would handle it too.

For the other issue, I believe the context factory may be coded to expect a single attribute value token. So the idea of combining multiple tokens into one instead of changing the grammar might help there.

Actipro Software Support

Posted 13 years ago by Craig - Varigence, Inc.

Two questions:

1. How could I combine tokens using the Actipro add-on? I don't see anything like that mentioned in the documentation.

2. Is there any chance you'd be willing to alter the grammar, in an update, to use the ZeroOrMore() token choice?

Thanks again,

-Craig

[Modified 13 years ago]

Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Craig,

1) There's nothing built in to do that sort of thing. You'd need to make up new mock tokens in your token reader and pass them up that way.

2) I think your second problem is due to the context factory not expecting multiple tokens. So both the grammar and the context factory would need code changes to support this, whereas in ordinary circumstances there's no issue with how they work now. I'd suggest trying the token-combining route first.

Actipro Software Support
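
The token-combining idea suggested above could look roughly like the following sketch. It is deliberately independent of the Actipro types: SimpleToken and TokenMerger are hypothetical stand-ins that carry only an ID and an offset range, and the sketch assumes the merged token should span from the start of the first token in a run to the end of the last one, so that any filtered-out code nuggets in between are covered by the range.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a parser token: an ID plus an offset range.
public sealed class SimpleToken
{
    public SimpleToken(int id, int startOffset, int length)
    {
        Id = id;
        StartOffset = startOffset;
        Length = length;
    }

    public int Id { get; }
    public int StartOffset { get; }
    public int Length { get; }
    public int EndOffset => StartOffset + Length;
}

public static class TokenMerger
{
    // Collapses each run of consecutive tokens whose ID equals mergeId into a
    // single token spanning from the run's first start offset to its last end offset.
    public static List<SimpleToken> MergeRuns(IEnumerable<SimpleToken> tokens, int mergeId)
    {
        var result = new List<SimpleToken>();
        SimpleToken pending = null;

        foreach (var token in tokens)
        {
            if (token.Id == mergeId)
            {
                // Start a new run, or extend the current run to this token's end.
                pending = pending == null
                    ? token
                    : new SimpleToken(mergeId, pending.StartOffset, token.EndOffset - pending.StartOffset);
                continue;
            }

            if (pending != null)
            {
                result.Add(pending);
                pending = null;
            }
            result.Add(token);
        }

        if (pending != null)
        {
            result.Add(pending);
        }
        return result;
    }
}
```

Computing the merged length from offsets rather than by summing token lengths is a design choice in this sketch: when tokens from the other language have been filtered out, the sum of the remaining lengths is shorter than the text the merged token actually covers.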

Posted 13 years ago by Craig - Varigence, Inc.

More questions:

1. I came up with a solution to merge the tokens in the token reader. The key portion of the solution is this:

private IToken MergeTokens()
{
    IToken mergableToken = null;

    var attributeValueTextToken = _cachedTokens.FirstOrDefault(item => item.Id == XmlTokenId.StartTagAttributeValueText);
    if (attributeValueTextToken != null)
    {
        // TODO: Can Actipro expose the LexerData property via an interface so we don't need Reflection?
        IMergableTokenLexerData lexerData = null;
        var lexerDataPropInfo = attributeValueTextToken.GetType().GetProperty("LexerData", BindingFlags.Instance | BindingFlags.NonPublic);
        if (lexerDataPropInfo != null)
        {
            lexerData = lexerDataPropInfo.GetValue(attributeValueTextToken, null) as IMergableTokenLexerData;
        }

        // Build one token that starts at the first cached token and ends at the last;
        // its length is the sum of the cached token lengths.
        mergableToken = new MergableToken(
            _cachedTokens[0].StartOffset,
            _cachedTokens.Sum(item => item.Length),
            _cachedTokens[0].StartPosition,
            _cachedTokens[_cachedTokens.Count - 1].EndPosition,
            MergableLexerFlags.None,
            ((IMergableToken)attributeValueTextToken).LexicalState,
            lexerData);
    }

    _cachedTokens.Clear();
    return mergableToken;
}

Is this the proper way to merge multiple attribute text tokens together so the parser sees them as one?

2. In an update, can you expose the LexerData property on IToken or IMergableToken, so I don't need to use reflection when creating a new token?

3. Even with this change, my second problem still isn't solved. Hovering over the dat portion of the attribute's text still returns quick info for the element, not the attribute. Thus, it still seems like there's a bug in the generated XmlContext.

4. Also, I have to disagree with your statement that 'in ordinary circumstances', there's no issue with how the grammar and context factory behave now. While I realize I'm using a customized XML-based language, what I'm trying to do would be perfectly legal in ASP (multiple ASP code nuggets within an HTML attribute). Built-in support for this type of embedding would be very helpful.

Thanks again,

-Craig

[Modified 13 years ago]

Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Craig,

1/2) Since these tokens are just being used by the parser, they probably don't need to be complex at all. As long as they implement IToken and have Id and the text range-related props set properly, I'd think that would be all you need to do. The parser just examines token IDs as it iterates tokens, and the code that tracks offsets for the AST nodes and error reporting uses the text range data.

3) That one is hard to say what the problem is without debugging it. I think you said you have the Blueprint source code. Perhaps you can run through the context factory code there and see where it's going wrong.

4) All I'm saying is that we built the add-on according to XML specs and it does meet those properly. Your scenario is where you're merging it with other languages, which does alter the grammar and/or context factory a bit from what works normally. I don't really know the extent of the changes that would be necessary to support your scenario properly without debugging it. Once you get #1 working as mentioned, I'm curious what you find in regards to where the context factory in #3 goes wrong.

Actipro Software Support

Posted 13 years ago by Craig - Varigence, Inc.

1/2) I'm not quite clear on your recommendation here. Are you proposing that I author my own Token class that implements IToken, as opposed to using MergableToken?

3) I did a quick debug session and found the problem is that XmlContextFactory's CreateContext method doesn't use XmlParseData or the custom TokenReader I created when reading tokens. Instead, it just obtains a reader from the snapshotOffset's Snapshot. The result is that when examining an attribute with embedded code nuggets, it sees the attribute's text as multiple tokens instead of a single attribute text token.

From my original sample, the first portion of the attribute value (before the first code nugget) works correctly; the correct targetAttribute is identified, along with the correct targetElement. However, the other attribute text pieces are processed as individual tokens. After the call to UpdateTypeAndTextRange(), the targetAttribute is null and the XmlContext's Type is StartTagOther.

Posted 13 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Craig,

1/2) Yes, that might be easiest, or you could make a class that inherits TokenBase.

3) That makes sense since the token reader only gets used by the grammar-based IParser. You'll probably have to modify the context factory code to accommodate the other tokens it can see, so that you skip over the tokens from the other language.

Actipro Software Support
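
As a rough illustration of the "skip over the tokens from the other language" approach, the sketch below walks backward from the hovered token to find the attribute that owns it. It is not the XmlContextFactory code: TokenKind, HoverToken, and AttributeLocator are hypothetical names invented for this example, and the sketch assumes the raw token stream contains the code-nugget tokens (marked Foreign here) interleaved with the attribute-value text tokens.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token classification for this sketch only.
public enum TokenKind { AttributeName, Other, ValueStart, ValueText, ValueEnd, Foreign }

public sealed class HoverToken
{
    public HoverToken(TokenKind kind, int start, string text)
    {
        Kind = kind;
        Start = start;
        Text = text;
    }

    public TokenKind Kind { get; }
    public int Start { get; }
    public string Text { get; }
    public int End => Start + Text.Length;
}

public static class AttributeLocator
{
    // Returns the name of the attribute whose value contains the offset,
    // or null when the offset is not inside an attribute value.
    public static string FindOwningAttribute(IReadOnlyList<HoverToken> tokens, int offset)
    {
        // Locate the token that contains the offset.
        int i = -1;
        for (int t = 0; t < tokens.Count; t++)
        {
            if (offset >= tokens[t].Start && offset < tokens[t].End)
            {
                i = t;
                break;
            }
        }
        if (i < 0) return null;

        // Walk back over value text and other-language tokens alike.
        while (i >= 0 && (tokens[i].Kind == TokenKind.ValueText || tokens[i].Kind == TokenKind.Foreign))
        {
            i--;
        }

        // Anything other than the value's start delimiter means we were not in a value.
        if (i < 0 || tokens[i].Kind != TokenKind.ValueStart) return null;

        // The nearest preceding attribute name owns this value.
        for (i--; i >= 0; i--)
        {
            if (tokens[i].Kind == TokenKind.AttributeName) return tokens[i].Text;
        }
        return null;
    }
}
```

Under these assumptions, hovering over any piece of the value, including the text after the last code nugget, resolves to the same attribute name as hovering over the first piece.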

The latest build of this product (v25.1.0) was released 2 months ago, which was after the last post in this thread.
