Access TextRange of grammar match during tree construction?

Posted 14 years ago by Matthew LeRoy

Version: 11.1.0545

I'm working on refining the constructed AST for an expression grammar I'm writing using the LL(*) framework, and I'd like to know if it's possible to access the offsets or TextRange for a given EBNF match for use in the construction of the AST node for the corresponding non-terminal?

The practical application is that I've created a type-specific AST node, FunctionCallAstNode, using the Language Designer tool, which represents a function call (clever, eh?) in the expression. I'd like to capture the TextRanges of the open and close parentheses and of the commas between the function's arguments. Currently the TokenReader for my parser skips whitespace characters, and the custom tree construction I've written for the FunctionCall and FunctionCallArgumentList non-terminals (which construct a FunctionCallAstNode) throws away the matches for the open and close parentheses and the commas between arguments, since they're superfluous for evaluating the expression (all I need to know is the function being called and the list of arguments).

However, I've also got a requirement to determine which function argument corresponds to any given caret position in the expression, in order to highlight the corresponding parameter in the function's signature (similar to how Visual Studio highlights parameters in a tooltip when you're coding a function call in C++ or C#). I can check fairly easily whether the caret position intersects the TextRange of any of the function argument nodes. But if there is whitespace between an argument expression and the surrounding delimiters (open or close parenthesis, or comma) and the caret is positioned somewhere in that whitespace, I can't currently detect that, because I don't have an AST node representing the whitespace (the whitespace tokens never make it to the parser).

I figure that if I knew the offsets/TextRanges of the various delimiters, I could create a TextRange representing any whitespace between an argument expression node and the delimiter on either side of it, and check whether the caret position intersects either of those whitespace TextRanges.

So, what I'm hoping to do is add some additional properties to my FunctionCallAstNode that will hold the offsets/TextRanges of the open paren, close paren, and any commas between function arguments, and set those properties in the tree construction code that is part of the grammar definition. I just need to know how to access the offsets/TextRanges for the matches on the parentheses and commas, if it's even possible. Or do I need to write a completely custom AST node to get access to that information?
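To make the idea concrete, here is a minimal, language-neutral sketch in Python of what such a node could do once it holds the delimiter offsets. It is not Actipro's API: `FunctionCallNode`, its fields, and `argument_index_at` are hypothetical names chosen for illustration, and offsets are assumed to be zero-based character positions with the caret sitting just before the character at its offset. Under those assumptions, counting the commas to the left of the caret already covers the whitespace cases, so separate whitespace ranges may not be needed.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FunctionCallNode:
    """Hypothetical stand-in for FunctionCallAstNode; not the Actipro type."""

    open_paren_offset: int
    comma_offsets: List[int] = field(default_factory=list)  # ascending order
    close_paren_offset: Optional[int] = None  # None while the call is unclosed

    def argument_index_at(self, caret: int) -> Optional[int]:
        """Return the zero-based argument index at a caret offset, or None."""
        # The caret is inside the argument list only once it has moved
        # past the open parenthesis...
        if caret <= self.open_paren_offset:
            return None
        # ...and only until it moves past the close parenthesis.
        if self.close_paren_offset is not None and caret > self.close_paren_offset:
            return None
        # Every comma that lies entirely to the left of the caret advances
        # the argument index; whitespace around arguments needs no special case.
        return bisect_right(self.comma_offsets, caret - 1)
```

For the text `Sum(a,  b , c)`, the node would be built as `FunctionCallNode(3, [5, 10], 13)`; a caret at offset 7 (in the whitespace after the first comma) maps to argument 1, and a caret at offset 14 (after the close parenthesis) maps to no argument.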

Thanks!

Comments (1)

Posted 14 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Matthew,

Each IAstNodeMatch provides access to its contained IAstNode via the Node property, and the AST nodes carry their offset range. In our .NET Languages Add-on, we sometimes access a node's offsets that way in our custom tree construction nodes.

But if you are doing parameter info (BTW, new IntelliPrompt parameter info features have been added for the 2011.2 version; see our blog), it's probably better to do the argument index scanning via tokens. The reason is that the AST is often out of date, especially in large documents: the user could have just typed characters like commas that increment the argument index, but the AST wouldn't know that yet. If you do token scanning, you can figure out that data based on what is in the document at the time. That's what we do in our .NET Languages Add-on updates for the next version.

Actipro Software Support

