Get docstring on function declarations

SyntaxEditor Python Language Add-on for WPF Forum

Posted 1 year ago by Tobias Lingemann - Software Development Engineer, Vector Informatik GmbH

Hi,

I think there should be an easier way to get the docstring of a function declaration via the AST node.

I see that I can search the child nodes for a literal expression, but then I have to remove the quotes and trim the string manually. The C# add-on, by contrast, has an extra 'DocumentationComment' property.


Best regards, Tobias Lingemann.

Comments (3)

Posted 1 year ago by Actipro Software Support - Cleveland, OH, USA

Hi Tobias,

The Python AST is configured to store the raw string data for the docstring.  Keep in mind that the parser can encounter a lot of docstrings as it parses documents, and it's not really worth the extra processing time to convert each one into a prettier format by removing prefixes/quotes (which give hints about the syntax used within), normalizing line terminators, and handling escape characters.

Instead, we do that processing on demand, when the docstring is about to be displayed in IntelliPrompt popups.
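
For readers who want the same kind of cleanup in their own code, here is a minimal sketch using only Python's standard library. It is not the add-on's API and not necessarily how IntelliPrompt does it. `clean_docstring` is a hypothetical helper name, and the sketch assumes the raw value is the literal's full source text, with prefix and quotes included.

```python
import ast
import inspect


def clean_docstring(raw_literal: str) -> str:
    """Turn the raw source text of a string literal into display text."""
    # literal_eval understands string prefixes, every quote style, and escapes.
    value = ast.literal_eval(raw_literal)
    if not isinstance(value, str):
        raise ValueError("expected a plain string literal")
    # Normalize line terminators to "\n".
    value = value.replace("\r\n", "\n").replace("\r", "\n")
    # Strip the common indentation and leading/trailing blank lines.
    return inspect.cleandoc(value)


raw = '"""Add two numbers.\r\n\r\n    Returns the sum.\r\n    """'
print(clean_docstring(raw))
```

Note that `inspect.cleandoc` also expands tabs to spaces, which may or may not be what you want for display.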

What usage scenario do you need the docstring data for?


Actipro Software Support

Posted 1 year ago by Tobias Lingemann - Software Development Engineer, Vector Informatik GmbH

Hi,

We have a control that shows all user-defined functions from all the supported languages with their parameters and corresponding documentation. So for C# functions we show the XML documentation. We want to do the same for Python.

For now we simply parse the AST manually, but since this is already available in the C# add-on, I thought other users might want it too. I didn't think it would cause any major overhead.
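
As an illustration of the kind of overview described here, the following is a rough sketch built on Python's own `ast` module rather than on the add-on's AST. `list_functions` and the sample source are made up for the example; `ast.get_docstring` is the standard-library call that returns an already cleaned docstring.

```python
import ast


def list_functions(source: str):
    """Return (name, parameter names, docstring) for each function in source."""
    overview = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        params = [a.arg for a in args.posonlyargs + args.args]
        if args.vararg:
            params.append("*" + args.vararg.arg)
        params += [a.arg for a in args.kwonlyargs]
        if args.kwarg:
            params.append("**" + args.kwarg.arg)
        # get_docstring returns None when the function has no docstring.
        overview.append((node.name, params, ast.get_docstring(node)))
    return overview


sample = '''
def add(a, b=0):
    """Add two numbers."""
    return a + b

def log(*args, **kwargs):
    pass
'''

for name, params, doc in list_functions(sample):
    print(name, params, doc)
```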


Best regards, Tobias Lingemann.

Posted 1 year ago by Actipro Software Support - Cleveland, OH, USA

Hi Tobias,

Thank you for the information.

C#'s doccomments are actually captured in a very different way due to the nature of the language.  They are intercepted by the token reader so that the parser never sees those tokens.  Instead, as a new type or member definition is located by the parser, it will check with the token reader to see which lines of doccomments have been collected since the last reaping, and will use those as the doccomment for the type/member.  No escapes or anything are processed for these doccomments.

Python is different because the docstring is a string that is positioned right after the start of the member.  It can use any of the many Python string syntaxes, which can include various escapes.  The parser itself doesn't know it's a docstring and assumes for syntax checking that it's an expression statement.  The string's value is read raw, and later on the module loader logic assigns that first string within a member body, if present, to the related definition's docstring value.  If that value is ever needed for IntelliPrompt, its delimiters are examined to see what kind of string it is, and things like character escapes are handled per character.  This is only done on the fly, as needed, to avoid unnecessary string parsing.
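
The "first string in the member body" rule can be seen with Python's standard `ast` module, where the docstring also shows up as an ordinary expression statement. This is only a sketch of the rule, not the add-on's module loader; `raw_docstring` is a hypothetical helper that returns the literal's source text with its prefix and delimiters intact.

```python
import ast


def raw_docstring(source: str, func_name: str):
    """Return the raw source text of func_name's docstring literal, or None."""
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
                and node.name == func_name):
            first = node.body[0]
            # A docstring is just an expression statement holding a string.
            if (isinstance(first, ast.Expr)
                    and isinstance(first.value, ast.Constant)
                    and isinstance(first.value.value, str)):
                return ast.get_source_segment(source, first.value)
            return None
    return None


sample = 'def f():\n    r"""Path is C:\\temp."""\n    return 1\n'
print(raw_docstring(sample, "f"))
```

The returned text still carries the `r` prefix and the triple quotes, which is what tells a later step how escapes should be treated.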

That being said, we will update it for v23.1 to try and clean up the docstring so that it is a nice value for you.


Actipro Software Support

The latest build of this product (v24.1.1) was released 2 months ago, which was after the last post in this thread.
