Parsing a long token takes forever

SyntaxEditor for WPF Forum

Posted 7 years ago by Christel
Version: 11.2.0551
Platform: .NET 4.0
Environment: Windows 8 (64-bit)

Hi,

In a grammar class using the LL Parser Framework, I defined:

this.Root.Production = gopItem.OnError(GopItemError).ZeroOrMore().SetLabel("Definition")
	.OnInitialize(DefinitionInitialize).OnSuccess(GopItemSuccess).OnComplete(DefinitionComplete)
	> Ast("Mengendefinition", AstChildrenFrom("Definition"));

// gopItem has more alternatives; they are omitted from this example
gopItem.Production =
	gopLugLg["GopItem"] > AstFrom("GopItem");

gopLugLg.Production =
	@lug
	+ @openSquareBracket.OnErrorContinue()
	+ @quote.OnErrorContinue()
	+ @number["LugNum"]
	+ (@komma + @number["LugNum"] > AstFrom("LugNum")).ZeroOrMore().SetLabel("moreNum")
	+ @quote.OnErrorContinue()
	+ @komma
	+ @number["LgNum"]
	+ @closeSquareBracket.OnErrorContinue() 
	+ endItem.OnErrorContinue()
	> Ast("GopLugMitLg", AstFrom("LugNum"), AstChildrenFrom("moreNum"));

Alternatively, it can be defined as a regular expression in the Language Designer, using only base parsing:

(LUG\\[\\\")(([0-9]{1,2},?)+)\\\",([0-9]{1,2})(\\])

Both definitions describe the following example string:

LUG["01,02,03,04,05,06,07,21,22,23,24,25",08]

In both scenarios, whether LL parsing or the base parser, this works perfectly. The string is tokenized and the correct AST is built.

As soon as the first part of the string gets longer, parsing takes a very long time and the application stops responding. This happens both in the LL Parser Debugger and in my application:

LUG["01,02,03,04,05,06,07,21,22,23,24,25,01,02,03,04,05,06,07,21,22,23,24,25,26,27,28,29,30",08]

What could be the reason for it?

Is there a limit on the number of characters in one token?

Is my question clear? Do you need further information?

Thanks for any help!!!

Cheers, Christel

Comments (3)

Posted 7 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Christel,

There shouldn't be any limits that you'd run into.  I'd expect what you wrote to fully work.

The first thing you should do is use the LL Parser Debugger and step into the grammar.  Walk through this simple example to see if it's getting stuck repeating anywhere infinitely.  Perhaps there is a problem in your grammar that's causing it.

If that doesn't help solve the problem, maybe build a new simple sample project with this sort of basic grammar example in it and make sure we can open it in the LL Parser Debugger, and email that to us.  Please reference this post if you do so.


Actipro Software Support

Posted 7 years ago by Christel

Hi,

I have already used the LL Parser Debugger and tried both versions of the lexical definitions. In both cases, parsing takes a very long time. After 10 minutes or so, the long string was parsed correctly.

I have now sent you a small project via email and hope you can help further.

Thanks!!

Christel

Answer - Posted 7 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Christel,

Thanks for the sample.  Since you saw it happening even in Live Test, the problem is the regex engine not being able to handle the complex regex patterns you are using in your lexer.  Namely, if you remove the ones in the .langproj file from LeistungsuntergruppeGOP through LeistungsuntergruppeGOP_1 (5 total) then everything seems fast.
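
One plausible reason such patterns get slow, sketched here under assumptions: in the pattern from the first post, the repeated group ([0-9]{1,2},?)+ can consume "01," either as one iteration or as two ("0" then "1,"), because the comma is optional. If the engine backtracks (an assumption; the thread does not say how the Actipro regex engine works internally), a pattern of this shape that ultimately fails on the input may have to try every such split. The Python snippet below does not run the Actipro engine. It only counts how many ways the number list can be split; `count_splits` is a hypothetical helper written for this illustration.

```python
import re
from functools import lru_cache

# One iteration of the repeated group from the original pattern.
PIECE = re.compile(r"[0-9]{1,2},?")


def count_splits(text: str) -> int:
    """Count the ways `text` can be cut into consecutive PIECE matches."""

    @lru_cache(maxsize=None)
    def ways(start: int) -> int:
        if start == len(text):
            return 1
        total = 0
        # A piece is at most 3 characters long: two digits plus a comma.
        for end in range(start + 1, min(start + 3, len(text)) + 1):
            if PIECE.fullmatch(text, start, end):
                total += ways(end)
        return total

    return ways(0)


# Number lists taken from the two example strings in the first post.
short_list = "01,02,03,04,05,06,07,21,22,23,24,25"
long_list = short_list + ",01,02,03,04,05,06,07,21,22,23,24,25,26,27,28,29,30"

print(count_splits(short_list))  # 4096 (2 ** 12)
print(count_splits(long_list))   # 536870912 (2 ** 29)
```

Each extra number doubles the count, so going from 12 to 29 numbers multiplies the worst-case work by about 131,000. That would be consistent with the short string parsing instantly and the long one taking minutes, although the sample project itself is needed to confirm it.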

You may need to find a way to reduce the complexity of those patterns, or if that isn't an option, switch to a hand-written programmatic lexer instead.
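
As a rough illustration of both options, here is a sketch in Python rather than in the Language Designer or the Actipro lexer API, neither of which is shown in this thread. `LUG_PATTERN` is a hypothetical rewrite in which the comma between numbers is mandatory, so a digit run can be split in only one way. `scan_lug` is a hypothetical hand-written scanner for the same token. Whether the Actipro regex syntax accepts the non-capturing group (?:...) is an assumption to check against its documentation.

```python
import re

DIGITS = "0123456789"

# Comma is required between numbers, which removes the ambiguity of
# the original ([0-9]{1,2},?)+ group.
LUG_PATTERN = re.compile(r'LUG\["([0-9]{1,2}(?:,[0-9]{1,2})*)",([0-9]{1,2})\]')


def scan_lug(text, pos=0):
    """Scan one LUG token; return (lug_numbers, lg_number, end) or None."""

    def number(i):
        # Read one or two digits starting at index i.
        j = i
        while j < len(text) and j - i < 2 and text[j] in DIGITS:
            j += 1
        return (text[i:j], j) if j > i else None

    if not text.startswith('LUG["', pos):
        return None
    i = pos + len('LUG["')
    numbers = []
    while True:
        item = number(i)
        if item is None:
            return None
        numbers.append(item[0])
        i = item[1]
        if text.startswith(",", i):
            i += 1  # another number must follow
        else:
            break
    if not text.startswith('",', i):
        return None
    item = number(i + 2)
    if item is None:
        return None
    lg, i = item
    if not text.startswith("]", i):
        return None
    return numbers, lg, i + 1


example = 'LUG["01,02,03,04,05,06,07,21,22,23,24,25",08]'
print(LUG_PATTERN.fullmatch(example).groups())
print(scan_lug(example))
```

The scanner reads the input once from left to right and never backtracks, so its running time grows linearly with the length of the token.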


Actipro Software Support

