byte in LexicalStateAndIDTokenLexicalParseData constructor

Posted 19 years ago by Jared Phelps

Hello-
Hopefully this is a quick one... Why is the second parameter in the constructor of the LexicalStateAndIDTokenLexicalParseData class a byte rather than an int? I assume from its usage in the Simple language that it's meant to correspond to a token ID, and token IDs are ints everywhere else that I can see.

I don't suppose most people have more than 256 token types, but I like to make my token IDs powers of two, which lets me use bit masks to determine what kind of token I have (whitespace, comment, significant, etc.) conveniently at design time and very quickly at run time.
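
For illustration, here is a minimal sketch of the power-of-two scheme described above. It is written in Java rather than .NET, and every name in it (TokenIds, the individual constants, the helper methods) is hypothetical and not part of the SyntaxEditor API. With one bit per token type, an unsigned 8-bit ID has room for only eight types, so the ninth flag below no longer fits.

```java
// Hypothetical token IDs, one bit each. None of these names come from SyntaxEditor.
final class TokenIds {
    static final int WHITESPACE    = 1 << 0;
    static final int LINE_COMMENT  = 1 << 1;
    static final int BLOCK_COMMENT = 1 << 2;
    static final int IDENTIFIER    = 1 << 3;
    static final int NUMBER        = 1 << 4;
    static final int KEYWORD       = 1 << 8; // 256: the ninth bit, too wide for 8 bits

    // Category masks, combined once at design time.
    static final int COMMENT = LINE_COMMENT | BLOCK_COMMENT;
    static final int TRIVIA  = WHITESPACE | COMMENT;

    // Each check is a single AND and a compare at run time.
    static boolean isComment(int id) {
        return (id & COMMENT) != 0;
    }

    static boolean isTrivia(int id) {
        return (id & TRIVIA) != 0;
    }

    static boolean isSignificant(int id) {
        return id != 0 && (id & TRIVIA) == 0;
    }

    // True when the ID can be stored in an unsigned 8-bit value (0..255).
    static boolean fitsInUnsignedByte(int id) {
        return (id & ~0xFF) == 0;
    }
}
```

For example, TokenIds.isTrivia(TokenIds.LINE_COMMENT) is true, while TokenIds.fitsInUnsignedByte(TokenIds.KEYWORD) is false.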

Thanks!
Jared

Comments (3)

Posted 19 years ago by Actipro Software Support - Cleveland, OH, USA

On IToken we define the ID as an int so that languages that need a large range of token IDs can have one. However, to save memory, we use bytes in our own implementations of token parse data.

If you send us an email, we can send you the source code for it so that you can create your own class that does the same thing but uses an int instead of a byte.

Actipro Software Support

Posted 19 years ago by Russell Mason

Hi

On a related subject...

What is the relationship between IToken and ITokenLexicalParseData? If I understand it, ITokenLexicalParseData just associates user-specific data with the token. So why is this separate? Why is the data not put in a derived Token class? At the risk of answering my own question (or looking foolish), is this so that you can assign different types of parse data to the same token type?

Thanks
Russell Mason

Posted 19 years ago by Actipro Software Support - Cleveland, OH, USA

Yes, you have it right. The mergable languages required a way to pass lexical parse data through so that it could be assigned to tokens, which is what ITokenLexicalParseData is for. However, if you are making a language that doesn't need to be mergable, it's more efficient to inherit your tokens from TokenBase and not deal with ITokenLexicalParseData at all. You can store a single byte or int field that uses, say, a couple of bits for the lexical state number and the rest of the bits for the token ID. That is the most compact way to store lexical parse data for non-mergable languages.
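
As a rough sketch of that packed-field idea (in Java, for illustration only): the class name, the method names, and the 4-bit/28-bit split below are all assumptions and are not taken from TokenBase or any other SyntaxEditor type.

```java
// Hypothetical helper: packs a lexical state number and a token ID into one int.
final class PackedParseData {
    static final int STATE_BITS = 4;                              // assumed width
    static final int STATE_MASK = (1 << STATE_BITS) - 1;          // low 4 bits
    static final int MAX_TOKEN_ID = (1 << (32 - STATE_BITS)) - 1; // remaining 28 bits

    // Token ID goes in the high bits, lexical state in the low bits.
    static int pack(int lexicalState, int tokenId) {
        if (lexicalState < 0 || lexicalState > STATE_MASK) {
            throw new IllegalArgumentException("lexical state out of range: " + lexicalState);
        }
        if (tokenId < 0 || tokenId > MAX_TOKEN_ID) {
            throw new IllegalArgumentException("token ID out of range: " + tokenId);
        }
        return (tokenId << STATE_BITS) | lexicalState;
    }

    static int lexicalState(int packed) {
        return packed & STATE_MASK;
    }

    // Unsigned shift so that the largest token ID does not come back negative.
    static int tokenId(int packed) {
        return packed >>> STATE_BITS;
    }
}
```

A token class would then hold only the packed int, and calls such as PackedParseData.tokenId(packed) recover each part on demand. The split between state bits and ID bits is a trade-off to choose per language.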

Actipro Software Support

