Tokens

A token is a span of text within a snapshot that has lexical parse data associated with it. Tokens are used throughout the product for parsing, classification, and text scanning.

Token Basic Properties

All tokens are represented by classes implementing the IToken interface. This interface defines properties that fall into two categories: identification and location.

Identification properties provide the most basic data for identifying what a token represents. The Id and Key properties are used for this purpose. The Id property is an integer and the Key property is a String. Either or both can be used to identify the token, although Id is preferred because integer comparisons are faster than String comparisons.
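The two forms of identification can be illustrated with a small, language-neutral sketch in Python. Everything in it is a stand-in rather than the product's API: the Token class, the ToyTokenId constants, and the key strings are hypothetical names chosen for the example, and real ids and keys are defined by each language.

```python
from dataclasses import dataclass


class ToyTokenId:
    """Hypothetical integer ids for a toy language (assumption for this sketch)."""
    IDENTIFIER = 1
    NUMBER = 2
    WHITESPACE = 3


@dataclass(frozen=True)
class Token:
    """Stand-in carrying only the two identification properties."""
    id: int   # plays the role of the Id property
    key: str  # plays the role of the Key property


tokens = [
    Token(ToyTokenId.IDENTIFIER, "Identifier"),
    Token(ToyTokenId.WHITESPACE, "Whitespace"),
    Token(ToyTokenId.NUMBER, "Number"),
]


def significant(items):
    """Drop whitespace tokens by comparing integer ids."""
    return [t for t in items if t.id != ToyTokenId.WHITESPACE]


def significant_by_key(items):
    """Same filter, but comparing string keys instead."""
    return [t for t in items if t.key != "Whitespace"]


# Both filters select the same tokens; only the comparison differs.
assert significant(tokens) == significant_by_key(tokens)
```

Code that inspects many tokens, such as a parser loop, would normally use the id form and keep keys for display or diagnostics.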

Location properties give information about the size and location of the token within its text source. The StartOffset and EndOffset properties specify where the token starts and ends. The end offset is the offset after the last character contained by the token, so it is exclusive. The Length property gives the number of characters in the token, which equals the end offset minus the start offset. The StartPosition and EndPosition properties provide the TextPosition objects that correspond to those offsets. The TextRange and PositionRange properties provide the related ranges.
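The relationship between the start offset, the exclusive end offset, and the length can be sketched as follows. This is an illustrative Python model, not the product's types: TokenSpan and offset_to_position are hypothetical names, and the zero-based (line, character) pair is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TokenSpan:
    """Stand-in for a token's location data (not the product's IToken)."""
    start_offset: int  # offset of the first character in the token
    end_offset: int    # offset just past the last character (exclusive)

    @property
    def length(self) -> int:
        # With an exclusive end offset, length is a plain difference.
        return self.end_offset - self.start_offset

    def contains(self, offset: int) -> bool:
        # The end offset itself is not part of the token.
        return self.start_offset <= offset < self.end_offset


def offset_to_position(text: str, offset: int) -> Tuple[int, int]:
    """Convert a zero-based offset into a zero-based (line, character) pair."""
    line = text.count("\n", 0, offset)
    line_start = text.rfind("\n", 0, offset) + 1
    return line, offset - line_start


text = "int x;\nx = 42;"
token = TokenSpan(start_offset=11, end_offset=13)  # covers the literal "42"

token_text = text[token.start_offset:token.end_offset]
start_position = offset_to_position(text, token.start_offset)
end_position = offset_to_position(text, token.end_offset)
```

Here the token text is "42", its length is 2, and the start and end positions fall on the second line at characters 4 and 6. Offset 13 is the token's end offset but is not contained by the token.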

Note

See the Offsets, Ranges, and Positions topic for more information on offsets and text positions.

Mergable Tokens

Mergable tokens are represented by classes implementing the IMergableToken interface, which inherits the IToken interface. They are the type of token created by a mergable lexer, meaning a lexer for one language that can merge with another language's mergable lexer. This is commonly seen in languages like HTML, where multiple child languages such as CSS and JavaScript can be used.

Mergable tokens provide more information about how the token was created, such as its owner ILexicalState, owner ILexicalScope (if appropriate), and the root IMergableLexer.

Another key feature provided by the IMergableToken interface is the IClassificationType associated with the token, available via the IMergableToken.ClassificationType property. A classification type is a logical category such as Identifier, Comment, or Whitespace.
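As a rough illustration of how the extra data on a mergable token might be consumed, the Python sketch below selects token text by classification type in a snippet where a child language is embedded in a parent. All names are stand-ins and assumptions: MergableTokenSketch, the lexical state strings, and the offsets were written by hand for the example, and nothing here is produced by a real lexer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassificationType:
    """Stand-in for a logical category such as Identifier or Comment."""
    key: str


@dataclass(frozen=True)
class MergableTokenSketch:
    """Stand-in for a mergable token (not the product's IMergableToken)."""
    id: int
    start_offset: int
    end_offset: int            # exclusive
    lexical_state: str         # stand-in for the owner lexical state
    classification_type: ClassificationType


COMMENT = ClassificationType("Comment")
IDENTIFIER = ClassificationType("Identifier")
WHITESPACE = ClassificationType("Whitespace")

# CSS content embedded in an HTML style element; only the CSS tokens are listed.
text = "<style>/* x */ p</style>"
tokens = [
    MergableTokenSketch(1, 7, 14, "CssDefault", COMMENT),
    MergableTokenSketch(2, 14, 15, "CssDefault", WHITESPACE),
    MergableTokenSketch(3, 15, 16, "CssDefault", IDENTIFIER),
]


def text_for(classification, source, items):
    """Return the text of every token that has the given classification type."""
    return [
        source[t.start_offset:t.end_offset]
        for t in items
        if t.classification_type == classification
    ]


comments = text_for(COMMENT, text, tokens)
identifiers = text_for(IDENTIFIER, text, tokens)
```

A consumer such as a syntax highlighter can work from the classification type alone, without knowing which language's lexer produced each token.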

Token Classes

If you wish to create a bare-bones class that implements IToken for your language, and your language's lexer is not mergable, you can inherit the TokenBase class. It is abstract and only requires that you implement its Id property and, optionally, the Key property.
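The pattern of an abstract base with one required member and one optional member can be sketched in Python as shown below. This models the idea only and is not the TokenBase class itself: TokenBaseSketch, its constructor arguments, and the default key of None are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Optional


class TokenBaseSketch(ABC):
    """Stand-in abstract base: location is stored here, identity is left to subclasses."""

    def __init__(self, start_offset: int, length: int) -> None:
        self.start_offset = start_offset
        self.length = length

    @property
    def end_offset(self) -> int:
        # Exclusive end offset derived from start and length.
        return self.start_offset + self.length

    @property
    @abstractmethod
    def id(self) -> int:
        """Required: every concrete token must report an integer id."""

    @property
    def key(self) -> Optional[str]:
        """Optional: subclasses may override this to supply a string key."""
        return None


class NumberToken(TokenBaseSketch):
    """Hypothetical concrete token for a numeric literal."""

    @property
    def id(self) -> int:
        return 2

    @property
    def key(self) -> Optional[str]:
        return "Number"


token = NumberToken(start_offset=11, length=2)
```

Because id is abstract, the base class cannot be instantiated directly, while a subclass that implements only id still works and simply reports no key.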

For languages that use mergable lexers, you can use the non-abstract MergableToken class for your tokens. Typically you do not need to create these yourself, since the MergableLexerBase class, which most IMergableLexer implementations inherit, automatically creates MergableToken instances for you.
