Dynamic Lexical Macros
The built-in regular expression engine, when used with dynamic lexers, allows for the definition of macros that represent regular expression elements. These macros are valid for use in any regular expression within a dynamic lexer and promote reuse of common patterns.
Usage
To reference a defined macro, type its name within curly braces ({ ... }).
This regular expression uses a macro that represents the character class [0-9] to build a decimal number regular expression.
{Digit}+ (\. {Digit}+)?
This regular expression uses two macros to match a C# identifier.
(_ | {Alpha})({Word})*
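To make the substitution concrete, here is a minimal Python sketch of how {Name} references might be expanded before a pattern is compiled. It is not the engine's implementation: the MACROS table and the expand_macros helper are hypothetical, the macro definitions are ASCII-only stand-ins for the Unicode-aware built-ins, and the sketch assumes the spaces in the two examples above are only there for readability (hence re.VERBOSE).

```python
import re

# Hypothetical ASCII-only stand-ins for three built-in macros.
# The real built-in definitions are Unicode-aware.
MACROS = {
    "Digit": "[0-9]",
    "Alpha": "[A-Za-z]",
    "Word": "[A-Za-z0-9_]",
}

# A macro reference is a name wrapped in curly braces, e.g. {Digit}.
MACRO_REF = re.compile(r"\{([A-Za-z]+)\}")


def expand_macros(pattern, macros=MACROS):
    """Return the pattern with every {Name} reference replaced by its definition."""
    def substitute(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError(f"Undefined macro: {name}")
        return macros[name]

    return MACRO_REF.sub(substitute, pattern)


# Assumption: whitespace in the documented patterns is insignificant,
# so the expanded patterns are compiled with re.VERBOSE.
number = re.compile(expand_macros(r"{Digit}+ (\. {Digit}+)?"), re.VERBOSE)
identifier = re.compile(expand_macros(r"(_ | {Alpha})({Word})*"), re.VERBOSE)
```

With these stand-in definitions, number.fullmatch("3.14") succeeds while number.fullmatch("3.") does not, and identifier.fullmatch("_count1") succeeds while identifier.fullmatch("1abc") does not.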
Built-In Macros
The regular expression engine recognizes a number of built-in macros. If a dynamic lexer defines a lexical macro with the same key as a built-in lexical macro, the user's definition overrides the system definition.
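The override rule can be pictured as a two-level lookup in which user definitions are consulted first. The Python sketch below is illustrative only: BUILT_IN_MACROS and resolve_macro are hypothetical names, not part of the product's API, and only two built-in definitions are reproduced.

```python
# Two built-in definitions, written as their documented equivalents.
BUILT_IN_MACROS = {
    "Digit": r"\d",
    "HexDigit": "[0-9a-fA-F]",
}


def resolve_macro(key, user_macros, built_in=BUILT_IN_MACROS):
    """Look up a macro key, preferring the lexer's own definition."""
    if key in user_macros:
        return user_macros[key]  # user definition overrides the built-in
    return built_in[key]         # otherwise fall back to the system definition


# A lexer that narrows Digit to ASCII digits and adds its own macro.
user_macros = {"Digit": "[0-9]", "Sign": "[+-]"}

digit = resolve_macro("Digit", user_macros)    # user definition
hex_digit = resolve_macro("HexDigit", user_macros)  # built-in definition
```

Here digit resolves to the user's "[0-9]" while hex_digit still resolves to the built-in "[0-9a-fA-F]", because the lexer did not redefine HexDigit.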
The following table summarizes all of the built-in macros:
Name | Description |
---|---|
All | Contains all Unicode characters. This is the same as [\u0000-\uFFFF]. |
Alpha | Contains all Unicode alphabetic characters. This is the same as \p{L} (all letters). |
Digit | Contains all Unicode decimal digits. This is the same as \d and \p{Nd}. |
HexDigit | Contains all Unicode hexadecimal digits. This is the same as [0-9a-fA-F]. |
LineTerminator | Contains all Unicode line terminators. This is the same as [\n\r\p{Zl}\p{Zp}]. |
LineTerminatorWhitespace | Contains all Unicode line terminators and whitespace characters. This is the same as \s or [\f\n\r\t\v\x85\p{Z}]. |
NonAlpha | Contains the inverse of Alpha. |
NonDigit | Contains the inverse of Digit. |
None | Contains no characters. |
NonHexDigit | Contains the inverse of HexDigit. |
NonLineTerminator | Contains the inverse of LineTerminator. |
NonLineTerminatorWhitespace | Contains the inverse of LineTerminatorWhitespace. |
NonWhitespace | Contains the inverse of Whitespace. |
NonWord | Contains the inverse of Word. |
Whitespace | Contains all Unicode whitespace characters. This is the same as [\f\t\v\x85\p{Zs}]. |
Word | Contains all Unicode word characters. This is the same as \w and [\p{L}\p{Nd}\p{Pc}] (all letters, decimal digits, and connectors like underscore). |
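
For readers who want to check which characters fall into a given class, the table's definitions can be approximated with Python's standard unicodedata module. This is a sketch under the assumption that the Unicode general categories (L*, Nd, Pc, Zs, Zl, Zp) named in the table map directly onto unicodedata.category; the function names are hypothetical, and results can vary slightly with the Unicode version bundled with the interpreter.

```python
import unicodedata


def is_word(ch):
    """Word: letters (\\p{L}), decimal digits (\\p{Nd}), connectors (\\p{Pc})."""
    category = unicodedata.category(ch)
    return category.startswith("L") or category in ("Nd", "Pc")


def is_line_terminator(ch):
    """LineTerminator: [\\n\\r\\p{Zl}\\p{Zp}]."""
    return ch in "\n\r" or unicodedata.category(ch) in ("Zl", "Zp")


def is_whitespace(ch):
    """Whitespace: [\\f\\t\\v\\x85\\p{Zs}], which excludes line terminators."""
    return ch in "\f\t\v\x85" or unicodedata.category(ch) == "Zs"


def is_non_word(ch):
    """NonWord: the inverse of Word."""
    return not is_word(ch)
```

Under these definitions an underscore counts as a word character (category Pc) while a hyphen does not (category Pd), and "\n" is a line terminator but not a member of Whitespace.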