Dynamic Lexical Macros
The built-in regular expression engine, when used with dynamic lexers, allows for the definition of macros that represent regular expression elements. These macros are valid for use in any regular expression within a dynamic lexer and promote reuse of common patterns.
Usage
Using defined macros is easy. To reference a macro, simply type its name within curly braces; e.g., {MacroName}.
This regular expression uses a macro that represents the character class [0-9] to build a decimal number regular expression.
{Digit}+ (\. {Digit}+)?
This regular expression builds a C# identifier using two macros.
(_ | {Alpha})({Word})*
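As a rough illustration of how macro references like the ones above could be resolved, the following Python sketch substitutes each {Name} reference with its definition before compiling the pattern. It is not the engine's actual implementation. The names `MACROS`, `expand_macros`, and `compile_pattern` are hypothetical, the macro definitions are ASCII-only approximations (Python's `re` module has no `\p{...}` classes), and the use of `re.VERBOSE` assumes that the spaces in the example patterns are insignificant.

```python
import re

# Hypothetical macro table: ASCII-only approximations of the built-in macros,
# because Python's re module does not support \p{...} Unicode categories.
MACROS = {
    "Digit": r"[0-9]",
    "Alpha": r"[A-Za-z]",
    "Word": r"[A-Za-z0-9_]",
}

# A macro reference is a name in curly braces. Quantifiers such as {2,3}
# are left alone because a macro name cannot start with a digit here.
MACRO_REF = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_macros(pattern, macros=MACROS):
    """Replace each {Name} reference with its definition."""
    def substitute(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError(f"undefined macro: {name}")
        return macros[name]
    return MACRO_REF.sub(substitute, pattern)


def compile_pattern(pattern):
    # Assumption: whitespace in the pattern is only there for readability,
    # so re.VERBOSE is used to ignore it.
    return re.compile(expand_macros(pattern), re.VERBOSE)


decimal = compile_pattern(r"{Digit}+ (\. {Digit}+)?")
identifier = compile_pattern(r"(_ | {Alpha})({Word})*")

print(bool(decimal.fullmatch("3.14")))        # decimal number with fraction
print(bool(identifier.fullmatch("_count1")))  # identifier starting with _
```

Expanding macros as plain text before compilation keeps the sketch short; a real engine may instead resolve them while parsing the pattern.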
Built-In Macros
The regular expression engine recognizes a number of built-in macros. If a dynamic lexer defines a lexical macro with the same key as a built-in lexical macro, the user's definition overrides the built-in definition.
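The override rule can be pictured as a simple dictionary merge in which user-defined keys win on collision. This Python sketch is only a model of that rule: `BUILT_IN_MACROS` and `resolve_macros` are hypothetical names, the definitions are ASCII-only approximations, and keys are assumed to be compared exactly (the text does not say whether matching is case-sensitive).

```python
# Hypothetical subset of the built-in macros (ASCII-only approximations).
BUILT_IN_MACROS = {
    "Digit": r"[0-9]",
    "HexDigit": r"[0-9a-fA-F]",
}


def resolve_macros(user_macros):
    """Merge user macros over the built-ins; user keys win on collision."""
    resolved = dict(BUILT_IN_MACROS)  # copy so the built-ins stay untouched
    resolved.update(user_macros)
    return resolved


# A lexer that redefines Digit (here, to exclude zero) and adds a new macro.
# HexDigit is not redefined, so its built-in definition is kept.
macros = resolve_macros({"Digit": r"[1-9]", "Sign": r"[+-]"})

for name in sorted(macros):
    print(name, macros[name])
```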
The following table summarizes all of the built-in macros:
Name | Description |
---|---|
{All} | Contains all Unicode characters. This is the same as [\u0000-\uFFFF]. |
{Alpha} | Contains all Unicode alphabetic characters. This is the same as \p{L} (all letters). |
{Digit} | Contains all Unicode decimal digits. This is the same as \d and \p{Nd}. |
{HexDigit} | Contains all Unicode hexadecimal digits. This is the same as [0-9a-fA-F]. |
{LineTerminator} | Contains all Unicode line terminators. This is the same as [\n\r\p{Zl}\p{Zp}]. |
{LineTerminatorWhitespace} | Contains all Unicode line terminators and whitespace characters. This is the same as \s or [\f\n\r\t\v\x85\p{Z}]. |
{NonAlpha} | Contains the inverse of Alpha. |
{NonDigit} | Contains the inverse of Digit. |
{None} | Contains no characters. |
{NonHexDigit} | Contains the inverse of HexDigit. |
{NonLineTerminator} | Contains the inverse of LineTerminator. |
{NonLineTerminatorWhitespace} | Contains the inverse of LineTerminatorWhitespace. |
{NonWhitespace} | Contains the inverse of Whitespace. |
{NonWord} | Contains the inverse of Word. |
{Whitespace} | Contains all Unicode whitespace characters. This is the same as [\f\t\v\x85\p{Zs}]. |
{Word} | Contains all Unicode word characters. This is the same as \w and [\p{L}\p{Nd}\p{Pc}] (all letters, decimal digits, and connectors like underscore). |
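The difference between Whitespace, LineTerminator, and LineTerminatorWhitespace is easy to miss in the table: Whitespace excludes line terminators, while LineTerminatorWhitespace includes both. The Python sketch below approximates the three classes to make this visible. It is an illustration only: Python's `re` module lacks `\p{...}` support, so the Zs, Zl, and Zp categories are spelled out by hand, and the names `ZS`, `CLASSES`, and `classify` are hypothetical.

```python
import re

# Characters in Unicode category Zs (space separators), written out because
# Python's re module does not support \p{Zs}.
ZS = r" \u00A0\u1680\u2000-\u200A\u202F\u205F\u3000"

# Approximations of three built-in macros, following the table's definitions.
# \u2028 and \u2029 stand in for \p{Zl} and \p{Zp}.
CLASSES = {
    "Whitespace": re.compile(rf"[\f\t\v\x85{ZS}]"),
    "LineTerminator": re.compile(r"[\n\r\u2028\u2029]"),
    "LineTerminatorWhitespace": re.compile(
        rf"[\f\n\r\t\v\x85\u2028\u2029{ZS}]"
    ),
}


def classify(ch):
    """Return the names of the approximated macros that match one character."""
    return [name for name, rx in CLASSES.items() if rx.fullmatch(ch)]


print(classify("\t"))  # a tab is whitespace but not a line terminator
print(classify("\n"))  # a line feed is a line terminator but not Whitespace
print(classify("a"))   # a letter belongs to none of the three classes
```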