Regular Expressions 

Regular expressions are search patterns that can perform advanced pattern matching and validation.
The following table defines the metacharacters of the regular expression language:
Character 
Definition 
Pattern 
Sample Matches 
^ 
Start of a string. 
^abc 
abc, abcdefg, abc123, ... 
$ 
End of a string. 
abc$ 
abc, endsinabc, 123abc, ... 
. 
Any character (except \n newline) 
a.c 
abc, aac, acc, adc, aec, ... 
| 
Alternation. 
bill|ted 
ted, bill 
{...} 
Explicit quantifier notation. 
ab{2}c 
abbc 
[...] 
Explicit set of characters to match. 
a[bB]c 
abc, aBc 
(...) 
Logical grouping of part of an expression. 
(abc){2} 
abcabc 
* 
0 or more of previous expression. 
ab*c 
ac, abc, abbc, abbbc, ... 
+ 
1 or more of previous expression. 
ab+c 
abc, abbc, abbbc, ... 
? 
0 or 1 of previous expression. Placed after another quantifier (as in *? or +?), it makes that quantifier match as little as possible instead of as much as possible. 
ab?c 
ac, abc 
\ 
Before one of the characters above, makes it a literal instead of a special character. Before certain other characters, forms an escape sequence or character class (see the tables below). 
a\sc 
a c 
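The rows above can be checked with a short script. This is a minimal sketch that assumes Python's re module, which is not necessarily the engine this page documents; the helper name full and the non-matching strings are illustrative only. The patterns used here behave the same way in most regular expression flavors.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# Anchors are checked with search(), which scans the whole string.
starts_with_abc = re.search(r"^abc", "abcdefg") is not None
ends_with_abc = re.search(r"abc$", "endsinabc") is not None
not_at_start = re.search(r"^abc", "123abc") is None

# One entry per table row: pattern, a string it matches, a string it rejects.
ROWS = [
    (r"a.c", "adc", "a\nc"),         # . does not match a newline
    (r"bill|ted", "ted", "bob"),     # alternation
    (r"ab{2}c", "abbc", "abc"),      # exactly two b's
    (r"a[bB]c", "aBc", "adc"),       # explicit character set
    (r"(abc){2}", "abcabc", "abc"),  # group repeated twice
    (r"ab*c", "ac", "adc"),          # zero or more b's
    (r"ab+c", "abbbc", "ac"),        # one or more b's
    (r"ab?c", "abc", "abbc"),        # zero or one b
    (r"a\sc", "a c", "abc"),         # \s is a whitespace character
]
results = [(full(p, good), full(p, bad)) for p, good, bad in ROWS]

# A ? placed after a quantifier makes it lazy: match as little as possible.
greedy = re.search(r"<.+>", "<a><b>").group()
lazy = re.search(r"<.+?>", "<a><b>").group()
```

Every pair in results is (True, False). The last two lines show the second role of ?: the greedy pattern consumes the whole string "<a><b>", while the lazy one stops at the first closing bracket and returns "<a>".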
The following table contains the escape sequences used in authoring regular expressions:
Character 
Description 
ordinary characters 
Characters other than . $ ^ { [ ( | ) ] } * + ? \ match themselves. 
\a 
Matches a bell (alarm) \u0007. 
\b 
Matches a backspace \u0008 if in a []; otherwise matches a word boundary (between \w and \W characters). 
\t 
Matches a tab \u0009. 
\r 
Matches a carriage return \u000D. 
\v 
Matches a vertical tab \u000B. 
\f 
Matches a form feed \u000C. 
\n 
Matches a new line \u000A. 
\e 
Matches an escape \u001B. 
\040 
Matches an ASCII character as octal (up to three digits); numbers with no leading zero are backreferences if they have only one digit or if they correspond to a capturing group number. For example, the character \040 represents a space. 
\x20 
Matches an ASCII character using hexadecimal representation (exactly two digits). 
\cC 
Matches an ASCII control character; for example, \cC is Control-C. 
\u0020 
Matches a Unicode character using a hexadecimal representation (exactly four digits). 
\* 
When followed by a character that is not recognized as an escaped character, matches that character. For example, \* is the same as \x2A. 
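Several of these escapes can be tried directly. The sketch below assumes Python's re module rather than the engine documented here; Python accepts the octal, hexadecimal, and \u forms but does not accept \e or \cC, so those rows are left out. The variable names are illustrative.

```python
import re

# Three spellings of the space character (U+0020): octal, hex, Unicode.
space_forms = [r"\040", r"\x20", r"\u0020"]
all_match_space = all(re.fullmatch(p, " ") for p in space_forms)

# Escaping a metacharacter makes it literal; \* and \x2A both match "*".
star_literal = re.fullmatch(r"\*", "*") is not None
star_hex = re.fullmatch(r"\x2A", "*") is not None

# Outside [] the sequence \b is a word boundary ...
whole_words = re.findall(r"\bcat\b", "cat concat cat")

# ... and inside [] it is a backspace (U+0008).
backspace = re.fullmatch(r"[\b]", "\u0008") is not None
```

The word-boundary search finds only the two standalone occurrences of "cat" and skips the one inside "concat", which shows why the position of \b relative to [] matters.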
The following table contains character classes used in regular expressions:
Char Class 
Description 
. 
Matches any character except \n. If modified by the Single line option, a period character matches any character. For more information, see Regular Expression Options. 
[aeiou] 
Matches any single character included in the specified set of characters. 
[^aeiou] 
Matches any single character not in the specified set of characters. 
[0-9a-fA-F] 
Use of a hyphen (-) allows specification of contiguous character ranges. 
\p{name} 
Matches any character in the named character class specified by {name}. Supported names are Unicode groups and block ranges. For example, Ll, Nd, Z, IsGreek, IsBoxDrawing. 
\P{name} 
Matches text not included in groups and block ranges specified in {name}. 
\w 
Matches any word character. Equivalent to the Unicode character categories [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \w is equivalent to [a-zA-Z_0-9]. 
\W 
Matches any non-word character. Equivalent to the Unicode categories [^\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \W is equivalent to [^a-zA-Z_0-9]. 
\s 
Matches any whitespace character. Equivalent to the Unicode character categories [\f\n\r\t\v\x85\p{Z}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \s is equivalent to [ \f\n\r\t\v]. 
\S 
Matches any non-whitespace character. Equivalent to the Unicode character categories [^\f\n\r\t\v\x85\p{Z}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \S is equivalent to [^ \f\n\r\t\v]. 
\d 
Matches any decimal digit. Equivalent to \p{Nd} for Unicode and [0-9] for non-Unicode, ECMAScript behavior. 
\D 
Matches any non-digit. Equivalent to \P{Nd} for Unicode and [^0-9] for non-Unicode, ECMAScript behavior. 
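The difference between Unicode and ASCII-only class behavior can be seen in a short experiment. This sketch assumes Python's re module, where the re.ASCII flag plays a role loosely similar to the ECMAScript option described above and re.DOTALL is loosely similar to the Single line option; these are analogies, not the same options. Python's re has no \p{name} support, so those rows are not shown.

```python
import re

# Positive and negative sets.
vowels = re.findall(r"[aeiou]", "regex")
consonants = re.findall(r"[^aeiou]", "regex")

# A hyphen inside [] gives a contiguous range.
is_hex = re.fullmatch(r"[0-9a-fA-F]+", "1fA9") is not None
not_hex = re.fullmatch(r"[0-9a-fA-F]+", "1fG9") is None

# \d covers every Unicode decimal digit by default, only 0-9 under re.ASCII.
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
digit_unicode = re.fullmatch(r"\d", arabic_three) is not None
digit_ascii = re.fullmatch(r"\d", arabic_three, re.ASCII) is not None

# \w behaves the same way for letters outside a-z and A-Z.
word = "na\u00efve"  # contains U+00EF, a lowercase letter
word_unicode = re.fullmatch(r"\w+", word) is not None
word_ascii = re.fullmatch(r"\w+", word, re.ASCII) is not None

# The period skips \n unless a "dot matches all" option is set.
dot_default = re.fullmatch(r"a.c", "a\nc") is not None
dot_all = re.fullmatch(r"a.c", "a\nc", re.DOTALL) is not None
```

When validating input such as numeric IDs, the ASCII-only forms (or an explicit [0-9]) are usually the safer choice, because the Unicode-aware \d accepts digits from other scripts.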
Copyright © 2024 pasUNITY, Inc.