Data Engineering Integration
- Data Engineering Integration 10.2.1
Metacharacter
| Description
|
---|---|
.
| Matches any single character.
|
[ ]
| Indicates a character class. Matches any character inside the brackets. For example, [abc] matches “a,” “b,” and “c.”
|
^
| If this metacharacter occurs at the start of a character class, it negates the character class. A negated character class matches any character except those inside the brackets. For example, [^abc] matches all characters except “a,” “b,” and “c.”
If this metacharacter occurs at the beginning of the regular expression, it matches the beginning of the input. For example, ^[abc] matches the input that begins with “a,” “b,” or “c.”
|
-
| Indicates a range of characters in a character class. For example, [0-9] matches any of the digits “0” through “9.”
|
?
| Indicates that the preceding expression is optional. It matches the preceding expression zero or one time. For example, [0-9][0-9]? matches “2” and “12.”
|
+
| Indicates that the preceding expression matches one or more times. For example, [0-9]+ matches “1,” “13,” “666,” and similar combinations.
|
*
| Indicates that the preceding expression matches zero or more times. For example, <abc*> matches “<ab>,” “<abc>,” “<abcc>,” and similar combinations in which “c” repeats zero or more times, because * applies only to the preceding expression, “c.”
|
??, +?, *?
| Non-greedy versions of ?, +, and *. These match as little as possible, unlike the greedy versions, which match as much as possible. For example, given the input “<abc><def>,” <.*?> matches “<abc>” and <.*> matches “<abc><def>.”
|
( )
| Grouping operator. For example, (\d+,)*\d+ matches a list of numbers separated by commas such as “1” or “1,23,456.”
|
{ }
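| Indicates a match group. The text matched by the expression inside the braces can be referenced later in the regular expression with \ followed by the group number.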
|
\
| An escape character, which interprets the next metacharacter literally. For example, [0-9]+ matches one or more digits, but [0-9]\+ matches a digit followed by a plus character. Also used for abbreviations such as \a for any alphanumeric character.
If \ is followed by a number n, it matches the nth match group, starting from 0. For example, <{.*?}>.*?</\0> matches “<head>Contents</head>.”
In C++ string literals, two backslashes must be used: “\\+,” “\\a,” “<{.*?}>.*?</\\0>.”
|
$
| At the end of a regular expression, this character matches the end of the input. For example, [0-9]$ matches a digit at the end of the input.
|
|
| Alternation operator that separates two expressions, one of which matches. For example, T|the matches “The” or “the.”
|
!
| Negation operator. The expression following ! does not match the input. For example, a!b matches “a” not followed by “b.”
|
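Several of the examples in the table above can be checked with a short script. The following is a rough sketch that assumes Python's built-in `re` module, whose dialect is not the one this table describes: Python captures groups with `( )` rather than `{ }`, numbers back-references from 1 rather than 0, and has no `!` negation operator. Only the metacharacters that both dialects share are shown, and all variable names are illustrative.

```python
import re

# Optional (?), one-or-more (+), and zero-or-more (*) quantifiers.
optional = [s for s in ["2", "12", "123"] if re.fullmatch(r"[0-9][0-9]?", s)]
one_or_more = [s for s in ["1", "13", "666", ""] if re.fullmatch(r"[0-9]+", s)]
zero_or_more = [s for s in ["<ab>", "<abc>", "<abcc>"] if re.fullmatch(r"<abc*>", s)]

# Greedy versus non-greedy matching on the same input.
text = "<abc><def>"
greedy = re.search(r"<.*>", text).group(0)
lazy = re.search(r"<.*?>", text).group(0)

# Grouping: a list of numbers separated by commas.
number_list = re.compile(r"(\d+,)*\d+")
lists = [s for s in ["1", "1,23,456", "1,,2"] if number_list.fullmatch(s)]

# Anchors (^ and $) and the escape character (\).
starts_with_abc = bool(re.search(r"^[abc]", "cat"))
ends_with_digit = bool(re.search(r"[0-9]$", "room 7"))
digit_plus = re.search(r"[0-9]\+", "3+4").group(0)

# Back-reference. Python equivalent of <{.*?}>.*?</\0>:
# the group is written with ( ) and referenced as \1.
tag = re.fullmatch(r"<(.*?)>.*?</\1>", "<head>Contents</head>")

print(optional, one_or_more, zero_or_more)
print(greedy, lazy)
print(lists, starts_with_abc, ends_with_digit, digit_plus, tag.group(1))
```

The raw-string prefix `r"..."` plays the same role as doubling the backslashes in a C++ string literal: it keeps the language from consuming the backslash before the regular expression engine sees it.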
Abbreviation
| Definition
|
---|---|
\a
| Any alphanumeric character, ([a-zA-Z0-9]).
|
\b
| White space (blank), ([ \t]).
|
\c
| Any alphabetic character, ([a-zA-Z]).
|
\d
| Any decimal digit, ([0-9]).
|
\h
| Any hexadecimal digit, ([0-9a-fA-F]).
|
\n
| Newline, (\r|(\r?\n)).
|
\q
| Quoted string, (\"[^\"]*\")|(\'[^\']*\').
|
\w
| Simple word, ([a-zA-Z]+).
|
\z
| Integer, ([0-9]+).
|
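Because each abbreviation is shorthand for a longer expression, a pattern that uses abbreviations can be rewritten in terms of their definitions. The sketch below illustrates this with Python's `re` module, under two assumptions: \z is taken to mean one or more digits, and the quoted-string definition uses straight quotation marks. Several of these abbreviations mean something different in Python itself (for example, `\b` is a word boundary and `\w` is a single word character), so the hypothetical helper `expand` substitutes the definitions from the table before compiling. It is deliberately naive and does not handle abbreviations inside character classes.

```python
import re

# Definitions from the abbreviation table, written as Python patterns.
ABBREVIATIONS = {
    r"\a": r"[a-zA-Z0-9]",            # any alphanumeric character
    r"\b": r"[ \t]",                  # white space (blank)
    r"\c": r"[a-zA-Z]",               # any alphabetic character
    r"\d": r"[0-9]",                  # any decimal digit
    r"\h": r"[0-9a-fA-F]",            # any hexadecimal digit
    r"\n": r"\r|(\r?\n)",             # newline
    r"\q": r"""("[^"]*")|('[^']*')""",  # quoted string
    r"\w": r"[a-zA-Z]+",              # simple word
    r"\z": r"[0-9]+",                 # integer (assumed: one or more digits)
}


def expand(pattern: str) -> str:
    """Replace each two-character abbreviation with its definition."""
    out = []
    i = 0
    while i < len(pattern):
        token = pattern[i:i + 2]
        if token in ABBREVIATIONS:
            # Wrap in a non-capturing group so quantifiers apply to the whole definition.
            out.append("(?:" + ABBREVIATIONS[token] + ")")
            i += 2
        elif pattern[i] == "\\" and i + 1 < len(pattern):
            out.append(token)  # keep other escapes, such as \+, unchanged
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


print(expand(r"\z-\h\h"))
print(bool(re.fullmatch(expand(r"\w\b\z"), "room 7")))
```

For instance, `expand(r"\w\b\z")` produces a pattern that matches a simple word, one blank, and an integer, such as “room 7.”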