A regular expression, also known as a regex or regexp, is a string whose pattern (template) describes a set of strings. The pattern determines which strings belong to the set. A pattern consists of literal characters and metacharacters, which are characters that have special meaning instead of a literal meaning.
Matching Symbols
Regular Expression | Description |
---|---|
`.` | Matches any character. |
`^regex` | Finds regex that must match at the beginning of the line. |
`regex$` | Finds regex that must match at the end of the line. |
`[abc]` | Set definition, can match the letter a or b or c. |
`[abc][vz]` | Set definition, can match a or b or c followed by either v or z. |
`[^abc]` | When a caret appears as the first character inside square brackets, it negates the pattern. This pattern matches any character except a or b or c. |
`[a-d1-7]` | Ranges: matches a single character that is either a letter between a and d or a digit from 1 to 7, but not the two-character string d1. |
`X\|Z` | Finds X or Z. |
`XZ` | Finds X directly followed by Z. |
`$` | Checks if a line end follows. |
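
The matching symbols above can be tried out in a few lines. The text does not name a regex engine, so the following is only a minimal sketch that assumes Python's standard `re` module; the sample strings are hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "bz is not cat"

# '.' stands for any single character, so "c.t" finds "cat".
any_char = re.findall(r"c.t", text)

# '^' anchors the match at the beginning, '$' at the end.
starts_with_bz = re.search(r"^bz", text) is not None
ends_with_cat = re.search(r"cat$", text) is not None

# [abc][vz]: one of a, b, c directly followed by v or z.
set_pair = re.findall(r"[abc][vz]", text)

# [^abc]: any single character except a, b, or c.
negated = re.findall(r"[^abc]", "abcd")

# [a-d1-7] matches one character at a time, so "d1" yields two matches.
ranged = re.findall(r"[a-d1-7]", "d1 e8")

# X|Z: either X or Z.
either = re.findall(r"X|Z", "XYZ")

print(any_char, starts_with_bz, ends_with_cat, set_pair, negated, ranged, either)
```

Note that `[a-d1-7]` reports `d` and `1` as two separate matches, which illustrates that a character set always matches exactly one character.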
Meta Characters
The following meta characters have a pre-defined meaning and make certain common patterns easier to use, e.g., `\d` instead of `[0-9]`.
Regular Expression | Description |
---|---|
`\d` | Any digit, short for `[0-9]`. |
`\D` | A non-digit, short for `[^0-9]`. |
`\s` | A whitespace character, short for `[ \t\n\x0b\r\f]`. |
`\S` | A non-whitespace character, short for `[^\s]`. |
`\w` | A word character, short for `[a-zA-Z_0-9]`. |
`\W` | A non-word character. |
`\S+` | Several non-whitespace characters. |
`\b` | Matches a word boundary where a word character is `[a-zA-Z0-9_]`. |
Each of these meta characters uses the first letter of what it represents, e.g., digit, space, word, and boundary. The uppercase form defines the opposite.
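
As a rough illustration of these meta characters, here is a small sketch that again assumes Python's `re` module (the text itself does not name an engine). In Python 3, `\d`, `\w`, and `\s` also match non-ASCII characters by default, so the sketch passes `re.ASCII` to stay close to the ASCII-only sets described above. The sample string is hypothetical.

```python
import re

# Hypothetical sample text, used only for illustration.
sample = "Room 42, floor 7"

# \d: single digits (re.ASCII limits this to 0-9).
digits = re.findall(r"\d", sample, flags=re.ASCII)

# \S+: runs of non-whitespace characters.
non_whitespace_runs = re.findall(r"\S+", sample)

# \b\w+\b: whole words, delimited by word boundaries.
words = re.findall(r"\b\w+\b", sample, flags=re.ASCII)

# \W: every character that is not a word character.
non_word = re.findall(r"\W", sample, flags=re.ASCII)

print(digits, non_whitespace_runs, words, non_word)
```

Comparing `non_whitespace_runs` with `words` shows the difference between `\S` and `\w`: the comma stays attached to `42,` in the first list but is dropped in the second.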
Quantifier
A quantifier defines how often an element can occur. The symbols `?`, `*`, `+` and `{}` specify how many times the preceding element of the regular expression may occur.
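Regular Expression | Description | Examples |
---|---|---|
`*` | Occurs zero or more times, is short for `{0,}`. | `X*` finds zero or more letters X; `.*` finds any character sequence. |
`+` | Occurs one or more times, is short for `{1,}`. | `X+` finds one or more letters X. |
`?` | Occurs zero or one time, is short for `{0,1}`. | `X?` finds zero or exactly one letter X. |
`{X}` | Occurs X number of times. | `\d{3}` finds three digits. |
`{X,Y}` | Occurs between X and Y times. | `\d{1,4}` finds at least one and at most four digits. |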
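
The quantifiers can be checked with a short sketch. It assumes Python's `re` module, and `matches_fully` is a hypothetical helper introduced here only for illustration; it reports whether an entire string matches a pattern.

```python
import re


def matches_fully(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


# '*' allows zero occurrences, so the empty string matches X*.
zero_or_more = matches_fully(r"X*", "")

# '+' needs at least one X, so the empty string does not match X+.
one_or_more = matches_fully(r"X+", "")

# '?' allows at most one X, so "XX" does not match X?.
zero_or_one = matches_fully(r"X?", "XX")

# {3} requires exactly three digits.
exactly_three = matches_fully(r"\d{3}", "123")

# {1,4} takes between one and four digits per match.
one_to_four = re.findall(r"\d{1,4}", "7 2024 123456")

print(zero_or_more, one_or_more, zero_or_one, exactly_three, one_to_four)
```

In the last line, the six-digit run is split into two matches because each match may consume at most four digits.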
Grouping and back reference
You can group parts of your regular expression. In your pattern you group elements with round brackets, e.g., `()`. This allows you to assign a repetition operator to a complete group.
In addition, these groups also create a back reference to the part of the regular expression. This captures the group. A back reference stores the part of the string which matched the group. This allows you to use this part in the replacement.
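Via `${}` you can refer to a group: `${1}` is the first group, `${2}` the second, etc. The exact reference syntax depends on the regex engine; many engines write `$1` and `$2` instead.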
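Grouping and group references in a replacement can be sketched as follows. This assumes Python's `re` module, where a group is referenced in the replacement string as `\1` (or `\g<1>`) rather than with the `$` notation; the sample strings are hypothetical.

```python
import re

# A group lets a quantifier apply to several characters at once:
# (ab)+ matches one or more repetitions of "ab".
repeated = re.fullmatch(r"(ab)+", "ababab") is not None

# Two captured groups, referenced in the replacement to swap the words.
# Python writes the references as \1 and \2.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

print(repeated, swapped)
```

The replacement string is written as a raw string (`r"..."`) so that the backslashes reach the regex engine unchanged.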
Negative look ahead
Negative look ahead provides the possibility to exclude a pattern. With this you can say that a string should not be followed by another string.
A negative look ahead is defined via `(?!pattern)`. For example, the following will match "a" if "a" is not followed by "b".
a(?!b)
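To see which occurrences of "a" this pattern accepts, here is a minimal sketch that assumes Python's `re` module, which supports the same `(?!...)` syntax. The sample string is hypothetical.

```python
import re

# a(?!b): an "a" that is not directly followed by "b".
pattern = re.compile(r"a(?!b)")

# Hypothetical sample: the first "a" is followed by "b" and is skipped.
sample = "ab ac a"
positions = [match.start() for match in pattern.finditer(sample)]

print(positions)
```

The "a" at the very end of the string also matches, because nothing follows it. The look ahead itself consumes no characters, so only the "a" is part of each match.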
You can experiment with regular expressions at regexplanet.com