Regular Expressions
A regular expression, regex or regexp (sometimes called a rational expression) is, in theoretical computer science and formal language theory, a sequence of characters that defines a search pattern. Usually this pattern is then used by string searching algorithms for “find” or “find and replace” operations on strings, or for input validation.
Source: Wikipedia
Character | Description |
---|---|
^ | A circumflex at the start of the expression matches the start of a line. |
$ | A dollar sign at the end of the expression matches the end of a line. |
. | A period matches a single instance of any character. For example, b.t matches bot and bat, but not boat. |
? | A question mark after a character or a character group matches zero or one occurrences of that character or group. For example, bo?t matches both bt and bot. |
* | An asterisk after a character or a character group matches any number of occurrences of that character or group, including zero occurrences. For example, bo*t matches bt, bot, and boot. |
+ | A plus sign after a character or a character group matches one or more occurrences of that character or group. For example, bo+t matches bot and boot, but not bt. |
| | A vertical bar matches either the expression on its left or the expression on its right. For example, bar|car will match either bar or car. |
[ ] | Square brackets match any single character that appears inside them, and no others. For example, [bot] matches b, o, or t. |
[^] | A circumflex as the first character inside square brackets means NOT. Hence, [^bot] matches any character except b, o, or t. |
[-] | A hyphen inside square brackets signifies a range of characters. For example, [b-o] matches any character from b through o. |
{ } | Braces group characters or expressions. Groups can be nested, with a maximum of 10 groups in a single pattern. For the Replace operation, groups are referred to by a backslash and a number, according to their position in the "Text to find" expression, beginning with 0. For example, with Find: {[0-9]}{[a-c]*} and Replace: NUM\1, the string 3abcabc is changed to NUMabcabc. |
( ) | Parentheses are an alternative to braces ({ }), with the same behavior. |
\ | A backslash before a wildcard character tells the Code Editor to treat that character literally, not as a wildcard. For example, \^ matches ^ and does not look for the start of a line. |
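
The examples in the table can be checked outside the editor. The sketch below uses Python's standard `re` module as a stand-in, which is an assumption: it is not the engine described above, and its dialect differs in two ways. Python groups with parentheses only (braces are repetition counts there), and it numbers groups from 1, so the table's `NUM\1` becomes `NUM\2`. The names `whole_string_checks`, `anchor_checks`, `results`, and `replaced` are illustrative only.

```python
import re

# (pattern, text, expected) triples taken from the table's examples.
# re.fullmatch requires the pattern to cover the entire text.
whole_string_checks = [
    (r"b.t", "bat", True),      # . is exactly one character
    (r"b.t", "boat", False),
    (r"bo?t", "bt", True),      # ? is zero or one
    (r"bo*t", "boot", True),    # * is zero or more
    (r"bo+t", "bt", False),     # + is one or more
    (r"bar|car", "car", True),  # | is alternation
    (r"[bot]", "o", True),      # set of allowed characters
    (r"[^bot]", "o", False),    # negated set
    (r"[b-o]", "k", True),      # character range
    (r"\^", "^", True),         # backslash makes ^ literal
]

# ^ and $ are tested with re.search, which scans anywhere in the text.
anchor_checks = [
    (r"^bot", "bottle", True),
    (r"^bot", "robot", False),
    (r"bot$", "robot", True),
    (r"bot$", "bottle", False),
]

results = []
for pattern, text, expected in whole_string_checks:
    results.append((pattern, text, expected, re.fullmatch(pattern, text) is not None))
for pattern, text, expected in anchor_checks:
    results.append((pattern, text, expected, re.search(pattern, text) is not None))

for pattern, text, expected, actual in results:
    print(f"{pattern!r} on {text!r}: matched={actual} (table says {expected})")

# Grouping and replacement: Python's \2 corresponds to the table's \1,
# because Python counts groups from 1 rather than 0.
replaced = re.sub(r"([0-9])([a-c]*)", r"NUM\2", "3abcabc")
print(replaced)  # NUMabcabc
```

Other engines may differ again in grouping syntax and group numbering, so it is worth confirming the replacement syntax against the tool actually in use.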