Regular Expressions in Perl. By Josue Vazquez. Presentation transcript:
What are Regular Expressions? A template that either matches or doesn’t match a given string. Often called a pattern in Perl.
Using Simple Patterns To match a pattern against the contents of a string, put the pattern between a pair of forward slashes (/). A pattern match is almost always found in the conditional expression of an if or while. All of the usual backslash escapes are available in patterns.
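A minimal sketch of a simple pattern match used as an if condition. The sample strings and the helper names contains_abba and has_tab_between are made up for illustration and are not from the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $text contains the letters "abba", else 0.
sub contains_abba {
    my ($text) = @_;
    return $text =~ /abba/ ? 1 : 0;
}

# Typical use: the pattern match is the condition of an if.
my $line = "yabba dabba doo";
if ($line =~ /abba/) {
    print "It matched!\n";
}

# Backslash escapes work inside a pattern; \t here stands for a tab.
sub has_tab_between {
    my ($text) = @_;
    return $text =~ /coke\tsprite/ ? 1 : 0;
}

print contains_abba($line), "\n";
print has_tab_between("coke\tsprite"), "\n";
```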
Metacharacters Characters that have a special meaning in regular expressions. For example, the dot (.) is a wildcard character: it matches any single character except a newline.
Metacharacters (Con’t)
Character  Meaning
^          Beginning of string
$          End of string
.          Any character except newline
\          Quotes a metacharacter or introduces a special sequence
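The metacharacters listed above can be tried out with a short script. This is only a sketch: the helper matches and the sample strings are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $text matches the compiled pattern $re, else 0.
sub matches {
    my ($text, $re) = @_;
    return $text =~ $re ? 1 : 0;
}

print matches("bet",         qr/b.t/),     "\n";  # dot stands in for the "e"
print matches("b\nt",        qr/b.t/),     "\n";  # dot does not match a newline
print matches("fred barney", qr/^fred/),   "\n";  # "fred" at the beginning
print matches("fred barney", qr/barney$/), "\n";  # "barney" at the end
print matches("3.14",        qr/3\.14/),   "\n";  # backslash makes the dot literal
print matches("3x14",        qr/3\.14/),   "\n";  # so "x" no longer matches
```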
Simple Quantifiers Three main quantifiers: Star (*) Plus (+) Question mark (?)
Star (*) Quantifier Matches the preceding item zero or more times. Example: /fred\t*barney/. The pattern .* matches any character (except a newline) any number of times. Example: /fred.*barney/
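A short sketch of the two star examples. The helper names tabs_between and anything_between and the test strings are illustrative assumptions.

```perl
use strict;
use warnings;

# Zero or more tabs between "fred" and "barney".
sub tabs_between {
    my ($text) = @_;
    return $text =~ /fred\t*barney/ ? 1 : 0;
}

# Any run of non-newline characters between "fred" and "barney".
sub anything_between {
    my ($text) = @_;
    return $text =~ /fred.*barney/ ? 1 : 0;
}

print tabs_between("fredbarney"), "\n";           # zero tabs is allowed
print tabs_between("fred\t\t\tbarney"), "\n";     # so are three tabs
print tabs_between("fred barney"), "\n";          # a space is not a tab
print anything_between("fred and barney"), "\n";  # .* soaks up " and "
```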
Plus (+) Quantifier Matches the preceding item one or more times. Example: /fred +barney/
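As a sketch, the plus example requires at least one space between the two names. The helper name spaces_between is hypothetical.

```perl
use strict;
use warnings;

# One or more spaces between "fred" and "barney".
sub spaces_between {
    my ($text) = @_;
    return $text =~ /fred +barney/ ? 1 : 0;
}

print spaces_between("fred barney"), "\n";      # one space
print spaces_between("fred     barney"), "\n";  # several spaces
print spaces_between("fredbarney"), "\n";       # no space at all, so no match
```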
Question Mark (?) Quantifier Makes the preceding item optional. Example: /bam-?bam/
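A small sketch of the optional hyphen in practice. The helper name is_bambam is an assumption for illustration.

```perl
use strict;
use warnings;

# "bam", an optional hyphen, then "bam".
sub is_bambam {
    my ($text) = @_;
    return $text =~ /bam-?bam/ ? 1 : 0;
}

print is_bambam("bam-bam"), "\n";   # hyphen present
print is_bambam("bambam"), "\n";    # hyphen absent
print is_bambam("bam--bam"), "\n";  # two hyphens is one too many
```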
Grouping in Patterns Parentheses may be used for grouping.
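To illustrate why grouping matters, this sketch compares a quantifier applied to a group with the same quantifier applied to a single letter. The patterns, the anchors, and the helper names are illustrative choices, not taken from the slides.

```perl
use strict;
use warnings;

# The plus applies to the whole group: "fred", "fredfred", ...
sub repeated_fred {
    my ($text) = @_;
    return $text =~ /^(fred)+$/ ? 1 : 0;
}

# Without parentheses the plus applies only to the "d": "fred", "fredd", ...
sub fred_with_extra_d {
    my ($text) = @_;
    return $text =~ /^fred+$/ ? 1 : 0;
}

print repeated_fred("fredfredfred"), "\n";
print repeated_fred("freddd"), "\n";
print fred_with_extra_d("freddd"), "\n";
print fred_with_extra_d("fredfredfred"), "\n";
```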
Back reference Refers to the text that was matched inside a pair of parentheses. Denote a back reference with a backslash followed by a number (\1, \2, and so on). The back reference does not have to be right next to the parenthesized group.
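A sketch of back references, first directly after the group and then further along in the pattern. The sample strings and helper names are assumptions for illustration.

```perl
use strict;
use warnings;

# \1 must repeat whatever single character the parentheses matched.
sub has_doubled_char {
    my ($text) = @_;
    return $text =~ /(.)\1/ ? 1 : 0;
}

# The back reference need not sit right next to its group:
# four characters after "y", then " d", then the same four characters again.
sub repeats_after_d {
    my ($text) = @_;
    return $text =~ /y(....) d\1/ ? 1 : 0;
}

print has_doubled_char("bamm-bamm"), "\n";       # the "mm" is a doubled character
print has_doubled_char("fred"), "\n";            # no doubled character
print repeats_after_d("yabba dabba doo"), "\n";  # "abba" appears twice
```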
Alternatives The vertical bar (|) means that either the left side or the right side may match. Alternatives can also be used as part of a larger pattern.
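A brief sketch of alternatives, both at the top level and inside a group within a longer pattern. The names and strings below are illustrative assumptions.

```perl
use strict;
use warnings;

# Matches if any one of the three names appears.
sub mentions_someone {
    my ($text) = @_;
    return $text =~ /fred|barney|betty/ ? 1 : 0;
}

# Alternatives inside parentheses affect only that part of the pattern.
sub fred_and_or_barney {
    my ($text) = @_;
    return $text =~ /fred (and|or) barney/ ? 1 : 0;
}

print mentions_someone("I saw betty"), "\n";
print mentions_someone("I saw wilma"), "\n";
print fred_and_or_barney("fred or barney"), "\n";
print fred_and_or_barney("fred nor barney"), "\n";
```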
Character Classes A list of possible characters inside square brackets ([]). Matches exactly one character, which can be any one of those listed in the class.
Character Classes (Con’t) May use a hyphen to specify a range of characters. A caret (^) at the start of the character class negates it.
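The sketch below exercises a plain list, a hyphen range, and a negated class. The specific classes and helper names are illustrative assumptions.

```perl
use strict;
use warnings;

# A plain list: any one of the characters a, b, c, w, x, y, z.
sub has_listed_char {
    my ($text) = @_;
    return $text =~ /[abcwxyz]/ ? 1 : 0;
}

# A range: any one lowercase ASCII letter.
sub has_lowercase {
    my ($text) = @_;
    return $text =~ /[a-z]/ ? 1 : 0;
}

# A negated class: any one character that is not a digit 0 through 9.
sub has_non_digit {
    my ($text) = @_;
    return $text =~ /[^0-9]/ ? 1 : 0;
}

print has_listed_char("quiz"), "\n";  # contains "z"
print has_lowercase("ABC"), "\n";     # no lowercase letter
print has_non_digit("123a5"), "\n";   # the "a" is not a digit
```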
Character Class Shortcuts Some character classes appear frequently, so they have shortcuts.
Symbol  Meaning         As Bytes
\d      Digit           [0-9]
\s      Whitespace      [ \t\n\r\f]
\w      Word character  [a-zA-Z0-9_]
Negating the Shortcuts To negate the shortcuts, just use their uppercase counterparts.
Symbol  Meaning             As Bytes
\D      Non-digit           [^0-9]
\S      Non-whitespace      [^ \t\n\r\f]
\W      Non-word character  [^a-zA-Z0-9_]
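A sketch of the shortcuts and their negations on plain ASCII text, matching the "As Bytes" column. The helper names and sample strings are assumptions for illustration.

```perl
use strict;
use warnings;

# \d: at least one digit somewhere in the text.
sub has_digit {
    my ($text) = @_;
    return $text =~ /\d/ ? 1 : 0;
}

# \w: the whole text is made of letters, digits, and underscores.
sub all_word_chars {
    my ($text) = @_;
    return $text =~ /^\w+$/ ? 1 : 0;
}

# \s: at least one whitespace character.
sub has_whitespace {
    my ($text) = @_;
    return $text =~ /\s/ ? 1 : 0;
}

# \W: at least one character that is not a word character.
sub has_non_word {
    my ($text) = @_;
    return $text =~ /\W/ ? 1 : 0;
}

print has_digit("Room 101"), "\n";
print all_word_chars("fred_42"), "\n";
print has_whitespace("fred\tbarney"), "\n";
print has_non_word("fred-42"), "\n";
```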