
1 Regular Expressions in Perl Part I Alan Gold

2 Basic syntax
=~ is the matching operator
!~ is the negated matching operator
// are the default delimiters
Prefixing the expression with "m" allows for arbitrary delimiters, e.g. m%Don't use this%
Modifiers follow the closing delimiter
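A minimal sketch of the operators and delimiters listed on this slide. The variable names and sample strings are illustrative assumptions, not part of the original deck.

```perl
use strict;
use warnings;

my $text = "Hello World";

# =~ is true when the pattern matches somewhere in the string.
my $has_hello = ($text =~ /Hello/) ? 1 : 0;

# !~ is true when the pattern does NOT match.
my $no_goodbye = ($text !~ /Goodbye/) ? 1 : 0;

# "m" with other delimiters avoids escaping slashes inside the pattern.
my $path     = "/usr/local/bin";
my $is_local = ($path =~ m{^/usr/local}) ? 1 : 0;

# A modifier (here "i", case-insensitive) follows the closing delimiter.
my $any_case = ($text =~ m%hello%i) ? 1 : 0;

print "$has_hello $no_goodbye $is_local $any_case\n";   # prints "1 1 1 1"
```

Braces are a common choice of alternative delimiter when the pattern itself contains slashes, as in the path check above.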

3 Simple matching
"Hello World" =~ /Hello/
Matches the literal string "Hello"
"Superman" =~ /Kal-El/
Unfortunately does not match

4 Metacharacters
Metacharacters are {}[]()^$.|*+?\
These must be escaped with a "\" to match their literal characters
"Spoon+fork" =~ /Spoon+/ will match, but not how you want it to: "+" means "one or more of the preceding n", so only "Spoon" is matched
"Spoonnnnnn" =~ /Spoon+/ will also match
"Spoon+fork" =~ /Spoon\+/ matches properly, including the literal "+"
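To see what each pattern on this slide actually consumes, the sketch below wraps the pattern in capturing parentheses (an addition for illustration; the slide's patterns have none) and inspects the captured text.

```perl
use strict;
use warnings;

# Unescaped: "+" is a quantifier, so the pattern means "Spoo" followed by one or more "n".
my ($loose) = "Spoon+fork" =~ /(Spoon+)/;    # captures "Spoon"
my ($many)  = "Spoonnnnnn" =~ /(Spoon+)/;    # captures "Spoonnnnnn"

# Escaped: "\+" matches a literal plus sign.
my ($exact) = "Spoon+fork" =~ /(Spoon\+)/;   # captures "Spoon+"

print "$loose | $many | $exact\n";
```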

5 Escape sequences
Several characters can't be typed or printed directly
They are matched using an escape sequence
\t is a tab character (ASCII code 9)
\n is a newline character (ASCII code 10)
\r is a carriage return (ASCII code 13)
\0.. is an octal character code, e.g. \033
\x.. is a hexadecimal character code, e.g. \x1B
\033 and \x1B are the same character (escape, ASCII code 27)
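A short check of the codes listed above, assuming an ASCII-based platform. The sample strings are made up for illustration.

```perl
use strict;
use warnings;

# ord() returns the numeric code of the first character.
my $tab_code = ord("\t");
my $nl_code  = ord("\n");
my $cr_code  = ord("\r");

# The octal and hexadecimal forms name the same character (escape, code 27).
my $same_escape = ("\033" eq "\x1B") ? 1 : 0;

# The same escapes work inside a pattern.
my $found_tab = ("name\tvalue" =~ /\t/)     ? 1 : 0;
my $found_esc = ("\x1B[0m"     =~ /\033\[/) ? 1 : 0;

print "$tab_code $nl_code $cr_code $same_escape $found_tab $found_esc\n";
```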

6 Variables
Variables can be used in regular expressions, where they interpolate as they do in double-quoted strings
$something = "cool";
'cool cruel pool' =~ /$something/
Will match just fine
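A sketch of variable interpolation in a pattern. The second half goes slightly beyond the slide: an interpolated value is treated as pattern text, so any metacharacters in it keep their special meaning unless quoted with \Q...\E. The variable $sum is a hypothetical example.

```perl
use strict;
use warnings;

my $something = "cool";
my $plain = ('cool cruel pool' =~ /$something/) ? 1 : 0;   # 1

# Hypothetical value containing a metacharacter.
my $sum = "1+1";

# Interpolated as a pattern: "one or more 1, then 1", which "1+1=2" does not contain.
my $as_pattern = ("1+1=2" =~ /$sum/) ? 1 : 0;              # 0

# \Q...\E escapes the metacharacters, so the value is matched literally.
my $as_literal = ("1+1=2" =~ /\Q$sum\E/) ? 1 : 0;          # 1

print "$plain $as_pattern $as_literal\n";
```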

7 Anchors
^ anchors the pattern to the beginning of the string
$ anchors to the end of the string (or just before a final newline)
"Speaker" =~ /^peak/
Will not match
"Rabbit" =~ /bit$/
Will match
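The anchor examples above can be checked directly. The unanchored and trailing-newline cases are extra illustrations, not from the slide.

```perl
use strict;
use warnings;

# "peak" occurs inside "Speaker", but not at the start.
my $start_fail = ("Speaker" =~ /^peak/) ? 1 : 0;   # 0
my $unanchored = ("Speaker" =~ /peak/)  ? 1 : 0;   # 1

# "bit" sits at the end of "Rabbit".
my $end_ok = ("Rabbit" =~ /bit$/) ? 1 : 0;         # 1

# $ also matches just before a single trailing newline.
my $end_newline = ("Rabbit\n" =~ /bit$/) ? 1 : 0;  # 1

# Both anchors together require the whole string to match.
my $whole = ("Rabbit" =~ /^Rabbit$/) ? 1 : 0;      # 1

print "$start_fail $unanchored $end_ok $end_newline $whole\n";
```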

8 Character classes
Character classes match any single character contained in [brackets]
/tin[yas]/ will match tiny, tina, and tins
"-" can be used to represent a range
/[a-zA-Z0-9]/ will match a single alphanumeric character
The literal "-" character can be matched if it is the first or last character, e.g. /[-0-9]/

9 Negated character classes
The "^" character negates a character class when it is the first character inside the brackets
/200[^7]/ will not match 2007 but will match 2008, 200q, etc.
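Character classes and their negated form can be tried against small word lists. The lists below are assumptions chosen to include one non-matching entry each.

```perl
use strict;
use warnings;

# [yas] accepts exactly one of y, a, or s after "tin".
my @tin = grep { /tin[yas]/ } qw(tiny tina tins tint);

# [^7] needs one character that is not 7, so the bare "200" fails too.
my @years = grep { /200[^7]/ } qw(2007 2008 200q 200);

# A leading "-" inside the class is a literal hyphen, not a range.
my @phones = grep { /^[-0-9]+$/ } qw(555-1234 5551234 555_1234);

print "@tin\n";      # tiny tina tins
print "@years\n";    # 2008 200q
print "@phones\n";   # 555-1234 5551234
```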

10 Shortcut character classes
\d is a digit, equivalent to [0-9]
\s is any whitespace, equivalent to [\ \t\r\n\f]
\w is a word character, equivalent to [0-9a-zA-Z_]
\D is any non-digit, equivalent to [^0-9]
\S is any non-whitespace, equivalent to [^\s]
\W is any non-word character, equivalent to [^\w]
These equivalences hold for ASCII text; on Unicode strings \d, \s, and \w match more characters
The period '.' matches any character but '\n'
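A sketch of the shortcut classes in use. It relies on the "+" quantifier and the "g" (global) modifier to collect every match, and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

my $line = "Order 66 shipped\ton 2007-03-01";

# \d+ in list context with /g returns every run of digits.
my @numbers = $line =~ /(\d+)/g;

# \s+ splits on any run of whitespace, whether spaces or tabs.
my @fields = split /\s+/, $line;

# \w+ stops at the first non-word character; "_" counts as a word character.
my @words = "foo_bar baz!" =~ /(\w+)/g;

# By default "." does not match a newline.
my $dot_newline = ("a\nb" =~ /a.b/) ? 1 : 0;

print "@numbers\n";   # 66 2007 03 01
print "@fields\n";    # Order 66 shipped on 2007-03-01
print "@words\n";     # foo_bar baz
print "$dot_newline\n";
```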

11 Word anchors
The word anchor '\b' matches the boundary between a word character and a non-word character (or the start or end of the string)
/\bpen/ matches "penitentiary", not "open"
/\bpen\b/ only matches "pen" as a whole word, e.g. "this pen is blue"
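The word-boundary behaviour can be checked with a few sample strings. "fountain pen", "pen", and "pencil" are additions for illustration.

```perl
use strict;
use warnings;

# \bpen needs "pen" to begin a word.
my @starts = grep { /\bpen/ } ("penitentiary", "open", "fountain pen");

# \bpen\b needs "pen" to be a whole word; the string's own edges count as boundaries.
my @whole = grep { /\bpen\b/ } ("this pen is blue", "pen", "pencil", "open");

print join(", ", @starts), "\n";   # penitentiary, fountain pen
print join(", ", @whole),  "\n";   # this pen is blue, pen
```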

12 Modifiers
Modifiers change the behavior of the engine
// is the default: '.' doesn't match newlines
//s causes '.' to match newlines
//m treats each line as its own string: ^ and $ match at the start and end of every line
//i matches case-insensitively
Modifiers can be combined, e.g. //sim
/^car.$/im matches "not a car\nCAR!"
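The slide's final example needs both "i" and "m" to succeed. The sketch below runs the same pattern with each modifier alone to show why, and adds a separate check of "s".

```perl
use strict;
use warnings;

my $text = "not a car\nCAR!";

# With /m, ^ can match after the newline; with /i, "car" matches "CAR".
my $both   = ($text =~ /^car.$/im) ? 1 : 0;   # 1

# Without /m, ^ only matches at the very start of the string ("not ...").
my $i_only = ($text =~ /^car.$/i)  ? 1 : 0;   # 0

# Without /i, the second line "CAR!" differs in case.
my $m_only = ($text =~ /^car.$/m)  ? 1 : 0;   # 0

# /s lets "." match a newline.
my $without_s = ("a\nb" =~ /a.b/)  ? 1 : 0;   # 0
my $with_s    = ("a\nb" =~ /a.b/s) ? 1 : 0;   # 1

print "$both $i_only $m_only $without_s $with_s\n";
```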

13 Or
The pipe character '|' can be used to match any one of the given choices
/lumber|wood/ will match "My desk is made of spare lumber" and "My desk is made of 100,000 year old petrified wood"
/0|1|2/ is equivalent to /[0-2]/
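A small sketch of alternation. The equivalence between /0|1|2/ and /[0-2]/ is spot-checked over a handful of sample strings, which are an assumption chosen for illustration rather than an exhaustive proof.

```perl
use strict;
use warnings;

my @desks = (
    "My desk is made of spare lumber",
    "My desk is made of 100,000 year old petrified wood",
    "My desk is made of steel",
);

# Either alternative is enough for a match.
my @wooden = grep { /lumber|wood/ } @desks;

# Compare the alternation with the character class on each sample.
my @samples = ("0", "1", "2", "3", "a2b", "");
my $same = 1;
for my $s (@samples) {
    my $alt = ($s =~ /0|1|2/) ? 1 : 0;
    my $cls = ($s =~ /[0-2]/) ? 1 : 0;
    $same = 0 if $alt != $cls;
}

print scalar(@wooden), " $same\n";   # prints "2 1"
```

For single characters the character class is usually the tidier form; alternation is needed once the choices are longer than one character.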

14 A blank slide

