Published by Nancy York. Modified over 9 years ago.
1
Karthik Sangaiah
2
Perl was developed by Larry Wall.
  ◦ “There’s more than one way to do it”
  ◦ “Easy things should be easy and hard things should be possible”
The main purpose of Perl was text manipulation.
Regular expressions are fundamental to text processing.
3
A regex is a string that describes a pattern.
The simplest regex is a word.
A regex consisting of a word matches any string that contains that word.
Ex:
  ◦ "Hello World" =~ /World/;
4
The “=~” operator produces TRUE if the regex matches a string.
Ex:
  ◦ if ("Sample Words" =~ /Sample/) {
        print "It matches\n";
    } else {
        print "It doesn't match\n";
    }
The “!~” operator produces TRUE if the regex does NOT match a string.
Ex:
  ◦ if ("Sample Words" !~ /Sample/) {
        print "It doesn't match\n";
    } else {
        print "It matches\n";
    }
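The two operators on this slide can be tried side by side. The following is a minimal sketch; the variable names are made up for illustration, and the results are stored as 1 or 0 only to make them easy to print.

```perl
use strict;
use warnings;

my $text = "Sample Words";

# =~ is true when the pattern is found somewhere in the string.
my $has_sample = ($text =~ /Sample/) ? 1 : 0;

# !~ is the negation of =~: true when the pattern is NOT found.
my $lacks_perl = ($text !~ /Perl/) ? 1 : 0;

print "contains 'Sample': $has_sample\n";        # prints 1
print "does not contain 'Perl': $lacks_perl\n";  # prints 1
```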
5
A variable can be used as a regex.
Ex:
    $temp = "ls";
    "ls -l" =~ /$temp/;
If using the default variable “$_”:
  ◦ “$_ =~” can be omitted
Ex:
    $_ = "ls -l";
    if (/ls/) {
        print "It matches\n";
    } else {
        print "It doesn't match\n";
    }
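A common place where the default variable shows up is a loop, since `for` sets `$_` to each element in turn. The sketch below assumes a small, invented list of command strings.

```perl
use strict;
use warnings;

my $temp    = "ls";
my $command = "ls -l";

# The value of $temp is used as the pattern.
my $var_match = ($command =~ /$temp/) ? 1 : 0;

# Inside the loop each element is in $_, so "$_ =~" can be left out.
my @hits;
for ("ls -l", "pwd", "ls") {
    push @hits, $_ if /ls/;    # same as: $_ =~ /ls/
}

print "variable pattern matched: $var_match\n";   # prints 1
print "strings containing ls: @hits\n";           # prints: ls -l ls
```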
6
Regexes in Perl are mostly treated as double-quoted strings.
Values of variables in a regex are substituted in before the regex is evaluated for matching.
Ex:
    $foo = 'vision';
    'television' =~ /tele$foo/;
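Because substitution happens before matching, changing the variable changes the pattern. A small sketch, with the second value ('phone') chosen only for illustration:

```perl
use strict;
use warnings;

my $foo = 'vision';

# The regex becomes /television/ after $foo is substituted in.
my $first = ('television' =~ /tele$foo/) ? 1 : 0;

# With a different value the same regex text now means /telephone/.
$foo = 'phone';
my $second = ('television' =~ /tele$foo/) ? 1 : 0;

print "tele + vision matches television: $first\n";   # prints 1
print "tele + phone matches television: $second\n";   # prints 0
```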
7
The default “/ /” delimiters can be changed to arbitrary delimiters by putting “m” in front of the regex.
Ex:
    "Sample Text" =~ m!Text!;
    "Sample Text" =~ m{Text};
    "Sample Text" =~ m"Text";
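Alternative delimiters are mostly useful when the pattern itself contains slashes. The sketch below matches the same path three ways; the path is just an example value.

```perl
use strict;
use warnings;

my $path = "/usr/bin/perl";

# With the default delimiters every "/" in the pattern must be backslashed.
my $with_slashes = ($path =~ /\/usr\/bin/) ? 1 : 0;

# With m{...} or m!...! the slashes can be written as-is.
my $with_braces = ($path =~ m{/usr/bin}) ? 1 : 0;
my $with_bang   = ($path =~ m!/usr/bin!) ? 1 : 0;

print "$with_slashes $with_braces $with_bang\n";   # prints: 1 1 1
```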
8
Metacharacters are reserved for use in regex notation:
  ◦ { }, [ ], ( ), ^, $, ., |, *, +, ?, \
To match a metacharacter literally, put “\” before it in the regex.
Ex:
  ◦ "5*2=10" =~ /5\*2/;
  ◦ "/usr/bin/perl" =~ /\/usr\/bin\/perl/;
“/” also needs to be backslashed when it is used as the delimiter.
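The effect of a missing backslash is easiest to see on a string that does not contain the literal text. In the sketch below, the unescaped `*` acts as a quantifier (zero or more of the preceding character), so `/5*2/` matches "52". The `\Q...\E` escape, which is not covered on the slides, is Perl's way to backslash every metacharacter in an interpolated variable.

```perl
use strict;
use warnings;

# Escaped: looks for the literal text "5*2".
my $escaped_on_expr = ("5*2=10" =~ /5\*2/) ? 1 : 0;
my $escaped_on_52   = ("52" =~ /5\*2/)     ? 1 : 0;

# Unescaped: "*" is a quantifier, so this means "any number of 5s, then 2".
my $unescaped_on_52 = ("52" =~ /5*2/) ? 1 : 0;

# \Q...\E quotes the metacharacters inside an interpolated variable.
my $literal = "5*2";
my $quoted  = ("5*2=10" =~ /\Q$literal\E/) ? 1 : 0;

print "$escaped_on_expr $escaped_on_52 $unescaped_on_52 $quoted\n";   # prints: 1 0 1 1
```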
9
“^” matches at the beginning of the string.
“$” matches at the end of the string, or before a newline at the end of the string.
Ex:
    "television" =~ /^tele/;
    "television" =~ /vision$/;
When “^” and “$” are used together, the regex has to match at both the beginning and the end of the string (i.e. match the whole string).
Ex:
    "vision" =~ /^vision$/;
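The anchor rules above, including the trailing-newline case for “$”, can be checked with a short sketch (variable names are illustrative):

```perl
use strict;
use warnings;

my $word = "television";

my $starts = ($word =~ /^tele/)   ? 1 : 0;   # 1: "tele" is at the beginning
my $ends   = ($word =~ /vision$/) ? 1 : 0;   # 1: "vision" is at the end

# Both anchors together require the whole string to be "vision".
my $whole = ($word =~ /^vision$/) ? 1 : 0;   # 0

# "$" also matches just before a newline at the end of the string.
my $with_newline = ("vision\n" =~ /^vision$/) ? 1 : 0;   # 1

print "$starts $ends $whole $with_newline\n";   # prints: 1 1 0 1
```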
10
A character class allows a set of possible characters, rather than a single character, to match at a given point in the regex.
Character classes are denoted by […], with the set of characters to be matched inside.
Ex:
    /[btc]all/;            # matches ball, tall, or call
    /word[0123456789]/;    # matches word0 … word9
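One way to see which strings a character class accepts is to filter a list with `grep`. The word list below is invented for the example.

```perl
use strict;
use warnings;

my @words = qw(ball tall call fall word7 wordx);

# Keep only the words in which the class matches.
my @all_words = grep { /[btc]all/ } @words;
my @numbered  = grep { /word[0123456789]/ } @words;

print "[btc]all: @all_words\n";          # prints: ball tall call
print "word[0123456789]: @numbered\n";   # prints: word7
```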
11
Special characters in a character class are handled with a backslash as well.
Special characters within a character class:
  ◦ “-”, “]”, “\”, “^”, “$”
Ex:
    /[\$c]w/;       # matches $w or cw
    $x = 'btc';
    /[$x]all/;      # matches ball, tall, or call
    /[\$x]all/;     # matches $all or xall
    /[\\$x]all/;    # matches \all, ball, tall, or call
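The three variations on `$x` inside a class are easy to confuse, so here is a sketch that runs each against the same candidate strings. The candidate list is made up; single quotes are used so that `$` and `\` stay literal in the test strings.

```perl
use strict;
use warnings;

my $x = 'btc';
my @candidates = ('ball', 'xall', '$all', '\all');

# $x is interpolated: the class is [btc].
my @interpolated = grep { /[$x]all/ } @candidates;

# \$ is a literal dollar sign: the class contains "$" and "x".
my @escaped_dollar = grep { /[\$x]all/ } @candidates;

# \\ is a literal backslash, then $x is interpolated: the class is [\btc].
my @backslash = grep { /[\\$x]all/ } @candidates;

print "[\$x]:     @interpolated\n";     # ball
print "[\\\$x]:    @escaped_dollar\n";  # xall $all
print "[\\\\\$x]:  @backslash\n";       # ball \all
```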
12
The special character “-” is used as a range operator.
Ex:
    /word[0-9]/;       # matches word0 … word9
    /word[0-9a-z]/;    # matches word0 … word9, or worda … wordz
The special character “^” in the first position of a character class denotes a negated character class.
Ex:
    /[^0-9]/;          # matches a non-numeric character
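Ranges and negation can be compared on one small input list. This is a sketch with invented inputs:

```perl
use strict;
use warnings;

my @inputs = ('word5', 'wordq', 'wordQ', 'word-');

my @digit          = grep { /word[0-9]/ }    @inputs;   # digit after "word"
my @digit_or_lower = grep { /word[0-9a-z]/ } @inputs;   # digit or lowercase letter
my @not_digit      = grep { /word[^0-9]/ }   @inputs;   # anything except a digit

print "[0-9]:    @digit\n";            # prints: word5
print "[0-9a-z]: @digit_or_lower\n";   # prints: word5 wordq
print "[^0-9]:   @not_digit\n";        # prints: wordq wordQ word-
```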
13
Common character class abbreviations:
  ◦ \d – digit, [0-9]
  ◦ \s – whitespace character, [\ \t\r\n\f]
  ◦ \w – word character (alphanumeric or _), [0-9a-zA-Z_]
  ◦ \D – negated \d
  ◦ \S – negated \s
  ◦ \W – negated \w
  ◦ . – any character but “\n”
The \d, \s, \w, \D, \S, and \W abbreviations can be used both inside and outside character classes.
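A few quick checks of the abbreviations, as a sketch. The date-like string and the other test values are arbitrary; `\d` is simply repeated because quantifiers are not covered on these slides.

```perl
use strict;
use warnings;

# \d repeated by hand, combined with the anchors from earlier slides.
my $date_like = ("2010-09-30" =~ /^\d\d\d\d-\d\d-\d\d$/) ? 1 : 0;   # 1

my $has_space   = ("ls -l" =~ /\s/)    ? 1 : 0;   # 1: contains a space
my $has_nonword = ("foo_bar9" =~ /\W/) ? 1 : 0;   # 0: only word characters

# Abbreviations inside a class: a digit or a whitespace character.
my $in_class = ("x" =~ /[\d\s]/) ? 1 : 0;          # 0

# "." does not match a newline by default.
my $dot_newline = ("a\n" =~ /a./) ? 1 : 0;         # 0

print "$date_like $has_space $has_nonword $in_class $dot_newline\n";   # prints: 1 1 0 0 0
```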
14
“\b” matches a boundary between a word character and a non-word character.
Ex:
  ◦ $x = "Exam1 Question from Sample Exam";
  ◦ $x =~ /Exam/;       # matches Exam in Exam1
  ◦ $x =~ /\bExam/;     # matches Exam in Exam1 (the start of the string counts as a boundary)
  ◦ $x =~ /\bExam\b/;   # matches Exam at end of string
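To confirm which occurrence of "Exam" each regex picks, the sketch below records where the match starts. It relies on Perl's built-in `@-` array, which is not covered on the slides: after a successful match, `$-[0]` holds the offset of the start of the match.

```perl
use strict;
use warnings;

my $x = "Exam1 Question from Sample Exam";

# $-[0] is the start offset of the last successful match; -1 marks "no match".
my $plain_pos = ($x =~ /Exam/)     ? $-[0] : -1;
my $start_pos = ($x =~ /\bExam/)   ? $-[0] : -1;
my $both_pos  = ($x =~ /\bExam\b/) ? $-[0] : -1;

print "/Exam/ matched at offset $plain_pos\n";       # 0: the Exam in Exam1
print "/\\bExam/ matched at offset $start_pos\n";    # 0: still the Exam in Exam1
print "/\\bExam\\b/ matched at offset $both_pos\n";  # 27: the Exam at the end
```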
15
Often, we want to match against lines and ignore newline characters.
Sometimes we need to keep track of newlines.
//s – Single line matching
//m – Multi-line matching
These modifiers affect two aspects of how the regex is interpreted:
  ◦ How the ‘.’ character class is defined
  ◦ Where the anchors, ^ and $, are able to match
16
No modifier (//) – Default
  ◦ . matches any character but \n
  ◦ ^ matches at the beginning of the string
  ◦ $ matches at the end of the string or before a newline at the end of the string
String as a single long line (//s)
  ◦ . matches any character
  ◦ ^ matches at the beginning of the string
  ◦ $ matches at the end of the string or before a newline at the end of the string
17
String as multiple lines (//m)
  ◦ . matches any character but \n
  ◦ ^ matches at the beginning of any line within the string
  ◦ $ matches at the end of any line within the string
String as a single long line, but detect multiple lines (//sm)
  ◦ . matches any character
  ◦ ^ matches at the beginning of any line within the string
  ◦ $ matches at the end of any line within the string
18
$x = "You will know how to use Perl\nFor text processing\n";
$x =~ /^For/;      # no match, "For" not at start of string
$x =~ /^For/s;     # no match, "For" not at start of string
$x =~ /^For/m;     # match, "For" at start of second line
$x =~ /^For/sm;    # match, "For" at start of second line
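The example above exercises “^”. The same string can be used to sketch the other two effects of the modifiers: how “.” treats the newline, and where “$” can match. Variable names are illustrative.

```perl
use strict;
use warnings;

my $x = "You will know how to use Perl\nFor text processing\n";

# "." skips over "\n" only under //s.
my $dot_default = ($x =~ /Perl.For/)  ? 1 : 0;   # 0
my $dot_s       = ($x =~ /Perl.For/s) ? 1 : 0;   # 1

# "$" matches at the end of an inner line only under //m.
my $end_default = ($x =~ /Perl$/)  ? 1 : 0;      # 0: "Perl" is not at the end of the string
my $end_m       = ($x =~ /Perl$/m) ? 1 : 0;      # 1: "Perl" ends the first line

print "$dot_default $dot_s $end_default $end_m\n";   # prints: 0 1 0 1
```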
19
Alternation metacharacter “|”
  ◦ Used to match different possible words or character strings
  ◦ Word 1 or word 2 -> /word1|word2/;
Perl tries to match the regex at the earliest possible point in the string; at that point, the alternatives are tried from left to right.
Ex:
    "shoes and strings" =~ /shoes|strings|and/;   # matches "shoes"
    "shoes" =~ /s|sh|sho|shoes/;                  # matches "s"
    "shoes" =~ /shoes|sho|s/;                     # matches "shoes"
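To see exactly which text an alternation picked, the sketch below uses a hypothetical helper, `matched_text`, built on two Perl features not covered on the slides: `qr//` to pass a regex as a value, and the built-in `$&` variable, which holds the text of the last successful match.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text the regex matched, or undef.
sub matched_text {
    my ($string, $regex) = @_;
    return $string =~ $regex ? $& : undef;
}

# Earliest position wins, regardless of the order of the alternatives.
my $earliest      = matched_text("shoes and strings", qr/shoes|strings|and/);
my $position_wins = matched_text("shoes and strings", qr/strings|and|shoes/);

# At the same position, the first alternative that matches is used.
my $first_alt     = matched_text("shoes", qr/s|sh|sho|shoes/);
my $longest_first = matched_text("shoes", qr/shoes|sho|s/);

print "$earliest $position_wins $first_alt $longest_first\n";   # prints: shoes shoes s shoes
```

Listing longer alternatives first, as in the last call, is one way to get the longest match when alternatives share a prefix.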
20
Perl Resource 5: Perl Regular Expressions Tutorial
  ◦ http://www.cs.drexel.edu/~knowak/cs265_fall_2010/perlretut_2007.pdf
Perl History
  ◦ http://www.xmluk.org/perl-cgi-history-information.htm
Perl Special Variables
  ◦ http://www.kichwa.com/quik_ref/spec_variables.html