Karthik Sangaiah.  Developed by Larry Wall ◦ “There’s more than one way to do it” ◦ “Easy things should be easy and hard things should be possible” 


1 Karthik Sangaiah

2  Developed by Larry Wall
    ◦ “There’s more than one way to do it”
    ◦ “Easy things should be easy and hard things should be possible”
 Main purpose of Perl was text manipulation
 Regular expressions are fundamental to text processing

3  A regex is a string that describes a pattern
 The simplest regex is a word
 A regex consisting of a word matches any string that contains that word
 Ex:
    ◦ "Hello World" =~ /World/;

4  The “=~” operator produces TRUE if the regex matches a string
 Ex:
    ◦ if ("Sample Words" =~ /Sample/) { print "It matches\n"; } else { print "It doesn't match\n"; }
 The “!~” operator produces TRUE if the regex does NOT match a string
 Ex:
    ◦ if ("Sample Words" !~ /Sample/) { print "It doesn't match\n"; } else { print "It matches\n"; }
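
As a runnable sketch of the two operators on this slide, the script below stores each result in a variable so that both outcomes can be printed side by side. The variable names are illustrative only and do not come from the slides.

```perl
use strict;
use warnings;

my $text = "Sample Words";

# =~ is true when the pattern is found anywhere in the string
my $has_sample = ($text =~ /Sample/) ? 1 : 0;

# !~ is the logical negation of =~
my $lacks_sample = ($text !~ /Sample/) ? 1 : 0;

# Matching is case-sensitive by default, so "sample" is not found
my $has_lowercase = ($text =~ /sample/) ? 1 : 0;

print "=~ /Sample/ : $has_sample\n";
print "!~ /Sample/ : $lacks_sample\n";
print "=~ /sample/ : $has_lowercase\n";
```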

5  A variable can be used as a regex
 Ex:
    $temp = "ls";
    "ls -l" =~ /$temp/;
 If using the default variable “$_”:
    ◦ “$_ =~” can be omitted
 Ex:
    $_ = "ls -l";
    if (/ls/) { print "It matches\n"; } else { print "It doesn't match\n"; }
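
One place where the implicit “$_” form tends to show up is inside grep, which sets $_ to each list element in turn. The sketch below assumes a small made-up list of commands and checks that the explicit and implicit forms select the same elements.

```perl
use strict;
use warnings;

my $temp     = "ls";
my @commands = ("ls -l", "pwd", "ls");

# Explicit form: bind the regex to $_ with =~
my @explicit = grep { $_ =~ /$temp/ } @commands;

# Implicit form: a bare regex is matched against $_
my @implicit = grep { /$temp/ } @commands;

print "explicit: @explicit\n";
print "implicit: @implicit\n";
```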

6  Regexes in Perl are mostly treated as double-quoted strings
 Values of variables in a regex are substituted in before the regex is evaluated for matching
 Ex:
    $foo = 'vision';
    'television' =~ /tele$foo/;
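
A consequence of substituting first and matching second is that any metacharacters inside the variable are still interpreted as regex syntax. The sketch below illustrates this with a made-up pattern string. The \Q...\E escape is not covered in the slides; it is included only to show how the substituted text can be forced to match literally.

```perl
use strict;
use warnings;

my $foo = 'vision';
my $interpolated = ('television' =~ /tele$foo/) ? 1 : 0;   # behaves like /television/

# The "." in the variable is still a metacharacter after substitution
my $pattern = 'a.c';
my $dot_is_wildcard = ('abc' =~ /$pattern/) ? 1 : 0;       # "." matches "b"

# \Q...\E (not in the slides) quotes the substituted text literally
my $dot_is_literal = ('abc' =~ /\Q$pattern\E/) ? 1 : 0;
my $literal_found  = ('a.c' =~ /\Q$pattern\E/) ? 1 : 0;

print "tele\$foo matches television: $interpolated\n";
print "a.c as regex matches abc:    $dot_is_wildcard\n";
print "quoted a.c matches abc:      $dot_is_literal\n";
print "quoted a.c matches a.c:      $literal_found\n";
```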

7  The default “/ /” delimiters can be changed to arbitrary delimiters by using “=~ m”
 Ex:
    "Sample Text" =~ m!Text!;
    "Sample Text" =~ m{Text};
    "Sample Text" =~ m"Text";

8  Metacharacters are reserved for use in regex notation
    ◦ { }, [ ], ( ), ^, $, ., |, *, +, ?, \
 A metacharacter must be preceded by “\” to match it literally in the regex
 Ex:
    ◦ "5*2=10" =~ /5\*2/;
    ◦ "/usr/bin/perl" =~ /\/usr\/bin\/perl/;
 “/” also needs to be backslashed if it’s used as the delimiter
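
To see what the backslash changes, the sketch below compares an escaped and an unescaped “*” against the made-up string "52", and matches the path from the slide with both “/ /” and “m{ }” delimiters. With braces as delimiters, the slashes in the path no longer need escaping.

```perl
use strict;
use warnings;

# Escaped: "\*" is a literal asterisk, and "52" does not contain one
my $escaped = ("52" =~ /5\*2/) ? 1 : 0;

# Unescaped: "*" is a quantifier (zero or more "5"), so "52" matches
my $unescaped = ("52" =~ /5*2/) ? 1 : 0;

# The slide's own example still matches with the escaped form
my $literal = ("5*2=10" =~ /5\*2/) ? 1 : 0;

my $path = "/usr/bin/perl";
my $with_backslashes = ($path =~ /\/usr\/bin\/perl/) ? 1 : 0;
my $with_braces      = ($path =~ m{/usr/bin/perl})   ? 1 : 0;

print "escaped on 52:     $escaped\n";
print "unescaped on 52:   $unescaped\n";
print "escaped on 5*2=10: $literal\n";
print "path, / delimiter: $with_backslashes\n";
print "path, {} delimiter: $with_braces\n";
```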

9  “^” matches at the beginning of the string
 “$” matches at the end of the string, or before a newline at the end of the string
 Ex:
    "television" =~ /^tele/;
    "television" =~ /vision$/;
 When “^” and “$” are used together, the regex has to match at both the beginning and the end of the string (i.e. match the whole string)
 Ex:
    "vision" =~ /^vision$/;
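
The anchors are easiest to understand next to cases that fail. The following sketch adds a few negative cases to the slide's examples, plus a string with a trailing newline to illustrate the "before a newline at the end of the string" rule. The variable names are illustrative.

```perl
use strict;
use warnings;

my $word = "television";

my $starts_tele   = ($word =~ /^tele/)    ? 1 : 0;  # "tele" is at the start
my $starts_vision = ($word =~ /^vision/)  ? 1 : 0;  # "vision" is not at the start
my $ends_vision   = ($word =~ /vision$/)  ? 1 : 0;  # "vision" is at the end
my $whole_tv      = ($word =~ /^vision$/) ? 1 : 0;  # whole string is not "vision"

my $whole_vision  = ("vision"   =~ /^vision$/) ? 1 : 0;
my $with_newline  = ("vision\n" =~ /^vision$/) ? 1 : 0;  # "$" may sit before a final newline

print "^tele on television:     $starts_tele\n";
print "^vision on television:   $starts_vision\n";
print "vision\$ on television:   $ends_vision\n";
print "^vision\$ on television:  $whole_tv\n";
print "^vision\$ on vision:      $whole_vision\n";
print "^vision\$ on vision\\n:    $with_newline\n";
```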

10  A character class allows a set of possible characters, rather than a single character, to match
 Character classes are denoted by […], with the set of characters to match inside
 Ex:
    /[btc]all/;           # matches ball, tall, or call
    /word[0123456789]/;   # matches word0…word9

11  Special characters in a character class are handled with a backslash as well
 Special characters within a character class:
    ◦ “-”, “]”, “\”, “^”, “$”
 Ex:
    /[\$c]w/;      # matches $w or cw
    $x = 'btc';
    /[$x]all/;     # matches ball, tall, or call
    /[\$x]all/;    # matches $all or xall
    /[\\$x]all/;   # matches \all, ball, tall, or call

12  Special char. “-” is used as the range operator
 Ex:
    /word[0-9]/;     # matches word0…word9
    /word[0-9a-z]/;  # matches word0…word9, or worda…wordz
 Special char. “^” in the first position of a character class denotes a negated character class
 Ex:
    /[^0-9]/;        # matches a non-numeric character
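
The sketch below runs the range and negation examples against a small made-up list of candidates. It also illustrates that a negated class must still match one character: plain "word" has nothing after it, so /word[^0-9]/ does not select it.

```perl
use strict;
use warnings;

my @candidates = ("word7", "wordq", "wordQ", "word");

# A digit must follow "word"
my @digit = grep { /word[0-9]/ } @candidates;

# A digit or a lowercase letter must follow "word"
my @digit_or_lower = grep { /word[0-9a-z]/ } @candidates;

# Some character that is not a digit must follow "word"
my @non_digit = grep { /word[^0-9]/ } @candidates;

print "[0-9]:    @digit\n";
print "[0-9a-z]: @digit_or_lower\n";
print "[^0-9]:   @non_digit\n";
```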

13  Common character class abbreviations:
    ◦ \d – digit, [0-9]
    ◦ \s – whitespace character, [\ \t\r\n\f]
    ◦ \w – word character (alphanumeric or _), [0-9a-zA-Z_]
    ◦ \D – negated \d
    ◦ \S – negated \s
    ◦ \W – negated \w
    ◦ . – any character but “\n”
 Abbreviations can be used inside and outside character classes
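
As a brief sketch of the abbreviations in use, the script below applies \d outside a class (a time-like hh:mm:ss pattern), \d and \s inside a class, and \W against strings with and without an underscore. The test strings are made up for illustration.

```perl
use strict;
use warnings;

# \d outside a character class: two digits, colon, two digits, colon, two digits
my $time_ok  = ("12:34:56" =~ /\d\d:\d\d:\d\d/) ? 1 : 0;
my $time_bad = ("12:3x:56" =~ /\d\d:\d\d:\d\d/) ? 1 : 0;

# Abbreviations inside a character class: a digit or a whitespace character
my $has_digit_or_space = ("a b" =~ /[\d\s]/) ? 1 : 0;
my $no_digit_or_space  = ("ab"  =~ /[\d\s]/) ? 1 : 0;

# "_" counts as a word character, "-" does not
my $underscore_nonword = ("foo_bar" =~ /\W/) ? 1 : 0;
my $hyphen_nonword     = ("foo-bar" =~ /\W/) ? 1 : 0;

print "12:34:56 looks like a time: $time_ok\n";
print "12:3x:56 looks like a time: $time_bad\n";
print "'a b' has digit or space:   $has_digit_or_space\n";
print "'ab' has digit or space:    $no_digit_or_space\n";
print "foo_bar contains \\W:        $underscore_nonword\n";
print "foo-bar contains \\W:        $hyphen_nonword\n";
```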

14  “\b” matches the boundary between a word character and a non-word character
 Ex:
    ◦ $x = "Exam1 Question from Sample Exam";
    ◦ $x =~ /Exam/;      # matches Exam in Exam1
    ◦ $x =~ /\bExam/;    # matches Exam in Exam1
    ◦ $x =~ /\bExam\b/;  # matches Exam at end of string
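
To check where each of these patterns actually matches, the sketch below records the start offset of every match using Perl's built-in @- array, where $-[0] holds the offset of the start of the last successful match. The %offset hash is an illustrative name, not something from the slides.

```perl
use strict;
use warnings;

my $x = "Exam1 Question from Sample Exam";
my %offset;

# $-[0] is the start offset of the most recent successful match
$offset{plain}   = $-[0] if $x =~ /Exam/;
$offset{leading} = $-[0] if $x =~ /\bExam/;
$offset{both}    = $-[0] if $x =~ /\bExam\b/;

print "/Exam/     starts at offset $offset{plain}\n";
print "/\\bExam/   starts at offset $offset{leading}\n";
print "/\\bExam\\b/ starts at offset $offset{both}\n";
```

The first two patterns match inside "Exam1" at the start of the string. Only the third is forced to the final "Exam", because "Exam1" has a word character ("1") right after "Exam" and therefore no boundary there.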

15  Often, we want to match against lines and ignore newline characters
 Sometimes we need to keep track of newlines
 //s – single-line matching
 //m – multi-line matching
 These modifiers affect two aspects of how the regex is interpreted:
    ◦ How the ‘.’ character class is defined
    ◦ Where the anchors, ^ and $, are able to match

16  No modifier (//) – default
    ◦ . matches all characters but \n
    ◦ ^ matches at beginning of string
    ◦ $ matches at end of string or before a newline at the end of string
 String as a single long line (//s)
    ◦ . matches any character
    ◦ ^ matches at beginning of string
    ◦ $ matches at end of string or before a newline at the end of string

17  String as multiple lines (//m)
    ◦ . matches all characters but \n
    ◦ ^ matches at beginning of any line within the string
    ◦ $ matches at end of any line within the string
 String as a single long line, but detect multiple lines (//sm)
    ◦ . matches any character
    ◦ ^ matches at beginning of any line within the string
    ◦ $ matches at end of any line within the string

18  $x = "You will know how to use Perl\nFor text processing\n";
 $x =~ /^For/;    # No match, "For" not at start of string
 $x =~ /^For/s;   # No match, "For" not at start of string
 $x =~ /^For/m;   # Match, "For" at start of second line
 $x =~ /^For/sm;  # Match, "For" at start of second line
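
The example on this slide exercises only “^”. As a complementary sketch, the script below uses a made-up two-line string to show the other two effects of the modifiers: how //s changes “.” and how //m changes “$”.

```perl
use strict;
use warnings;

my $two_lines = "first\nsecond\n";

# "." does not match "\n" by default, but does under //s
my $dot_default = ($two_lines =~ /first.second/)  ? 1 : 0;
my $dot_s       = ($two_lines =~ /first.second/s) ? 1 : 0;

# "$" matches only at the end of the string by default,
# but at the end of any line under //m
my $end_default = ($two_lines =~ /first$/)  ? 1 : 0;
my $end_m       = ($two_lines =~ /first$/m) ? 1 : 0;

# Even without //m, "$" may match before the final newline
my $end_last    = ($two_lines =~ /second$/) ? 1 : 0;

print "/first.second/  : $dot_default\n";
print "/first.second/s : $dot_s\n";
print "/first\$/        : $end_default\n";
print "/first\$/m       : $end_m\n";
print "/second\$/       : $end_last\n";
```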

19  Alternation metacharacter “|”
    ◦ Used to match different possible words or character strings
    ◦ Word 1 or word 2 -> /word1|word2/;
 Perl tries to match the regex at the earliest possible point in the string
 Ex:
    "shoes and strings" =~ /shoes|strings|and/;  # matches "shoes"
    "shoes" =~ /s|sh|sho|shoes/;                 # matches "s"
    "shoes" =~ /shoes|sho|s/;                    # matches "shoes"
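
To confirm which alternative wins in each case, the sketch below uses a small hypothetical helper, matched_text, that returns the text of the match through Perl's built-in $& variable. The last call reorders the words in the string to show that the earliest position in the string decides the match, not the order of the alternatives.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the matched text ($&), or undef if there is no match
sub matched_text {
    my ($string, $regex) = @_;
    return $string =~ $regex ? $& : undef;
}

my $first_word    = matched_text("shoes and strings", qr/shoes|strings|and/);
my $short_first   = matched_text("shoes", qr/s|sh|sho|shoes/);
my $long_first    = matched_text("shoes", qr/shoes|sho|s/);
my $earliest_wins = matched_text("strings and shoes", qr/shoes|strings|and/);

print "shoes|strings|and on 'shoes and strings': $first_word\n";
print "s|sh|sho|shoes on 'shoes':                $short_first\n";
print "shoes|sho|s on 'shoes':                   $long_first\n";
print "shoes|strings|and on 'strings and shoes': $earliest_wins\n";
```

At a single starting position the alternatives are tried from left to right, which is why the ordering matters in the second and third calls.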

20  Perl Resource 5: Perl Regular Expressions Tutorial
    ◦ http://www.cs.drexel.edu/~knowak/cs265_fall_2010/perlretut_2007.pdf
 Perl History
    ◦ http://www.xmluk.org/perl-cgi-history-information.htm
 Perl Special Variables
    ◦ http://www.kichwa.com/quik_ref/spec_variables.html
