CS 330 Programming Languages 09 / 30 / 2008 Instructor: Michael Eckmann.


1 CS 330 Programming Languages 09 / 30 / 2008 Instructor: Michael Eckmann

2 Michael Eckmann - Skidmore College - CS 330 - Fall 2008
Today’s Topics
Questions / comments?
Anyone try the tutorial? I highly recommend you do if you haven't.
Perl
–Exercise
–Regular expressions

3 Perl
Let's go to the Linux lab and do an exercise and some more lecture.

4 Perl
First, everyone log in, open a terminal window, and type the following:
–mkdir cs330
–cd cs330
Save input_file_for_perl.txt into your cs330 directory, then bring up a text editor (e.g. emacs or gedit) by typing in the terminal window:
–gedit &
Enter this as the first line in your file:
#!/usr/bin/perl

5 Perl
Continue and add this line below the shebang line:
print "Hello World\n";
Save the file as helloWorld.pl.
At the terminal window do the following:
chmod 755 helloWorld.pl (makes your script executable)
./helloWorld.pl (executes your script using the shebang line)
OR
perl helloWorld.pl (executes your script explicitly using perl)

6 Perl
Let's try the following as an exercise:
–read each line of input from the file on the notes page, input_file_for_perl.txt, until the line that has "eof" is read
–store each line in an element of an array (with the exception of the last line, "eof")
–store the array into a hash (can be done with one line of code)
–then, in a loop, ask the user for input (from STDIN) for a product and output its price by looking it up in the hash. Do this until the user enters some sentinel that you make him/her aware of at the beginning.
Note: the file consists of a product name on one line and its price on the next line, then another product name on the next line and its price on the next line, and so on until the last line of the file, which contains the word: eof
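One way the exercise steps could be put together is sketched below. This is only a sketch under assumptions, not the course's solution: the sample data stands in for input_file_for_perl.txt (its real contents are not shown here), the sentinel "quit" is an arbitrary choice, and the sub names read_products and price_loop are made up for illustration. The queries come from an in-memory filehandle so the script runs unattended.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read lines from a filehandle until the sentinel line "eof" is seen.
sub read_products {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        last if $line eq 'eof';
        push @lines, $line;
    }
    return @lines;
}

# Answer product queries from $in until the sentinel "quit" (a made-up choice).
sub price_loop {
    my ($in, $prices) = @_;
    print "Enter a product name (quit to stop):\n";
    while (defined(my $product = <$in>)) {
        chomp $product;
        last if $product eq 'quit';
        if (exists $prices->{$product}) {
            print "$product costs $prices->{$product}\n";
        } else {
            print "No price on file for $product\n";
        }
    }
}

# Invented sample data in the file's format: name, price, name, price, ..., eof.
my $sample = "apple\n0.50\nbread\n2.25\neof\n";
open(my $fh, '<', \$sample) or die "cannot open in-memory file: $!";
my @lines = read_products($fh);
close $fh;

# A list of name, price, name, price, ... becomes a hash in one assignment.
my %price_of = @lines;

# In-memory queries instead of the keyboard.
my $queries = "bread\npear\nquit\n";
open(my $in, '<', \$queries) or die "cannot open in-memory input: $!";
price_loop($in, \%price_of);
close $in;
```

Passing \*STDIN as the first argument of price_loop would give the interactive version the exercise asks for, and opening the real file by name would replace the in-memory sample.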

7 Perl
Matching string basics
=~ (matches)
m/ / (this is the format of the match regular expression)
Note: you put the regular expression in between the / /
The format of a complete expression, which evaluates to true or false depending on whether the string matched or not (anywhere within the string), is as follows:
($some_string =~ m/RegExprToMatch/)
Examples:
("CS330 Programming Languages" =~ m/Prog/) # true
("CS330 Programming Languages" =~ m/CS106/) # false

8 Perl
Matching string basics
!~ (doesn't match)
Examples:
("CS330 Programming Languages" !~ m/Prog/) # false
("CS330 Programming Languages" !~ m/CS106/) # true
You may use another character as the delimiter instead of the /'s if you wish, e.g. m#Prog# could be used instead of m/Prog/
The m of the m/ / matching operator is optional if we use the /'s as the delimiters.
To match a / you have two choices: either use a different delimiter and then just put the slash in the pattern, or backslash the slash to make it an escape sequence.
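The matching basics above can be tried in a short script. This is a small sketch; the $path value is just an illustrative string.

```perl
use strict;
use warnings;

my $course = "CS330 Programming Languages";

# =~ is true when the pattern matches anywhere in the string.
print "contains Prog\n"     if $course =~ m/Prog/;
# !~ is true when the pattern does not match.
print "no CS106 in there\n" if $course !~ m/CS106/;

# With / as the delimiter the leading m may be dropped.
print "m is optional\n"     if $course =~ /Prog/;

my $path = "/usr/bin/perl";
# Two ways to match a literal slash: another delimiter, or a backslash.
print "hash delimiter\n"    if $path =~ m#/bin/#;
print "escaped slash\n"     if $path =~ m/\/bin\//;
```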

9 Perl
More information:
Variables can be put inside the / / search pattern and they are interpolated.
=~ can be omitted if matching against the $_ special default variable.
The metacharacters {}[]()^$.|*+?\ have special meaning inside the m/ /. If you wish to match those characters literally you need to backslash them. Also, regardless of the delimiter, if you wish to match the character you used as the delimiter, guess what you need to do?
^ forces the match to be at the very beginning of the string
$ forces the match to be at the very end of the string
Examples:
("CS330 Programming Languages" =~ m/^Prog/) # false, not at beginning
("CS330 Programming Languages" =~ m/ges$/) # true, ges is at the end
("CS330 Programming Languages" =~ m/^ges$/) # false, ges not the whole string
("CS330" =~ m/^CS330$/) # true, CS330 is the complete string
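A short sketch of interpolation, matching against $_, and escaping metacharacters follows. The variable $word and the sample strings are invented for illustration.

```perl
use strict;
use warnings;

# A variable inside the pattern is interpolated before matching.
my $word = "Languages";
print "interpolated match\n" if "CS330 Programming Languages" =~ m/$word$/;

# With no =~, the match is applied to $_ (set here by the for loop).
for ("CS330", "CS106") {
    print "$_ is this course\n" if m/^CS330$/;
}

# Metacharacters must be backslashed to be matched literally.
print "literal dot and plus\n" if "price: 3.50+tax" =~ m/3\.50\+tax/;
# Unescaped, the dot matches any character, so this matches too.
print "dot as wildcard\n"      if "3x50" =~ m/3.50/;
```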

10 Perl
More information:
Character classes are put between [ ] square brackets within the pattern. This means any one (exactly one) character in the [ ] is able to match.
May use ^ (not) to not match a char inside the character class
May use - (hyphen) to specify a range of characters for the class
"Hay is for horses" =~ m/[A-Z]ay/; # matches if the string contains any capital letter followed by ay
"The boardgame Payday is lame" =~ m/[A-Z]ay/; # this would be true too, it matches Pay
"The boardgame Payday is lame" =~ m/[^A-Z]ay/; # this would be true too, why?

11 Perl
More information:
"The boardgame Payday is lame" =~ m/[^A-Z]ay/; # this would be true too, why? It matches day
These are special variables that are set for a match: $` $& and $' (left, matched, right)
Let's put this code in a program and print out the variable values to see what matches.
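A minimal program along those lines might look like this; the square brackets in the output are only there to make leading and trailing spaces visible.

```perl
use strict;
use warnings;

my $text = "The boardgame Payday is lame";

# A capital letter followed by "ay": the leftmost match is "Pay".
if ($text =~ m/[A-Z]ay/) {
    print "left:    [$`]\n";
    print "matched: [$&]\n";
    print "right:   [$']\n";
}

# Any character that is NOT a capital letter, followed by "ay".
if ($text =~ m/[^A-Z]ay/) {
    print "matched: [$&]\n";
}
```

The first match prints "The boardgame " as the left part, Pay as the match, and "day is lame" as the right part; the second prints day.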

12 Perl
There are ways to specify common character classes:
\d (any digit)
\s (any whitespace: space \t \r \n \f)
\w (any "word" character: a digit, letter or underscore)
\D (any non-digit)
\S (any non-whitespace)
\W (any non-word character)
. (any character other than newline \n)
These can be used within the square brackets or without.
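The shorthand classes can be tried on a few strings, as in this sketch. The sample strings are made up, and the + quantifier (one or more) used in two of the patterns is covered a few slides later.

```perl
use strict;
use warnings;

for my $s ("abc123", "hello world", "under_score", "12 34") {
    print "[$s] contains a digit\n"              if $s =~ m/\d/;
    print "[$s] contains whitespace\n"           if $s =~ m/\s/;
    print "[$s] is only word characters\n"       if $s =~ m/^\w+$/;
    print "[$s] contains a non-word character\n" if $s =~ m/\W/;
    # Shorthand classes also work inside square brackets.
    print "[$s] is only digits and whitespace\n" if $s =~ m/^[\d\s]+$/;
}
```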

13 Perl
Modifiers are characters that go after the second forward slash.
i is the modifier for ignore case.
The behaviour for no modifier (the default) is that:
. matches any non-newline character
^ matches at the beginning of the string
$ matches at the end of the string (or before a newline at the end)
s modifier: treats the string as a single long line, so . matches any character including newline
m modifier: treats the string as multiple lines, so ^ and $ match at the beginning or end of any line. With m in effect, \A matches the beginning of the whole string and \Z matches the end of the whole string (or before a newline at the end).
Let’s look at this webpage’s examples under Using Character Classes for some more examples:
http://www.cs.rit.edu/~afb/20013/plc/perl5/doc/perlretut.html
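The effect of each modifier can be checked against a two-line string, as in this sketch (the string itself is made up):

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

print "i: case is ignored\n"                   if "PERL" =~ m/perl/i;

# By default . does not match a newline; with s it does.
print "default: . stops at the newline\n"      if $text !~ m/first.*second/;
print "s: . crosses the newline\n"             if $text =~ m/first.*second/s;

# By default ^ matches only at the start of the string; with m, at each line.
print "default: ^ is the string start only\n"  if $text !~ m/^second/;
print "m: ^ matches at the start of a line\n"  if $text =~ m/^second/m;

# \A and \Z refer to the whole string even when m is in effect.
print "m: \\A is still the string start\n"     if $text !~ m/\Asecond/m;
print "m: \\Z is the end of the whole string\n" if $text =~ m/line\Z/m;
```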

14 Perl
| alternation character (acts sort of like a logical or)
Grouping characters using the parentheses
Getting the "submatches" by using the $1, $2, $3, etc. variables, which are set via the parentheses.
Repetitions
? - 0 or 1 time
* - 0 or more times
+ - 1 or more times
{ } - min and max, at least, or exactly
{min,max} - match >=min times and at most max times
{min,} - match >=min times
{n} - match n times exactly

15 Perl
Curly braces { } - min and max, at least, or exactly
{min,max} - match >=min times and at most max times
{5,10} - matches between 5 and 10 times inclusive
{min,} - match >=min times
{3,} - matches 3 or more times
{n} - match n times exactly
{6} - matches exactly 6 times
Examples of a repetition quantifier after a grouping and after a character:
m/(the){3}/ this will match thethethe, all consecutively
m/the{3}/ this will match theee (only the e is repeated 3 times)
m/the.*the.*the/ this will match 3 the’s with any characters (except \n) between them
Any other way to write it?
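Alternation, grouping and the quantifiers can be exercised together, as in the sketch below. The last pattern, m/(the.*){3}/, is offered only as one possible answer to the question above, not necessarily the one intended in class.

```perl
use strict;
use warnings;

# Alternation inside groups; each pair of parentheses sets $1, $2, ...
if ("cs330 meets today" =~ m/(cs|CS)(106|330)/) {
    print "dept=$1 number=$2\n";
}

# A quantifier applies to the group or single character just before it.
print "(the){3} matches thethethe\n"      if "thethethe" =~ m/(the){3}/;
print "the{3} matches theee\n"            if "theee"     =~ m/the{3}/;
print "the{3} does not match thethethe\n" if "thethethe" !~ m/the{3}/;

# ? makes the preceding character optional.
print "u? is optional\n" if "color" =~ m/colou?r/ && "colour" =~ m/colou?r/;

# {min,max} bounds the number of repetitions.
print "a{2,4} accepts aaa but not a\n"
    if "aaa" =~ m/^a{2,4}$/ && "a" !~ m/^a{2,4}$/;

# One possible rewrite of m/the.*the.*the/ using a group and a count.
print "three the's found\n" if "the cat, the hat, the bat" =~ m/(the.*){3}/;
```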

16 Perl
In terms of regular expression repetition quantifiers, what does greedy mean again?

17 Perl
In terms of regular expression repetition quantifiers, what does greedy mean?
A quantifier is greedy if it matches as much of the string as possible while still allowing the whole regular expression to match.
We'll see that greediness in action now. Let’s continue looking at this site for examples of matching repetitions and the 4 principles that are followed:
http://www.cs.rit.edu/~afb/20013/plc/perl5/doc/perlretut.html
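Greediness is easiest to see by capturing what a quantifier consumed. The following sketch uses a few sample strings chosen for illustration:

```perl
use strict;
use warnings;

# The first .* takes as much as it can while still letting "at" match,
# so it runs up to the LAST "at" in the string.
if ("the cat in the hat" =~ m/^(.*)(at)(.*)$/) {
    print "1=[$1] 2=[$2] 3=[$3]\n";   # prints 1=[the cat in the h] 2=[at] 3=[]
}

# {2,3} prefers three repetitions when three are available.
if ("aaaa" =~ m/(a{2,3})/) {
    print "matched [$1]\n";           # prints matched [aaa]
}

# .* before a literal: the match extends to the last semicolon.
if ("x=1; y=2;" =~ m/(.*);/) {
    print "captured [$1]\n";          # prints captured [x=1; y=2]
}
```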

18 Perl
Recap on the special variables we learned:
$_ (default variable)
$` $& and $' (left, match, right)
$0 (program name)
$1, $2, $3, ... (the submatches)

19 Perl
Let's write a few regular expressions.
Match any signed or unsigned integer of arbitrary length, e.g. it should match:
-22, 4567, 1, +43
but not things like:
-, +, 4.56, abcd, etc.
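One pattern that fits this description is sketched below. It is a possible answer rather than the one worked out in class, and the sub name is_integer is made up. It assumes the whole string must be the integer, so it anchors with ^ and $; note that $ also tolerates a single trailing newline.

```perl
use strict;
use warnings;

# Optional sign, then one or more digits, and nothing else.
sub is_integer {
    my ($s) = @_;
    return $s =~ m/^[+-]?\d+$/ ? 1 : 0;
}

for my $candidate ("-22", "4567", "1", "+43", "-", "+", "4.56", "abcd") {
    printf "%-5s %s\n", $candidate,
        is_integer($candidate) ? "integer" : "not an integer";
}
```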

20 Perl
Let's try these:
1) ignore beginning whitespace if there is any, and match the word program and store the rest of the string (after the word program) into some variable.
2) Now what if there were \n's in the string? What might we change?
3) cs330 or cs106 or CS106 or CS330 but not Cs330, or cS106 etc.

21 Perl
1) ignore beginning whitespace if there is any, and match the word program and store the rest of the string (after the word program) into some variable.
m/\s*program(.*)/
2) Now what if there were \n's in the string? What might we change?
m/\s*program(.*)/s
3) cs330 or cs106 or CS106 or CS330 but not Cs330, or cS106 etc.
m/cs330|cs106|CS330|CS106/
OR
m/(cs|CS)(106|330)/
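These three answers can be tried out as follows. The sub names rest_after_program and is_course and the sample strings are invented for the sketch; the patterns are the ones given on the slide.

```perl
use strict;
use warnings;

# Answers 1 and 2: capture everything after the word program.
# The second argument selects the s modifier so . also matches \n.
sub rest_after_program {
    my ($s, $dot_matches_newline) = @_;
    if ($dot_matches_newline) {
        return $s =~ m/\s*program(.*)/s ? $1 : undef;
    }
    return $s =~ m/\s*program(.*)/ ? $1 : undef;
}

# Answer 3: lowercase or uppercase prefix, but not mixed case.
sub is_course {
    my ($s) = @_;
    return $s =~ m/(cs|CS)(106|330)/ ? 1 : 0;
}

my $two_lines = "   program one\ntwo";
my $without_s = rest_after_program($two_lines, 0);
my $with_s    = rest_after_program($two_lines, 1);
print "without s: [$without_s]\n";   # stops at the newline
print "with s:    [$with_s]\n";      # runs to the end of the string

for my $code ("cs330", "CS106", "Cs330", "cS106") {
    print "$code: ", (is_course($code) ? "match" : "no match"), "\n";
}
```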

22 Perl
Let’s look at a larger parsing example using many of the features we just learned. We'll read the problem and try to solve it ourselves before looking at the solution. See the "doing string selections" section of:
http://www.troubleshooters.com/codecorn/littperl/perlreg.htm
The following page is a good page for reference. It is a nice summary of the different characters and their meanings with succinct examples:
http://www.cs.tut.fi/~jkorpela/perl/regexp.html

