CS 330 Programming Languages 10 / 02 / 2007 Instructor: Michael Eckmann.


Michael Eckmann - Skidmore College - CS Fall 2007

Today’s Topics
- Questions / comments?
- more on Perl
  - an exercise
  - start on regular expressions

Perl

First, everyone log in, open a terminal window, and type the following:
- mkdir cs330
- cd cs330

Save input_file_for_perl.txt into your cs330 directory, then bring up a text editor (e.g. emacs or gedit) by typing in the terminal window:
- gedit &

Enter this as the first line in your file:
    #!/usr/bin/perl

Perl

Continue and add this line below the shebang line:
    print "Hello World\n";

Save the file as helloWorld.pl

At the terminal window do the following:
    chmod 755 helloWorld.pl   (makes your script executable)
    ./helloWorld.pl           (executes your script using the shebang line)
OR
    perl helloWorld.pl        (executes your script explicitly using perl)

Perl

Let's try the following as an exercise:
- Read each line of input from the file on the notes page, input_file_for_perl.txt, until the line that has "eof" is read.
- Store each line in an element of an array (with the exception of the last line, "eof").
- Store the array into a hash.
- Then, in a loop, ask the user for input (from STDIN) for a product and output its price by looking it up in the hash. Do this until the user enters some sentinel that you make him/her aware of at the beginning.

Note: the file consists of a product name on one line and its price on the next line, then another product name on the next line and its price on the next line, and so on until the last line of the file, which contains the word: eof
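One possible shape for the exercise is sketched below. It is only a sketch under assumptions: the sample products and prices, the sentinel word quit, and the sub names read_lines and price_loop are all made up for illustration, and in-memory filehandles stand in for input_file_for_perl.txt and STDIN so the script runs without any files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read lines from a filehandle into an array, stopping at the "eof" line.
sub read_lines {
    my ($fh) = @_;
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        last if $line eq 'eof';
        push @lines, $line;
    }
    return @lines;
}

# Ask for product names until the sentinel is entered, printing each price.
sub price_loop {
    my ($in, $out, $prices, $sentinel) = @_;
    print {$out} "Enter a product name ($sentinel to stop):\n";
    while (my $name = <$in>) {
        chomp $name;
        last if $name eq $sentinel;
        if (exists $prices->{$name}) {
            print {$out} "$name costs $prices->{$name}\n";
        } else {
            print {$out} "No price found for $name\n";
        }
    }
}

# Hypothetical sample data in the format described: name, price, ..., eof
my $sample = "widget\n1.50\ngadget\n20.00\neof\n";
open my $fh, '<', \$sample or die "cannot open sample data: $!";
my @lines = read_lines($fh);
close $fh;

# An array of (name, price, name, price, ...) assigned to a hash
# becomes name => price pairs.
my %prices = @lines;

# Simulated user input; pass \*STDIN and \*STDOUT for the real exercise.
my $typed = "gadget\nsprocket\nquit\n";
open my $in, '<', \$typed or die "cannot open simulated input: $!";
my $report = '';
open my $out, '>', \$report or die "cannot open report buffer: $!";
price_loop($in, $out, \%prices, 'quit');
close $out;
close $in;

print $report;
```

To read the real file instead, open it by name in place of \$sample and call price_loop(\*STDIN, \*STDOUT, \%prices, 'quit').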

Perl

Matching string basics
- =~ (matches)
- m/ / (this is the format of the match regular expression)
- Note: you put the regular expression in between the / /

The format of a complete expression, which will evaluate to true or false depending on whether the string matched or not (anywhere within the string), is as follows:
    ($some_string =~ m/RegExprToMatch/)

Examples:
    ("CS330 Programming Languages" =~ m/Prog/)   # true
    ("CS330 Programming Languages" =~ m/CS106/)  # false
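The two matching examples can be dropped into a small script to see the results printed. This is a minimal sketch; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $course = "CS330 Programming Languages";

# The match expression is true if the pattern occurs anywhere in the string.
my $has_prog  = ($course =~ m/Prog/)  ? "true" : "false";
my $has_cs106 = ($course =~ m/CS106/) ? "true" : "false";

print "m/Prog/  -> $has_prog\n";    # prints: m/Prog/  -> true
print "m/CS106/ -> $has_cs106\n";   # prints: m/CS106/ -> false
```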

Perl

Matching string basics
- !~ (doesn't match)

Examples:
    ("CS330 Programming Languages" !~ m/Prog/)   # false
    ("CS330 Programming Languages" !~ m/CS106/)  # true

You may use any other character as a delimiter instead of the /'s if you wish, e.g. m#Prog# could be used instead of m/Prog/

The m for the m/ / matching operator is optional if we use the /'s as the delimiters.

To match a / you have two choices: either use a different delimiter and just put the slash in there, or backslash the slash to make it an escape sequence.
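A short sketch of these points follows: the negated match, an alternative delimiter, the optional m, and the two ways to match a slash. The path string and variable names are assumptions chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $course = "CS330 Programming Languages";
my $path   = "/usr/bin/perl";   # hypothetical string containing slashes

# !~ is true when the pattern does NOT match.
my $lacks_cs106 = ($course !~ m/CS106/) ? "true" : "false";

# Any other delimiter works with m; the m may be dropped with / /.
my $hash_delim = ($course =~ m#Prog#) ? "true" : "false";
my $no_m       = ($course =~ /Prog/)  ? "true" : "false";

# Two ways to match a literal slash.
my $other_delim = ($path =~ m#/bin/#)  ? "true" : "false";
my $backslashed = ($path =~ /\/bin\//) ? "true" : "false";

print "lacks CS106:  $lacks_cs106\n";
print "m#Prog#:      $hash_delim\n";
print "/Prog/:       $no_m\n";
print "m#/bin/#:     $other_delim\n";
print "/\\/bin\\//:    $backslashed\n";
```

All five lines report true.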

Perl

More information:
- Variables can be put inside the / / search pattern and they are interpolated.
- =~ can be omitted if matching the $_ special default variable.
- Metacharacters {}[]()^$.|*+?\ have special meaning inside the m/ /. If you wish to match those characters you need to backslash them. Also, regardless of the delimiter, if you wish to match what you used as the delimiter, guess what you need to do?
- ^ forces the match to be required to be at the very beginning of the string
- $ forces the match to be required to be at the very end of the string

Examples:
    ("CS330 Programming Languages" =~ m/^Prog/)   # false, not at beginning
    ("CS330 Programming Languages" =~ m/ges$/)    # true, ges is at the end
    ("CS330 Programming Languages" =~ m/^ges$/)   # false, ges not the whole string
    ("CS330" =~ m/^CS330$/)                       # true, CS330 is the complete string
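The following sketch exercises interpolation, the $_ default, backslashed metacharacters, and the two anchors. The price string and all variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $course = "CS330 Programming Languages";

# A variable inside the pattern is interpolated before matching.
my $word         = "Prog";
my $interpolated = ($course =~ /$word/) ? "true" : "false";

# With no =~, the pattern is matched against $_ (set here by the for loop).
my $default_var;
for ("CS330") {
    $default_var = /^CS330$/ ? "true" : "false";
}

# Metacharacters such as $ . ( ) must be backslashed to match literally.
my $note    = 'Total: $5.00 (approx.)';   # single quotes: no interpolation
my $dollar  = ($note =~ /\$5\.00/)      ? "true" : "false";
my $parens  = ($note =~ /\(approx\.\)/) ? "true" : "false";

# Anchors: ^ ties the match to the start, $ to the end.
my $starts_prog = ($course =~ /^Prog/)  ? "true" : "false";
my $ends_ges    = ($course =~ /ges$/)   ? "true" : "false";
my $only_ges    = ($course =~ /^ges$/)  ? "true" : "false";

print "interpolated: $interpolated\n";   # true
print "default var:  $default_var\n";    # true
print "dollar sign:  $dollar\n";         # true
print "parentheses:  $parens\n";         # true
print "^Prog:        $starts_prog\n";    # false
print "ges\$:         $ends_ges\n";       # true
print "^ges\$:        $only_ges\n";       # false
```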

Perl

More information:
- Character classes are put between [ ] square brackets within the pattern. This means any one character in the [ ] is able to match.
- May use ^ (not) to not match a char inside the character class
- May use - (hyphen) to specify a range of characters for the class

    "Hay is for horses" =~ m/[A-Z]ay/;
    # matches if the string contains any capital letter followed by ay

    "The boardgame Payday is lame" =~ m/[A-Z]ay/;
    # this would be true too, it matches Pay

    "The boardgame Payday is lame" =~ m/[^A-Z]ay/;
    # this would be true too, why? It matches day

These are special variables that are set for a match: $` $& and $' (left, matched, right)

Let's put this code in a program and print out the variable values to see what matches.
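One way to put this in a program is sketched here. The helper match_parts is hypothetical, and it takes its pattern as a qr// object, which is just a convenient way to pass a compiled pattern to a sub.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return (left, matched, right) for the first match, or an empty list.
sub match_parts {
    my ($string, $pattern) = @_;
    if ($string =~ $pattern) {
        # Copy the special variables right away; the next match resets them.
        my ($left, $matched, $right) = ($`, $&, $');
        return ($left, $matched, $right);
    }
    return;
}

my $game = "The boardgame Payday is lame";

my @hay       = match_parts("Hay is for horses", qr/[A-Z]ay/);
my @upper     = match_parts($game, qr/[A-Z]ay/);
my @not_upper = match_parts($game, qr/[^A-Z]ay/);

for my $result (\@hay, \@upper, \@not_upper) {
    my ($left, $matched, $right) = @$result;
    print "left=[$left] matched=[$matched] right=[$right]\n";
}
# prints:
# left=[] matched=[Hay] right=[ is for horses]
# left=[The boardgame ] matched=[Pay] right=[day is lame]
# left=[The boardgame Pay] matched=[day] right=[ is lame]
```

The last line shows why [^A-Z]ay succeeds: the P of Pay is rejected by the negated class, so the first place the pattern fits is day.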

Perl

There are ways to specify common character classes:
    \d (any digit)
    \s (any whitespace: space \t \r \n \f)
    \w (any "word" character (a digit, letter or underscore))
    \D (any non-digit)
    \S (any non-whitespace)
    \W (any non-word character)
    .  (any character other than newline \n)

The backslash classes can be used within the square brackets or without. Inside square brackets, . is just a literal period.
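A few quick checks of these shorthand classes, as a sketch; the test strings are invented, and the + used in two patterns means "one or more" of the preceding class.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "Room 42\tis open";   # hypothetical line with a space and a tab

my $has_digit    = ($line =~ /\d/)          ? "true" : "false";
my $has_space    = ($line =~ /\s/)          ? "true" : "false";
my $all_word     = ("abc_123" =~ /^\w+$/)   ? "true" : "false";
my $has_nondigit = ("2007" =~ /\D/)         ? "true" : "false";
my $has_nonword  = ("abc_123" =~ /\W/)      ? "true" : "false";

# . matches any single character except newline.
my $dot_dash     = ("a-b" =~ /a.b/)         ? "true" : "false";
my $dot_newline  = ("a\nb" =~ /a.b/)        ? "true" : "false";

# Shorthand classes also work inside square brackets.
my $digits_and_ws = ("3 4" =~ /^[\d\s]+$/)  ? "true" : "false";

print "has digit:      $has_digit\n";      # true
print "has whitespace: $has_space\n";      # true
print "all word chars: $all_word\n";       # true
print "2007 has \\D:    $has_nondigit\n";   # false
print "abc_123 has \\W: $has_nonword\n";    # false
print "a.b vs a-b:     $dot_dash\n";       # true
print "a.b vs a\\nb:    $dot_newline\n";    # false
print "[\\d\\s]+ on 3 4: $digits_and_ws\n"; # true
```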

Perl

Modifiers are characters that go after the second forward slash.
- i is a modifier for ignore case.
- The behaviour for no modifier (the default) is that:
  - . matches any non-newline character
  - ^ matches at the beginning of the string
  - $ matches at the end of the string (or before a newline at the end)
- s modifier: treats the string as a single long line, so . matches any character including newline
- m modifier: treats the string as multiple lines, so ^ and $ match the beginning or end of any line. But now, \A matches the beginning of the whole string and \Z matches the end of the whole string (or before a newline at the end).

Let’s look at this webpage’s examples under Using Character Classes for some more examples:
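The effect of each modifier can be checked on a small two-line string. This is a sketch with an invented test string and variable names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";   # hypothetical two-line string

# i: ignore case
my $ignore_case = ("PERL" =~ /perl/i) ? "true" : "false";

# s: lets . match a newline as well
my $dot_default = ($text =~ /line.second/)  ? "true" : "false";
my $dot_s       = ($text =~ /line.second/s) ? "true" : "false";

# m: lets ^ and $ match at the start and end of every line
my $caret_default  = ($text =~ /^second/)      ? "true" : "false";
my $caret_m        = ($text =~ /^second/m)     ? "true" : "false";
my $dollar_default = ($text =~ /first line$/)  ? "true" : "false";
my $dollar_m       = ($text =~ /first line$/m) ? "true" : "false";

# \A and \Z still refer to the whole string, even under m
my $whole_start = ($text =~ /\Asecond/m)      ? "true" : "false";
my $whole_end   = ($text =~ /second line\Z/m) ? "true" : "false";

print "/perl/i:          $ignore_case\n";     # true
print "/line.second/:    $dot_default\n";     # false
print "/line.second/s:   $dot_s\n";           # true
print "/^second/:        $caret_default\n";   # false
print "/^second/m:       $caret_m\n";         # true
print "/first line\$/:    $dollar_default\n";  # false
print "/first line\$/m:   $dollar_m\n";        # true
print "/\\Asecond/m:      $whole_start\n";     # false
print "/second line\\Z/m: $whole_end\n";       # true
```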

Perl

- | alternation character (acts sort of like a logical or)
- Grouping characters using the parentheses
- Getting the “submatches” by using the $1, $2, $3, etc. variables, which are set via the parentheses.
- Repetitions
    ?   - 0 or 1 time
    *   - 0 or more times
    +   - 1 or more times
    { } - min and max, at least, or exactly

Let’s continue looking at this site for examples:
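Here is a short sketch of alternation, grouping with submatches, and the repetition operators. The date string and the other test strings are assumptions for illustration; the slashes in the date pattern are backslashed because / is the delimiter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Grouping: each pair of parentheses sets $1, $2, $3 in order.
my $date = "10/02/2007";
my ($month, $day, $year) = ("", "", "");
if ($date =~ /^(\d{2})\/(\d{2})\/(\d{4})$/) {
    ($month, $day, $year) = ($1, $2, $3);
}
print "month=$month day=$day year=$year\n";   # month=10 day=02 year=2007

# Alternation inside a group: $1 holds whichever alternative matched.
my $pet = "";
if ("dog" =~ /^(cat|dog)$/) {
    $pet = $1;
}
print "pet=$pet\n";                           # pet=dog

# Repetitions
my $optional_u = ("color" =~ /^colou?r$/)  ? "true" : "false";  # ? : 0 or 1
my $zero_b     = ("a"     =~ /^ab*$/)      ? "true" : "false";  # * : 0 or more
my $need_one_b = ("a"     =~ /^ab+$/)      ? "true" : "false";  # + : 1 or more
my $two_to_3   = ("abbb"  =~ /^ab{2,3}$/)  ? "true" : "false";  # {min,max}
my $too_many   = ("abbbb" =~ /^ab{2,3}$/)  ? "true" : "false";
my $at_least_2 = ("abbbb" =~ /^ab{2,}$/)   ? "true" : "false";  # {min,}
my $exactly_2  = ("abb"   =~ /^ab{2}$/)    ? "true" : "false";  # {n}

print "colou?r on color:  $optional_u\n";   # true
print "ab* on a:          $zero_b\n";       # true
print "ab+ on a:          $need_one_b\n";   # false
print "ab{2,3} on abbb:   $two_to_3\n";     # true
print "ab{2,3} on abbbb:  $too_many\n";     # false
print "ab{2,} on abbbb:   $at_least_2\n";   # true
print "ab{2} on abb:      $exactly_2\n";    # true
```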

Perl

Recap on the special variables we learned:
- $_
- $` $& and $' (left, match, right)
- $0 (program name)
- $1, $2, $3, ... (the submatches)

Perl

Let’s look at a larger parsing example using many of the features we just learned: The “doing string selections” section of:

The following page is a good page for reference. It is a nice summary of the different characters and their meanings with succinct examples: