Programming Perl in UNIX Course Number : CIT 370 Week 4 Prof. Daniel Chen.


Introduction
- Review and Overviews
- Chapters 7 and 8
- Summary
- Lab
- Mid-term Exam
- Next Week (Week 5)

Topics of Discussion
- What Is a Regular Expression?
- Expression Modifiers and Simple Statements
- Regular Expression Operators
- Regular Expression Metacharacters
- Unicode

Chapter 7: Regular Expressions – Pattern Matching
- 7.1 What Is a Regular Expression?
- 7.2 Expression Modifiers and Simple Statements
- 7.3 Regular Expression Operators

7.1 What Is a Regular Expression?
- A regular expression is a sequence or pattern of characters that is matched against a string of text when performing searches and replacements.
- Example 7.1: /abc/ ?abc?
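
To make the definition concrete, here is a minimal sketch of a pattern used for a search and for a replacement. The sample string and variable names are hypothetical and are not taken from the textbook examples.

```perl
use strict;
use warnings;

# Hypothetical sample text; not one of the textbook examples.
my $line = "The alphabet starts with abc.";

# /abc/ succeeds if the characters a, b, c appear in sequence anywhere in $line.
my $found = ($line =~ /abc/) ? "match" : "no match";
print "$found\n";

# A substitution uses the same kind of pattern to decide what to replace.
(my $changed = $line) =~ s/abc/ABC/;
print "$changed\n";
```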

7.2 Expression Modifiers and Simple Statements
- Conditional Modifiers
  - The if Modifier
  - Format: Expression2 if Expression1;
  - Examples: 7.2, 7.3, 7.4
- The DATA Filehandle
  - Format:
      __DATA__
      The actual data is stored here
  - Examples: 7.5, 7.6

7.2 Expression Modifiers and Simple Statements
- The unless Modifier
  - Format: Expression2 unless Expression1;
  - Examples: 7.7, 7.8
- Looping Modifiers
  - The while Modifier
  - Format: Expression2 while Expression1;
  - Example: 7.9
- The until Modifier
  - Example: 7.10
- The foreach Modifier
  - Example: 7.11
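
The modifier forms listed in section 7.2 can be sketched in one short script. This is an illustrative sketch with made-up variable names, not a reproduction of Examples 7.2 to 7.11. In every case the statement on the left is controlled by the condition or list on the right.

```perl
use strict;
use warnings;

my @log;                      # collects messages so the flow is easy to follow
my $name = "Perl";            # hypothetical sample value

# if modifier: the statement on the left runs only when the condition is true.
push @log, "if ran" if $name =~ /Perl/;

# unless modifier: the statement runs only when the condition is false.
push @log, "unless ran" unless $name =~ /Python/;

# while modifier: repeats the statement while the condition holds.
my $count = 0;
$count++ while $count < 3;
push @log, "while counted to $count";

# until modifier: repeats the statement until the condition becomes true.
my $down = 3;
$down-- until $down <= 0;
push @log, "until counted down to $down";

# foreach modifier: runs the statement once per list element, aliased to $_.
my @squares;
push @squares, $_ * $_ foreach (1, 2, 3);
push @log, "squares: @squares";

print "$_\n" for @log;
```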

7.3 Regular Expression Operators
- The m Operator and Matching
  - Format:
      /Regular Expression/
      m#Regular Expression#
      m(regular expression)
  - Table 7.1
  - Examples: 7.12, 7.13, 7.14, 7.15, 7.16
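
The three delimiter formats behave the same way; the choice mainly affects which characters must be escaped. A minimal sketch, assuming a hypothetical path string:

```perl
use strict;
use warnings;

my $path = "/usr/local/bin/perl";   # hypothetical sample string

# With the default slash delimiters the m is optional,
# but slashes inside the pattern must be escaped.
my $with_slashes = $path =~ /\/local\// ? 1 : 0;

# With m and another delimiter, the slashes need no escaping.
my $with_hash = $path =~ m#/local/# ? 1 : 0;

# Paired delimiters such as parentheses or braces also work.
my $with_parens = $path =~ m(/local/)   ? 1 : 0;
my $with_braces = $path =~ m{/bin/perl} ? 1 : 0;

print "$with_slashes $with_hash $with_parens $with_braces\n";
```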

7.3 Regular Expression Operators
- The g Modifier: Global Match
  - Format: m/search pattern/g
  - Example: 7.17
- The i Modifier: Case Insensitivity
  - Format: m/search pattern/i
  - Example: 7.18
- Special Scalars for Saving Patterns
  - Example: 7.19
- The x Modifier: Extended (Expressive) Patterns, in which whitespace and comments are allowed
  - Example: 7.20
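
The following sketch shows the match modifiers and the special scalars side by side. The sample strings are hypothetical. Note that x does not make a match global; it lets whitespace and comments appear inside the pattern.

```perl
use strict;
use warnings;

my $text = "Tom met tom and TOM";   # hypothetical sample string

# g in list context returns every match; without i only the exact case matches.
my @exact = $text =~ /tom/g;        # one element
my @any   = $text =~ /tom/gi;       # three elements

# After a successful match, $` $& and $' hold the text
# before the match, the matched text, and the text after it.
my ($before, $matched, $after);
if ($text =~ /met/) {
    ($before, $matched, $after) = ($`, $&, $');
}

# x lets the pattern contain whitespace and comments, which are ignored.
my $is_time = "10:45" =~ /
    ^ \d\d      # two digits for the hour
    :           # a literal colon
    \d\d $      # two digits for the minute
/x ? 1 : 0;

print "@exact | @any | [$before][$matched][$after] | $is_time\n";
```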

7.3 Regular Expression Operators
- The s Operator and Substitution
  - Format:
      s/old/new/;
      s/old/new/i;
      s/old/new/g;
  - Table 7.2
  - Examples: 7.21, 7.22, 7.23
- Changing the Substitution Delimiters
  - Examples: 7.24, 7.25
- The g Modifier: Global Substitution
  - Examples: 7.26, 7.27
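
A short sketch of the substitution forms, using hypothetical strings. It shows the difference between replacing the first occurrence and replacing all of them, and why a different delimiter is convenient for paths.

```perl
use strict;
use warnings;

# Without g, only the first occurrence is replaced.
my $first_only = "old car, old house";
$first_only =~ s/old/new/;

# With g, every occurrence is replaced.
my $all = "old car, old house";
$all =~ s/old/new/g;

# A different delimiter avoids escaping the slashes in a path.
my $path = "/usr/bin/perl";
$path =~ s#/usr/bin#/usr/local/bin#;

# s returns the number of substitutions it made.
my $line  = "a-b-c-d";
my $count = ($line =~ s/-/+/g);

print "$first_only\n$all\n$path\n$line ($count changes)\n";
```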

7.3 Regular Expression Operators
- The i Modifier: Case Insensitivity
  - Format: s/search pattern/replacement string/i;
  - Examples: 7.28, 7.29
- The e Modifier: Evaluating an Expression
  - Format: s/search pattern/replacement string/e;
  - Examples: 7.30, 7.31, 7.32, 7.33
- Pattern Binding Operators
  - Format:
      variable =~ /Expression/
      variable !~ /Expression/
      variable =~ s/old/new/
  - Table 7.3
  - Examples: 7.34, 7.35, 7.36, 7.37, 7.38, 7.39
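
The i and e modifiers and the binding operators can be tried out as follows. This is a sketch with invented sample data, not one of the numbered textbook examples.

```perl
use strict;
use warnings;

# i: the search pattern matches regardless of case.
my $greeting = "Hello PERL";
$greeting =~ s/perl/Perl/i;

# e: the replacement side is evaluated as Perl code, here doubling each number.
my $order = "3 apples and 12 pears";
$order =~ s/(\d+)/$1 * 2/ge;

# =~ is true when the pattern matches; !~ is true when it does not.
my $name      = "Daniel";
my $has_dan   = ($name =~ /Dan/) ? 1 : 0;
my $lacks_bob = ($name !~ /Bob/) ? 1 : 0;

# Without a binding operator, m// and s/// operate on $_.
my @words = ("cat", "cot");
s/c/b/ for @words;

print "$greeting\n$order\n$has_dan $lacks_bob\n@words\n";
```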

Chapter 8: Getting Control – Regular Expression Metacharacters
- 8.1 Regular Expression Metacharacters
- 8.2 Unicode

8.1 Regular Expression Metacharacters
- Regular expression metacharacters are characters that do not represent themselves. They have special meanings that let you control the search pattern in some way.
- Metacharacters lose their special meaning if preceded by a backslash (\).
- Metasymbols: for example, \d is equivalent to [0-9]
- Example 8.1: /^a...c/
- Table 8.1
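
A minimal sketch of these three points: a pattern built from metacharacters, the effect of a backslash, and a metasymbol compared with its character class. The test strings are hypothetical, and the \d comparison assumes plain ASCII text.

```perl
use strict;
use warnings;

# ^ anchors at the start and each dot matches any one character except a newline,
# so /^a...c/ needs an "a", any three characters, then a "c".
my $dot_match = "abcdc rest" =~ /^a...c/ ? 1 : 0;

# A backslash removes the special meaning: \. matches only a literal dot.
my $any_char    = "3x14" =~ /3.14/  ? 1 : 0;   # the dot matches "x"
my $literal_dot = "3x14" =~ /3\.14/ ? 1 : 0;   # no literal dot is present

# For ASCII text, the metasymbol \d finds the same digits as [0-9].
my ($d1) = "room 370" =~ /(\d+)/;
my ($d2) = "room 370" =~ /([0-9]+)/;

print "$dot_match $any_char $literal_dot $d1 $d2\n";
```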

8.1 Regular Expression Metacharacters
- Metacharacters for Single Characters
  - Table 8.2
  - Example: 8.2
- The s Modifier: The Dot Metacharacter and the Newline
  - Example: 8.3
- The Character Class
  - A character class represents one character from a set of characters.
  - Examples: 8.4, 8.5, 8.6, 8.7, 8.8, 8.9
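
The sketch below illustrates how the s modifier changes what the dot matches, and how a character class stands for exactly one character out of a set. All strings are hypothetical.

```perl
use strict;
use warnings;

# By default the dot does not match a newline; the s modifier lets it.
my $two_lines   = "first\nsecond";
my $dot_default = $two_lines =~ /first.second/  ? 1 : 0;
my $dot_s       = $two_lines =~ /first.second/s ? 1 : 0;

# A character class matches one character from the listed set.
my @vowels = "regular" =~ /[aeiou]/g;

# A leading ^ inside the brackets negates the class.
my ($non_digit) = "42nd" =~ /([^0-9])/;

# Ranges can be combined inside one class.
my $is_hex = "BEEF" =~ /^[0-9A-Fa-f]+$/ ? 1 : 0;

print "$dot_default $dot_s | @vowels | $non_digit | $is_hex\n";
```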

8.1 Regular Expression Metacharacters
- The POSIX Character Class
  - POSIX (the Portable Operating System Interface) is an industry standard used to ensure that programs are portable across operating systems.
  - Table 8.3
  - Example: 8.11
- Whitespace Metacharacters
  - Table 8.4
  - Examples: 8.12, 8.13, 8.14
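
POSIX classes are written inside a bracketed character class, and the whitespace metasymbols \s and \S are often used for tidying input. A small sketch with invented strings:

```perl
use strict;
use warnings;

# POSIX classes go inside the brackets of a character class.
my @letters    = "a1 b2" =~ /[[:alpha:]]/g;
my $all_digits = "2024" =~ /^[[:digit:]]+$/ ? 1 : 0;

# \s matches a whitespace character (space, tab, newline, and so on).
my $messy = "  CIT\t370 \n";
(my $tidy = $messy) =~ s/\s+/ /g;      # collapse each run of whitespace to one space

# \S matches any non-whitespace character.
my ($first_word) = $messy =~ /(\S+)/;

# Splitting on whitespace runs breaks a line into fields.
my @fields = split /\s+/, "Week 4\tPerl";

print "@letters | $all_digits | [$tidy] | $first_word | @fields\n";
```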

8.1 Regular Expression Metacharacters
- Metacharacters to Repeat Pattern Matches
  - Quantifiers: specify how many times the preceding character may be matched
  - The Greed Factor: the asterisk (*)
    - It matches zero or more of the preceding character.
    - $_ = "ab AB"; s/ab[0-9]*/X/; gives "X AB" (no digits follow "ab", so [0-9]* matches zero characters)
  - Table 8.5
  - Examples: 8.15, 8.16, 8.17, 8.18, 8.19, 8.20, 8.21, 8.22
- Metacharacters That Turn Off Greediness
  - By placing a question mark after a greedy quantifier, the greed is turned off and the search ends after the first match rather than the last one.
  - Table 8.6
  - Example: 8.24
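
Greedy and non-greedy matching are easiest to see on the same input. The strings below are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";   # hypothetical sample string

# Greedy: .+ grabs as much as it can, so the match runs to the LAST ">".
my ($greedy) = $html =~ /<(.+)>/;

# Non-greedy: .+? stops as soon as the rest of the pattern can match,
# so the match ends at the FIRST ">".
my ($lazy) = $html =~ /<(.+?)>/;

# * is greedy too: [0-9]* consumes every digit that follows "ab".
(my $with_digits = "ab123AB") =~ s/ab[0-9]*/X/;

# * also accepts zero repetitions, so "ab" with no digits still matches.
(my $no_digits = "abAB") =~ s/ab[0-9]*/X/;

print "$greedy\n$lazy\n$with_digits\n$no_digits\n";
```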

8.1 Regular Expression Metacharacters
- Anchoring Metacharacters
  - Zero-width assertions: anchors correspond to positions, not actual characters.
  - Table 8.7
  - Examples: 8.25, 8.26, 8.27, 8.28
- The m Modifier
  - The m modifier controls the behavior of the $ and ^ anchor metacharacters.
  - Example: 8.29
- Alternation
  - Alternation allows the regular expression to contain alternative patterns to be matched.
  - Example: 8.30
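
The sketch below shows anchors matching positions, the effect of the m modifier on ^ and $, and a simple alternation. The sample text is invented.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree";   # hypothetical multi-line string

# Without m, ^ matches only at the very start of the string.
my $start_default = $text =~ /^two/  ? 1 : 0;

# With m, ^ and $ also match at the start and end of each line.
my $start_m = $text =~ /^two/m ? 1 : 0;

# Collect the last word character of every line.
my @line_ends = $text =~ /(\w)$/mg;

# \b is an anchor for a word boundary: a position, not a character.
my @whole_words = "cat concatenate" =~ /\bcat\b/g;

# Alternation: either alternative may match.
my @pets = "dog bird cat" =~ /cat|dog/g;

print "$start_default $start_m | @line_ends | ", scalar(@whole_words), " | @pets\n";
```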

8.1 Regular Expression Metacharacters
- Grouping or Clustering
  - The process of grouping characters together is called clustering.
  - Examples: 8.31, 8.32, 8.33, 8.34
- Remembering or Capturing
  - Subpattern: if part of the regular expression pattern is enclosed in parentheses, the text it matches is saved in special numbered scalar variables ($1, $2, ...), and these variables can be used later in the program.
  - Examples: 8.35, 8.36, 8.37, 8.38, 8.39, 8.40, 8.42
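
Parentheses do two jobs: they group characters so a quantifier applies to the whole group, and they capture what the group matched. A minimal sketch with hypothetical strings:

```perl
use strict;
use warnings;

# Grouping: the + applies to the whole group "ha", not just to "a".
my $grouped   = "hahaha" =~ /^(ha)+$/ ? 1 : 0;
my $ungrouped = "hahaha" =~ /^ha+$/   ? 1 : 0;

# Capturing: each parenthesized subpattern is saved in $1, $2, $3, ...
my $date = "2024-03-15";          # hypothetical sample value
my ($year, $month, $day);
if ($date =~ /^(\d{4})-(\d\d)-(\d\d)$/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# The numbered variables can also be used in the replacement of a substitution.
(my $swapped = "World Hello") =~ s/(\w+) (\w+)/$2 $1/;

print "$grouped $ungrouped | $year $month $day | $swapped\n";
```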

8.1 Regular Expression Metacharacters
- Turning Off Capturing
  - The (?:pattern) form groups a subpattern without capturing it.
  - Example: 8.43
- Metacharacters That Look Ahead and Behind
  - Look ahead in the string for a pattern: (?=pattern)
  - Look behind in the string for a pattern: (?<=pattern)
  - Table 8.8
  - Examples: 8.44, 8.45, 8.46, 8.47
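
The following sketch contrasts a non-capturing group with the two look-around forms named on the slide. The input strings are made up for illustration.

```perl
use strict;
use warnings;

# (?:...) groups without capturing, so the unit ends up in $1.
my ($unit) = "10 km" =~ /(?:\d+) (\w+)/;

# Lookahead: match digits only if " USD" follows; " USD" is not part of the match.
my ($usd) = "25 kg, 30 USD" =~ /(\d+)(?= USD)/;

# Lookbehind: match digits only if "total=" comes immediately before them.
my ($total) = "id=7 total=42" =~ /(?<=total=)(\d+)/;

print "$unit $usd $total\n";
```

Because look-around assertions are zero-width, the text they inspect is not consumed and remains available to the rest of the pattern.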

8.1 Regular Expression Metacharacters
- The tr or y Function
  - The tr function translates characters, in a one-to-one correspondence, from the characters in the search string to the characters in the replacement string.
  - Table 8.9
  - Example 8.48
  - Example 8.49 (tr Delete Option)
  - Example 8.50 (tr Complement Option)
  - Example 8.51 (tr Squeeze Option)
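
A short sketch of tr and its d, c, and s options, using hypothetical strings rather than the textbook's Examples 8.48 to 8.51. Note that tr works on character lists, not on regular expressions.

```perl
use strict;
use warnings;

# Plain translation: each lowercase letter maps to the uppercase letter in the same position.
(my $upper = "perl") =~ tr/a-z/A-Z/;

# With an empty replacement list and no options, tr only counts the matching characters.
my $dna     = "AACGTT";
my $count_a = ($dna =~ tr/A//);

# d (delete): characters in the search list with no replacement are removed.
(my $no_digits = "a1b2c3") =~ tr/0-9//d;

# c (complement): the search list becomes every character NOT listed.
(my $masked = "a1b2c3") =~ tr/0-9/*/c;

# s (squeeze): runs of the same translated character collapse to one.
(my $squeezed = "aaabbbccc") =~ tr/a-z//s;

print "$upper $count_a $no_digits $masked $squeezed\n";
```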

8.2 Unicode
- The Unicode standard addresses the problem of incompatible character sets by defining a single character set, with encodings such as UTF-8 and UTF-16.
- Unicode has the capacity to encompass all the world's written languages.
- Perl and Unicode
  - Perl 5.6 supports UTF-8 Unicode.
  - The utf8 pragma turns on the Unicode settings and the bytes pragma turns them off.
  - Table 8.10
  - Example 8.52
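
The difference between character and byte semantics can be sketched as follows. This is an illustration with an arbitrarily chosen code point, not the textbook's Example 8.52. The character is written as an escape, so the sketch does not need the utf8 pragma, which is only required when the source file itself contains UTF-8 literals.

```perl
use strict;
use warnings;

# One Unicode character (U+263A), written as an escape so the source stays ASCII.
my $smiley = "\x{263A}";

# Default character semantics: the string is one character long.
my $chars = length $smiley;

# Inside the scope of the bytes pragma, length counts the bytes of
# Perl's internal UTF-8 encoding instead.
my $bytes;
{
    use bytes;
    $bytes = length $smiley;
}

# The regex engine also sees a single character.
my $one_char = $smiley =~ /^.$/ ? 1 : 0;

print "$chars character, $bytes bytes, single-character match: $one_char\n";
```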

Summary
- What Is a Regular Expression?
- Expression Modifiers and Simple Statements
- Regular Expression Operators
- Regular Expression Metacharacters
- Unicode

Lab
- Examples 7.1 – 7.39 (P 163 – 195)
- Examples (P )
- Homework 4

Mid-term Exam
- Date: Next week
- Exam Time: 11:00 AM - 11:30 AM
- Contents: Chapter 1- Chapter
- No books, no notes, no computer

Next Week
- Reading assignment (Textbook chapter and Chapter 9)
- Mid-term Exam (Chapter 1 – Chapter 8.1.2)