Regular Expressions – Friend or Foe? Primož Gabrijelčič.

Slides:



Advertisements
Similar presentations
2-1. Today’s Lecture Review Chapter 4 Go over exercises.
Advertisements

Parallel Programming with OmniThreadLibrary
REGULAR EXPRESSIONS FRIEND OR FOE?. INTRODUCTION TO REGULAR EXPRESSIONS.
Regular Expressions (in Python). Python or Egrep We will use Python. In some scripting languages you can call the command “grep” or “egrep” egrep pattern.
AND FINITE AUTOMATA… Ruby Regular Expressions. Why Learn Regular Expressions? RegEx are part of many programmer’s tools  vi, grep, PHP, Perl They provide.
Regular Expression Original Notes by Song Guo. What Regular Expressions Are Exactly - Terminology a regular expression is a pattern describing a certain.
1 CSE 390a Lecture 7 Regular expressions, egrep, and sed slides created by Marty Stepp, modified by Jessica Miller and Ruth Anderson
1 CSE 303 Lecture 7 Regular expressions, egrep, and sed read Linux Pocket Guide pp , 73-74, 81 slides created by Marty Stepp
1 CSE 390a Lecture 7 Regular expressions, egrep, and sed slides created by Marty Stepp, modified by Jessica Miller
Regular expressions Mastering Regular Expressions by Jeffrey E. F. Friedl Linux editors and commands (e.g.
Introduction to regular expression. Wéber André Objective of the training Scope of the course  We will present what are “regular expressions”
Form Validation CS What is form validation?  validation: ensuring that form's values are correct  some types of validation:  preventing blank.
slides created by Marty Stepp
Regex Wildcards on steroids. Regular Expressions You’ve likely used the wildcard in windows search or coding (*), regular expressions take this to the.
REGULAR EXPRESSIONS CHAPTER 14. REGULAR EXPRESSIONS A coded pattern used to search for matching patterns in text strings Commonly used for data validation.
Description of programming languages 1 Using regular expressions and context free grammars.
Regular Expression Darby Tien-Hao Chang (a.k.a. dirty) Department of Electrical Engineering, National Cheng Kung University.
Pattern matching with regular expressions A common file processing requirement is to match strings within the file to a standard form, e.g. address.
 Text Manipulation and Data Collection. General Programming Practice Find a string within a text Find a string ‘man’ from a ‘A successful man’
Regular Expressions in.NET Ashraya R. Mathur CS NET Security.
AND OTHER LANGUAGES… Ruby Regular Expressions. Why Learn Regular Expressions? RegEx are part of many programmer’s tools  vi, grep, PHP, Perl They provide.
Computer Programming for Biologists Class 5 Nov 20 st, 2014 Karsten Hokamp
INFO 320 Server Technology I Week 7 Regular expressions 1INFO 320 week 7.
RegExp. Regular Expression A regular expression is a certain way to describe a pattern of characters. Pattern-matching or keyword search. Regular expressions.
Regular Expressions Regular expressions are a language for string patterns. RegEx is integral to many programming languages:  Perl  Python  Javascript.
Perl and Regular Expressions Regular Expressions are available as part of the programming languages Java, JScript, Visual Basic and VBScript, JavaScript,
Introduction to Regular Expressions Ben Brumfield THATCamp Texas 2011.
1 CSC 594 Topics in AI – Text Mining and Analytics Fall 2015/16 4. Document Search and Regular Expressions.
Regular Expression in Java 101 COMP204 Source: Sun tutorial, …
Copyright ©2005  Department of Computer & Information Science JavaScript Regular Expressions.
Regular Expression Dr. Tran, Van Hoai Faculty of Computer Science and Engineering HCMC Uni. of Technology
Kirkwood Center for Continuing Education Introduction to PHP and MySQL By Fred McClurg, Copyright © 2015, Fred McClurg, All Rights.
Perl: Lecture 2 Advanced RE & CGI. Regular Expressions 2.
BY Sandeep Kumar Gampa.. What is Regular Expression? Regex in.NET Regex Language Elements Examples Regular Expression API How to Test regex in.NET Conclusion.
Regular Expressions in PHP. Supported RE’s The most important set of regex functions start with preg. These functions are a PHP wrapper around the PCRE.
When you read a sentence, your mind breaks it into tokens—individual words and punctuation marks that convey meaning. Compilers also perform tokenization.
Kirkwood Center for Continuing Education Introduction to PHP and MySQL By Fred McClurg, Copyright © 2010 All Rights Reserved. 1.
Regular Expressions. Overview Regular expressions allow you to do complex searches within text documents. Examples: Search 8-K filings for restatements.
Regular Expressions for PHP Adding magic to your programming. Geoffrey Dunn
Satisfy Your Technical Curiosity Regular Expressions Roy Osherove Methodology & Team System Expert Sela Group The.
Python for NLP Regular Expressions CS1573: AI Application Development, Spring 2003 (modified from Steven Bird’s notes)
JavaScript, Part 2 Instructor: Charles Moen CSCI/CINF 4230.
©Brooks/Cole, 2001 Chapter 9 Regular Expressions ( 정규수식 )
12. Regular Expressions. 2 Motto: I don't play accurately-any one can play accurately- but I play with wonderful expression. As far as the piano is concerned,
©Brooks/Cole, 2001 Chapter 9 Regular Expressions.
Regular Expressions The ultimate tool for textual analysis.
CS346 Regular Expressions1 Pattern Matching Regular Expression.
GREP. Whats Grep? Grep is a popular unix program that supports a special programming language for doing regular expressions The grammar in use for software.
Sys Prog & Scrip - Heriot Watt Univ 1 Systems Programming & Scripting Lecture 12: Introduction to Scripting & Regular Expressions.
20-753: Fundamentals of Web Programming 1 Lecture 10: Server-Side Scripting II Fundamentals of Web Programming Lecture 10: Server-Side Scripting II.
R EGULAR E XPRESSION IN P ERL (P ART 1) Thach Nguyen.
CSC 2720 Building Web Applications PHP PERL-Compatible Regular Expressions.
2004/12/051/27 SPARCS 04 Seminar Regular Expression By 박강현 (lightspd)
CIT 383: Administrative ScriptingSlide #1 CIT 383: Administrative Scripting Regular Expressions.
1 Validating user input is the bane of every software developer’s existence. When you are developing cross-browser web applications (IE4+ and NS4+) this.
CGS – 4854 Summer 2012 Web Site Construction and Management Instructor: Francisco R. Ortega Chapter 5 Regular Expressions.
Presented by: Elena C. Ciobanu Mihai V. Ciobanu Kuntal Ghosh
Regular Expressions /^Hel{2}o\s*World\n$/ SoftUni Team Technical Trainers Software University
An Introduction to Regular Expressions Specifying a Pattern that a String must meet.
Regular Expressions /^Hel{2}o\s*World\n$/ SoftUni Team Technical Trainers Software University
Pattern Matching: Simple Patterns. Introduction Programmers often need to scan a file, directory, etc. for a specific substring. –Find all files that.
OOP Tirgul 11. What We’ll Be Seeing Today  Regular Expressions Basics  Doing it in Java  Advanced Regular Expressions  Summary 2.
Computer Science I Split. Regular Expressions Classwork: Trivia questions. Share. Show (stage 1) final project. Homework: work on final project.
Regular Expressions In Javascript cosc What Do They Do? Does pattern matching on text We use the term “string” to indicate the text that the regular.
Regular Expressions ICCM 2017
Looking for Patterns - Finding them with Regular Expressions
Regular Expressions
CIT 383: Administrative Scripting
Regular Expression in Java 101
Presentation transcript:

Regular Expressions – Friend or Foe? Primož Gabrijelčič

programmer, consultant, speaker, trainer Delphi / Smart Mobile Studio Skype: gabr42 The Delphi Geek – Smart Programmer –

Introduction to regular expressions

Introduction “In computing, a regular expression provides a concise and flexible means to "match" (specify and recognize) strings of text, such as particular characters, words, or patterns of characters.” -Wikipedia “A regular expression is a set of pattern matching rules encoded in a string according to certain syntax rules.” -About.com

History Originated in the Unix world Many flavors – Perl, PCRE (PHP, Delphi),.NET, Java, JavaScript, Python, Ruby, Posix …

Usage Testing (matching) Searching Replacing Splitting

Limitations Slow(ish) Can use lots of time and memory Unsuitable for some purposes – HTML parsing UTF-8

Tools Editors grep/egrep/fgrep Online tools – regex.larsolavtorvik.com regex.larsolavtorvik.com RegexBuddy, RegexMagic –

Delphi RegularExpressions, RegularExpressionsCore – Since XE TPerlRegex – Up to 2010 PCRE flavor

Example Search for "Handel", "Händel", and "Haendel" – H(ä|ae?)ndel – Handel|Händel|Haendel if TRegEx.IsMatch(s,'H(ä|ae?)ndel') then

Syntax

Literals and Metacharacters Metacharacters – $()*+.?[\^{| Literals – Everything else Escape – \ Nonprintable – \n, \r

Tutorial w w.regular-expressions.info/delphi.html Jan Goyvaerts, Steven Levithan – Regular Expressions Cookbook (Amazon, O'Reilly)

Character class, Alternatives, Any One-of – [abc] – [a-fA-F0-9] – [^a-fA-F0-9] Alternatives – Delphi|Prism|FreePascal Any –.

Shorthands \d, \D – [0-9], [^0-9] \w, \W – [a-zA-Z0-9_], [^a-zA-Z0-9_] \s, \S – [ \t\r\n], [^ \t\r\n]

Anchors Start of line/text – ^, \A End of line/text – $, \Z, \z Word boundary – \b, \B

Unicode Single grapheme – \X Unicode codepoint – \x{2122} ™ \p{category} – \p{N} \p{script} – \p{Greek}

Groups Capturing group – (\d\d\d) Noncapturing group – (?:\d\d\d) Named group – (?P \d\d\d)

Group references Unnamed reference – \1, \2, … \99 Named reference – (?P=digits) Example – (\d\d\d)\1

Repetitions Exact – {42} Range – {17,42} – [a-fA-F0-9]{1,8} Open range – {17,}

Repetition shortcuts ? – {0,1} + – {1,} * – {0,}

Repetition variations Non-greedy – *?, +? Possesive – *+, ++, ?+, {1,3}+

Modifiers Case-insensitive – (?i), (?-i) Dot matches line breaks (‘single-line’) – (?s), (?-s) ^ and $ match at line breaks (‘multi-line’) – (?m), (?-m)

Search & Replace \1.. \99 reference to a group \0 all matched text (?P=group) reference to a named group \` left context \' right context

Examples Username – [a-z0-9_-]{3,16} (simplified) – z\.]{2,6}) IP (dotted, v4) – ([0-9]{1,3}\.){3}[0-9]{1,3}

Bad examples File name – (?i)^(?!^(PRN|AUX|CLOCK\$|NUL|CON|COM\d|L PT\d|\..*)(\..+)?$)[^\\\./:\*\?\"<>\|][^\\/:\*\?\" \|]{0,254}$ Parsing HTML with RegEx Catastrophic backtracking – (x+x+)+y

Delphi IDE

Search & Replace Different flavor docwiki.embarcadero.com/RADStudio/XE3/en /Regular_Expressions docwiki.embarcadero.com/RADStudio/XE3/en /Regular_Expressions Groups – {} ?, | are not supported

Search & Replace demo Find {$IFDEF} and {$IFNDEF} – \$IFN*DEF Replace {$IFN?DEF WIN64} with {$IFN?DEF CPUX64} – \{\$IF(N*)DEF WIN64\} – \{$IF\1DEF CPUX64\}

Code Examples

Questions?