Regular Expressions in .NET. Ashraya R. Mathur, CS 795 - .NET Security.


1 Regular Expressions in .NET. Ashraya R. Mathur, CS 795 - .NET Security

2 Outline
- Introduction to Regular Expressions
- Regular Expression Syntax
- Validation in ASP.NET
- Regular Expressions in .NET Programming
- Demonstrations
- Conclusion

3 What are Regular Expressions?
- Definition
  - "A Regular Expression is a series of characters that are transformed into an algorithm that matches and manipulates text"
- Allow you to
  - Extract, edit, replace, or delete text substrings
  - Add the extracted strings to a collection in order to generate a report
- Are a universally valuable skill, applicable in .NET, Java, Perl, PHP, JavaScript, and many other programming languages

4 Common Regular Expression Uses
- Form and data validation
- Query-string validation
- Data clean-up / reformatting
- Data search and retrieval
- HTML / XML information retrieval
- Parsing log files

5 Regular Expressions Syntax
- Simple Expressions
  - The simplest regular expression is a literal string
- Quantifiers
  - *, which describes "0 or more occurrences"
  - +, which describes "1 or more occurrences"
  - ?, which describes "0 or 1 occurrence"
  - Explicit quantifiers: {n} or {x,y}, which allow an exact number or a range of occurrences to be specified
  - A quantifier always applies to the element immediately preceding it (to its left): a single character, a character class, or a group
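The quantifiers above can be tried out with a short console sketch. The class name QuantifierDemo and the helper FullMatch are hypothetical names chosen for illustration; the helper simply wraps the pattern in ^(?:...)$ so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: true only when the entire input matches the pattern.
    public static bool FullMatch(string pattern, string input)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    public static void Main()
    {
        Console.WriteLine(FullMatch("ab*", "a"));         // True: * allows zero b's
        Console.WriteLine(FullMatch("ab+", "a"));         // False: + needs at least one b
        Console.WriteLine(FullMatch("ab?", "abb"));       // False: ? allows at most one b
        Console.WriteLine(FullMatch("ab{2,3}", "abbb"));  // True: two or three b's
        Console.WriteLine(FullMatch("(ab){2}", "abab"));  // True: the quantifier applies to the group
    }
}
```

Note how the last two lines differ: in ab{2,3} the quantifier binds only to the b, while in (ab){2} it binds to the whole group.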

6 Regular Expressions Syntax (contd)
- Metacharacters include the following: $ ^ . [ ( | ) ] and \
  - . matches any single character (by default, any character except a newline)
  - ^ and $ mark the start and end positions of a line of text. Ex: ^a[a-z]b$
  - ( ) are used to group an expression. Ex: (abc)*
  - [ ] define a class of characters from which the pattern can match one. Ex: [a-z], [A-Z], [0-9]
  - | indicates an either-or situation. Ex: (ab|cd). Inside [ ] the | is an ordinary character, so [ab|cd] is not an alternation
  - \ is used as an escape character. Ex: c:\\ matches the literal text c:\
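A minimal sketch of the metacharacters in action, including the difference between alternation inside a group and a | placed inside a character class. The class name MetacharacterDemo and the helper names are hypothetical; Regex.IsMatch and Regex.Escape are the standard System.Text.RegularExpressions calls.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MetacharacterDemo
{
    // Hypothetical helper: alternation inside a group, anchored to the whole input.
    public static bool IsAbOrCd(string input)
    {
        return Regex.IsMatch(input, "^(ab|cd)$");
    }

    // Hypothetical helper: a character class; the | here is a literal character.
    public static bool IsOneClassChar(string input)
    {
        return Regex.IsMatch(input, "^[ab|cd]$");
    }

    public static void Main()
    {
        // Anchors plus a character class: 'a', one lowercase letter, 'b'.
        Console.WriteLine(Regex.IsMatch("axb", "^a[a-z]b$"));   // True
        Console.WriteLine(Regex.IsMatch("axxb", "^a[a-z]b$"));  // False

        Console.WriteLine(IsAbOrCd("cd"));        // True
        Console.WriteLine(IsOneClassChar("cd"));  // False: a class matches one character
        Console.WriteLine(IsOneClassChar("|"));   // True: | is literal inside [ ]

        // Regex.Escape adds the backslashes needed to match text literally.
        string pattern = Regex.Escape(@"c:\");
        Console.WriteLine(pattern);                                   // c:\\
        Console.WriteLine(Regex.IsMatch(@"c:\temp", "^" + pattern));  // True
    }
}
```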

7 Sample Regular Expressions
Pattern                                 Description
^\d{5}$                                 5 numeric digits (US ZIP code)
^\d{5}(-\d{4})?$                        US ZIP code with an optional ZIP+4 suffix; the ? makes the -4 digits part optional
^\w+@[a-z]+?\.[a-z]{2,3}$               Simple email validation expression
^\d{3}-\d{2}-\d{4}$                     Social Security Number validation
^\d{1,2}\/\d{1,2}\/\d{4}$               Date format validation
([\w-]+\.)+[\w-]+(/[\w-./?%&=]*)?       URL validation
/\*.*\*/                                Matches a C-style comment /* … */ that sits on a single line
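The sample patterns can be exercised with a small sketch like the one below. The class and method names are hypothetical, and the ZIP pattern is written as ^\d{5}(-\d{4})?$. The email check also shows why the slide calls that pattern "simple": it rejects addresses with more than one dot in the domain.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SamplePatternDemo
{
    // Hypothetical helpers wrapping three of the sample patterns.
    public static bool IsZip(string s)
    {
        return Regex.IsMatch(s, @"^\d{5}(-\d{4})?$");
    }

    public static bool IsSsn(string s)
    {
        return Regex.IsMatch(s, @"^\d{3}-\d{2}-\d{4}$");
    }

    public static bool IsSimpleEmail(string s)
    {
        return Regex.IsMatch(s, @"^\w+@[a-z]+?\.[a-z]{2,3}$");
    }

    public static void Main()
    {
        Console.WriteLine(IsZip("12345"));          // True
        Console.WriteLine(IsZip("12345-6789"));     // True
        Console.WriteLine(IsZip("12345-678"));      // False: suffix needs four digits
        Console.WriteLine(IsSsn("123-45-6789"));    // True
        Console.WriteLine(IsSimpleEmail("bob@example.com"));    // True
        Console.WriteLine(IsSimpleEmail("bob@example.co.uk"));  // False: only one dot is allowed
    }
}
```

In .NET, \d also matches non-ASCII Unicode digits and $ also matches just before a trailing newline, so stricter validation may need RegexOptions.ECMAScript or the \z anchor.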

8 Validation in ASP.NET
- RegularExpressionValidator validation control
  - Allows you to validate an input by providing a regular expression that the input must match
  - The regular expression pattern is specified by setting the ValidationExpression property of the control
- Key properties:
  - ControlToValidate
  - ErrorMessage (for the ValidationSummary)

9 Regular Expressions in .NET Programming
- .NET Base Classes
  - Namespace: System.Text.RegularExpressions
- Can be used from any .NET language
- Implements the "Traditional NFA" regex engine
  - As do Java, Perl, PHP, etc.
- Almost all patterns will work the same across these engines
- .NET supports "Named Captures" (some other engines, such as Python and later versions of Perl and Java, offer them too, with slightly different syntax)

10 The Regex Namespace: System.Text.RegularExpressions
- Regex
- Match
- MatchCollection
- Group
- GroupCollection
- Capture
- CaptureCollection
- RegexCompilationInfo

11 The Regex Base Class
- The Regex class represents a single regular expression
- It is immutable, which means that once you create it, you cannot change it
- To create a Regex object in C#, you can first declare it and then instantiate it with the regular expression pattern, as shown here:
    Regex myRegex;
    myRegex = new Regex("RegularExpressionPattern");

12 The Regex Base Class (contd)
- Match: Searches a given string and returns a single Match object for the first text that is matched by the regular expression pattern
- Matches: Searches a given string and returns a MatchCollection object for all locations that are matched by the pattern stored in the Regex object
- IsMatch: Returns true if the provided string contains a match for the pattern
- Split: Splits the given string into an array of substrings, using the regular expression pattern as the delimiter
- Replace: Replaces any instances of text that match the pattern in the Regex object with the provided replacement expression
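The five methods can be compared side by side on one input. This is a minimal console sketch; the class name RegexMethodsDemo, the pattern \d+, and the sample string are illustrative choices, not taken from the slides.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexMethodsDemo
{
    // One immutable Regex instance reused by every call below.
    public static readonly Regex Digits = new Regex(@"\d+");

    public static void Main()
    {
        string input = "a1b22c333";

        Console.WriteLine(Digits.IsMatch(input));                  // True

        Match first = Digits.Match(input);                         // first match only
        Console.WriteLine(first.Value + " at " + first.Index);     // 1 at 1

        Console.WriteLine(Digits.Matches(input).Count);            // 3

        // The input ends with a match, so Split returns a trailing empty string.
        Console.WriteLine(string.Join("|", Digits.Split(input)));  // a|b|c|

        Console.WriteLine(Digits.Replace(input, "#"));             // a#b#c#
    }
}
```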

13 Demonstration #1
    private void btnRun_Click(object sender, System.EventArgs e)
    {
        // Build a single Regex object from the user-supplied pattern,
        // passing the option to ignore case.
        Regex rxMatch = new Regex(txtRegEx.Text, RegexOptions.IgnoreCase);

        // Determine whether the user input contains a match.
        bool blnResult = rxMatch.IsMatch(txtText.Text);

        // Display the result to the user.
        MessageBox.Show("The Result is: " + blnResult.ToString(), "RegEx Demo");
    }

14 Match and MatchCollection
- Allow us to obtain the details of each match made via a regular expression
  - Match: represents a single match
  - MatchCollection: a collection of Match objects
- When the Match method of the Regex object is used, it returns a Match object that contains the matching text
- The MatchCollection object contains a series of Match objects, each representing a single matched substring from the string searched

15 Demonstration #2
    private void btnRun_Click(object sender, System.EventArgs e)
    {
        // Build a single Regex object from the user-supplied pattern,
        // passing the option to ignore case.
        Regex rxMatch = new Regex(txtRegEx.Text, RegexOptions.IgnoreCase);
        Match mtMatch;
        MatchCollection mtCol;

        // Match returns the first match; Matches returns all of them.
        mtMatch = rxMatch.Match(txtText.Text);
        mtCol = rxMatch.Matches(txtText.Text);

        MessageBox.Show("There are " + mtCol.Count + " match(es) found.", "RegEx Demos");

16 Demonstration #2 (contd)
        // If there is at least one match, show each of them.
        if (mtCol.Count > 0)
        {
            // Walk the matches with the Match object and NextMatch().
            do
            {
                // Show the position of the match in the string and its value.
                MessageBox.Show("Result at position " + mtMatch.Index.ToString()
                    + ": " + mtMatch.Value, "RegEx Demos");
                mtMatch = mtMatch.NextMatch();
            } while (mtMatch.Success);
        }
    }

17 Group and GroupCollection
- Capturing: ( )
  - The captured subsequence may be used later in the expression, via a backreference, and may also be retrieved from the Match object once the match operation is complete
- Non-capturing: (?: )
- Named capture: (?<name> )
  - Uses names for the captured groups instead of numbers
- Substitutions
  - Specialized Replace via groups
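The grouping constructs can be sketched as follows. The class name GroupDemo, the pattern, the group name last, and the sample sentence are all made up for illustration; Groups, Groups.Count and the ${name} substitution are standard .NET features.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // (?:...) groups without capturing; (?<last>...) captures under the name "last".
    public static readonly Regex Title = new Regex(@"(?:Mr|Ms)\. (?<last>[A-Z][a-z]+)");

    public static void Main()
    {
        string text = "Dear Ms. Lovelace,";
        Match m = Title.Match(text);

        Console.WriteLine(m.Success);               // True
        Console.WriteLine(m.Groups["last"].Value);  // Lovelace

        // Group 0 is the whole match; the non-capturing group is not counted.
        Console.WriteLine(m.Groups.Count);          // 2

        // Substitution: ${last} inserts the named group into the replacement.
        Console.WriteLine(Title.Replace(text, "${last}"));  // Dear Lovelace,
    }
}
```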

18 Backreferences & Advanced Grouping
- Backreferences
  - Allow you to match the same characters that a previous group matched
  - Match repeated words: (\b[a-zA-Z]+\b)\s\1
- Advanced Grouping
  - Positive look-ahead assertion (?= )
  - Negative look-ahead assertion (?! )
  - Positive look-behind assertion (?<= )
  - Negative look-behind assertion (?<! )
  - Non-backtracking group (?> )
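A short sketch of a backreference and two look-ahead assertions. The repeated-word pattern is the one from the slide; the class name, helper names, currency patterns, and sample strings are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceDemo
{
    // \1 must match the same text that group 1 captured.
    public static string RemoveRepeats(string s)
    {
        return Regex.Replace(s, @"(\b[a-zA-Z]+\b)\s\1", "$1");
    }

    // Positive look-ahead: digits that are followed by " dollars".
    public static string FirstDollarAmount(string s)
    {
        return Regex.Match(s, @"\d+(?= dollars)").Value;
    }

    // Negative look-ahead: a whole number that is NOT followed by " dollars".
    // The \b anchors stop the engine from settling for part of a number.
    public static string FirstNonDollarAmount(string s)
    {
        return Regex.Match(s, @"\b\d+\b(?! dollars)").Value;
    }

    public static void Main()
    {
        Console.WriteLine(RemoveRepeats("This is is a test test"));         // This is a test
        Console.WriteLine(FirstDollarAmount("20 euros, 50 dollars"));       // 50
        Console.WriteLine(FirstNonDollarAmount("50 dollars, 20 euros"));    // 20
    }
}
```

The look-ahead text is checked but not consumed, so the returned values contain only the digits.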

19 Replacing Substrings
- The Replace method of Regex is used to replace matched portions of a given string with the specified replacement
- Example using named captures that are referenced from the replacement string:
    NewDateYMD = Regex.Replace(OldDateMDY,
        @"\b(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{2,4})\b",
        "${year}-${month}-${day}");

20 Demonstration #3
    private void btnCapture_Click(object sender, System.EventArgs e)
    {
        // A basic pattern that captures any run of 4 letters.
        string strRegExPattern = "([A-Za-z]{4})";
        Regex rxGroups = new Regex(strRegExPattern, RegexOptions.IgnoreCase);

        // Match object; the captured text is read through group 1.
        Match mtGroup = rxGroups.Match(txtCapture.Text);

        // Show group 1 of every match (nothing is shown if there is no match).
        while (mtGroup.Success)
        {
            MessageBox.Show(mtGroup.Groups[1].Value, "RegEx Demos");
            mtGroup = mtGroup.NextMatch();
        }
    }

21 Demonstration #3 (contd)
    private void btnNamedCapture_Click(object sender, System.EventArgs e)
    {
        // A basic pattern that captures any run of 4 letters,
        // this time using a named capture called "word".
        string strRegExPattern = "(?<word>[A-Za-z]{4})";
        Regex rxGroups = new Regex(strRegExPattern, RegexOptions.IgnoreCase);
        Match mtGroup = rxGroups.Match(txtCapture.Text);

        while (mtGroup.Success)
        {
            // Show the match using the named reference "word".
            MessageBox.Show(mtGroup.Result("${word}"), "RegEx Demos");
            mtGroup = mtGroup.NextMatch();
        }
    }

22 Demonstration #3 (contd)
    private void btnBack_Click(object sender, System.EventArgs e)
    {
        // Use the Regex object to find duplicated words, for example
        // with the (\b[a-zA-Z]+\b)\s\1 pattern entered in txtRegEx.
        Regex rxMatch = new Regex(txtRegEx.Text, RegexOptions.IgnoreCase);

        // Replace each repeated pair with a single copy of the word ($1).
        string strReplace = rxMatch.Replace(txtBack.Text, "$1");

        // Show the result.
        MessageBox.Show(strReplace, "RegEx Demos");
    }

23 References
- Regular Expression Library: http://regexlib.com/
- Regular Expressions Information Website: http://www.regular-expressions.info/dotnet.html
- Regular Expressions in .NET, MSDN Library
- Professional Visual Studio® 2005, Andrew Parsons and Nick Randolph

24 Questions?

