CS346 Regular Expressions
Pattern Matching: Regular Expressions
Pattern Matching
  JavaScript provides two ways to do pattern matching:
  1. Using RegExp objects
  2. Using methods on String objects
  The regular expression syntax is the same in both cases, and largely the same as in Perl.
Simple patterns
  Two categories of characters in patterns:
  a. normal characters (match themselves)
  b. metacharacters (can have special meanings in patterns; do not match themselves):
     \ | ( ) [ ] { } ^ $ * + ? .
  - A metacharacter is treated as a normal character if it is backslashed
  - The period (.) is a special metacharacter: it matches any character except newline
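To make the backslash rule concrete, here is a minimal sketch comparing an unescaped period with a backslashed one. The variable names and test strings are illustrative assumptions, not taken from the course examples.

```javascript
// An unescaped period matches any single character except a newline.
const anyChar = /a.c/;
// A backslashed period matches only a literal period.
const literalDot = /a\.c/;

console.log(anyChar.test("abc"));     // true
console.log(anyChar.test("a.c"));     // true
console.log(anyChar.test("a\nc"));    // false (newline is excluded)
console.log(literalDot.test("abc"));  // false
console.log(literalDot.test("a.c"));  // true
```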
Creating RegExp objects
  var varname = /reg_ex_pattern/flags;
  Simplest example: exact match
  To match an occurrence of "our" in a string containing your, our, sour, four, pour:
    var toMatch = /our/;
1. Matching with RegExp objects
  Use the test() method of a RegExp object to test a string for pattern matches.
  Format: regexp.test(string_to_be_tested)
  test() returns a Boolean that indicates whether or not the specified pattern exists within the searched string.
  This is the most commonly used method for validation.
    var tomatch = /our/;
    var result = tomatch.test("pour");  // Boolean result (true here)
  Example: 16-0-checkName.html
Pattern Modifiers (Adding flags)
  Flag(s)  Purpose
  i        Makes the match case insensitive; /oak/i matches "OAK" and "Oak"
  g        Performs a global match (all matches, not just the first)
  ig       Makes the match case insensitive and global
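The effect of each flag can be seen by running the same pattern with different flags against one string. This is a small sketch; the sample text and variable names are assumptions chosen for illustration.

```javascript
const text = "Oak, oak and OAK";

// No flags: case sensitive, first match only.
const plain = text.match(/oak/);
// i: case insensitive, still first match only.
const insensitive = text.match(/oak/i);
// g: every case-sensitive match.
const globalOnly = text.match(/oak/g);
// ig (or gi): every match, regardless of case.
const both = text.match(/oak/gi);

console.log(plain[0]);        // "oak"
console.log(insensitive[0]);  // "Oak"
console.log(globalOnly);      // ["oak"]
console.log(both);            // ["Oak", "oak", "OAK"]
```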
2. Matching in Strings
  search() method
    Returns the position in the string of the first match of the RE pattern (positions start at zero); returns -1 if there is no match
      var str = "Gluckenheimer";
      var position = str.search(/n/);  /* position is now 6 */
  match() method
    Compares an RE and a string to see whether they match
  replace() method
    Finds out whether an RE matches a string and replaces the matched substring with a new string
search() method
  Format: string.search(reg_exp)
  - Searches the string for the first match to the given regular expression
  - Returns an integer that indicates the position of the match in the string (zero-indexed)
  - If no match is found, the method returns -1
  - Similar to the indexOf() method, but takes a regular expression instead of a plain string
  Example: To find the location of the first absolute link within an HTML document:
    pos = htmlString.search(/^<a href = ”
    if (pos != -1) {
      alert('First absolute link found at ' + pos + ' position.');
    } else {
      alert('Absolute links not found');
    }
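The link-finding idea above can be sketched as a complete function. The pattern below is an assumption (an anchor tag whose href begins with http:// or https://), not the pattern from the original slide, and findAbsoluteLink and the sample HTML are hypothetical names and data used only for illustration.

```javascript
// Hypothetical pattern: an <a> tag whose href value starts with http:// or https://.
const absoluteLink = /<a\s+href\s*=\s*"https?:\/\//i;

// Returns the zero-based position of the first absolute link, or -1 if there is none.
function findAbsoluteLink(html) {
  return html.search(absoluteLink);
}

const htmlString =
  '<p>See <a href="page2.html">next</a> or ' +
  '<a href="http://example.com/">this site</a>.</p>';

const pos = findAbsoluteLink(htmlString);
if (pos !== -1) {
  console.log("First absolute link found at position " + pos + ".");
} else {
  console.log("Absolute links not found");
}
```

Like indexOf(), search() reports only the first match, so the relative link earlier in the sample string is skipped because it does not fit the pattern.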
match() method
  Format: string.match(regular_expression)
  Returns an array of the matching strings found in the given string (all of them when the g flag is used). If no matches are found, match() returns null.
  Example: To check the proper format for a phone number entered by a user, with the form (XXX) XXX-XXXX:
    function checkPhone(phone) {
      var phoneRegex = /^\(\d\d\d\) \d\d\d-\d\d\d\d$/;
      if (!phone.match(phoneRegex)) {
        alert('Please enter a valid phone number');
        return false;
      }
      return true;
    }
replace() method
  Format: string.replace(reg_exp, new_string)
  Properties: replaces matches to a given regular expression with some new string and returns the result.
  Example: To replace every newline character (\n) with a break tag:
    /* assumes that the HTML form is the first one present in the document,
       and it has a field named "comments" */
    comment = document.forms[0].comments.value;
    comment = comment.replace(/\n/g, "<br />");

    function formatField(fieldValue) {
      return fieldValue.replace(/\n/g, "<br />");
    }
  The function accepts any string as a parameter, and returns the new string with all of the newline characters replaced by <br /> tags.
Character classes: [ ]
  - A sequence of characters in brackets defines a set of characters, any one of which matches, e.g. [abcd]
  - Dashes are used to specify spans of characters in a class, e.g. [a-z]
  - A caret at the left end of a class definition negates the class (matches any character not listed), e.g. [^0-9]
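A short sketch of the three forms of character class described above. The variable names and test strings are illustrative assumptions.

```javascript
const vowel = /[aeiou]/;     // any one of the listed characters
const lowercase = /[a-z]/;   // a span of characters
const nonDigit = /[^0-9]/;   // caret first: anything except a digit

console.log(vowel.test("sky"));       // false (no vowel present)
console.log(vowel.test("sea"));       // true
console.log(lowercase.test("ABC"));   // false
console.log(nonDigit.test("12345"));  // false (digits only)
console.log(nonDigit.test("123a5"));  // true
```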
Character class abbreviations
  Abbreviation  Equiv. Pattern   Matches
  \d            [0-9]            a digit
  \D            [^0-9]           not a digit
  \w            [A-Za-z_0-9]     a word character
  \W            [^A-Za-z_0-9]    not a word character
  \s            [ \r\t\n\f]      a whitespace character
  \S            [^ \r\t\n\f]     not a whitespace character
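The abbreviations in the table can be tried out on a single string. This is a minimal sketch; the sample string and variable names are assumptions.

```javascript
const sample = "Room 42\tis_open";

const digits = sample.match(/\d/g);          // each digit
const spaces = sample.match(/\s/g);          // each whitespace character
const digitsOnly = sample.replace(/\D/g, "");   // remove everything that is not a digit
const dashed = sample.replace(/\W/g, "-");      // replace each non-word character

console.log(digits);      // ["4", "2"]
console.log(spaces);      // [" ", "\t"]
console.log(digitsOnly);  // "42"
console.log(dashed);      // "Room-42-is_open"
```

Note that the underscore counts as a word character, so it survives the \W replacement.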
From Chapter 25 of the text (Perl)
  Note that ^ is used differently here than in a character class.
Quantifiers
  Quantifiers in braces: repetitions
  Quantifier   Meaning
  {n}          exactly n repetitions
  {m,}         at least m repetitions
  {min,max}    at least min but at most max repetitions
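A brief sketch of the three brace forms. The patterns are anchored with ^ and $ so that the repetition count applies to the whole string; the names and inputs are illustrative assumptions.

```javascript
const exactlyFive = /^\d{5}$/;   // {n}: exactly 5 digits
const atLeastTwo = /^a{2,}$/;    // {m,}: two or more a's
const twoToFour = /^\d{2,4}$/;   // {min,max}: between 2 and 4 digits

console.log(exactlyFive.test("54601"));  // true
console.log(exactlyFive.test("5460"));   // false
console.log(atLeastTwo.test("a"));       // false
console.log(atLeastTwo.test("aaa"));     // true
console.log(twoToFour.test("123"));      // true
console.log(twoToFour.test("12345"));    // false
```

There must be no space inside the braces: in JavaScript, {2, 4} with a space is not treated as a quantifier.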
Some other common quantifiers
  *   zero or more repetitions, e.g., \d* means zero or more digits
  +   one or more repetitions, e.g., \d+ means one or more digits
  ?   zero or one repetition, e.g., \d? means zero or one digit
  .   exactly one character except the newline character, e.g., /.l/ matches "al" but matches neither "\n" nor "l"
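The difference between *, + and ? shows up most clearly on empty and optional input. The following is a small sketch with assumed patterns and test strings.

```javascript
const zeroOrMore = /^\d*$/;     // * allows an empty string
const oneOrMore = /^\d+$/;      // + requires at least one digit
const optionalU = /^colou?r$/;  // ? makes the preceding "u" optional
const dotL = /.l/;              // one character (not a newline), then "l"

console.log(zeroOrMore.test(""));       // true
console.log(oneOrMore.test(""));        // false
console.log(oneOrMore.test("2024"));    // true
console.log(optionalU.test("color"));   // true
console.log(optionalU.test("colour"));  // true
console.log(optionalU.test("colouur")); // false
console.log(dotL.test("al"));           // true
console.log(dotL.test("l"));            // false
console.log(dotL.test("\nl"));          // false
```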
Anchors
  The pattern can be forced to match only at the left end of the string with ^, and only at the right end with $
  e.g., /^Lee/ matches "Lee Ann" but not "Mary Lee Ann"
        /Lee Ann$/ matches "Mary Lee Ann", but not "Mary Lee Ann is nice"
  The anchor operators (^ and $) do not match characters in the string; they match positions, at the beginning or end
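The anchor examples above can be checked directly, and replacing a bare anchor shows that it matches a position rather than a character. This is a sketch; the variable names are assumptions.

```javascript
const startsWithLee = /^Lee/;
const endsWithLeeAnn = /Lee Ann$/;

console.log(startsWithLee.test("Lee Ann"));                // true
console.log(startsWithLee.test("Mary Lee Ann"));           // false
console.log(endsWithLeeAnn.test("Mary Lee Ann"));          // true
console.log(endsWithLeeAnn.test("Mary Lee Ann is nice"));  // false

// Anchors consume no characters: replacing them only inserts text.
const prefixed = "Lee".replace(/^/, ">");
const suffixed = "Lee".replace(/$/, "<");
console.log(prefixed);  // ">Lee"
console.log(suffixed);  // "Lee<"
```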
Examples
  test()
    See 16-1checkURL.html
    See 16-2valid .html
  search() method in String
    See 16-3check_phone.html
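replace() method
  replace(RE_pattern, string)
  - Finds a substring that matches the pattern and replaces it with the string
  - The g modifier is applicable (every match is replaced, not just the first)
  - Returns a new string; the original string is not changed, so assign the result
      var str = "Some rabbits are rabid";
      str = str.replace(/rab/g, "tim");
      /* str is now "Some timbits are timid" */
  - Substrings matched by parenthesized parts of the pattern are stored in $1, $2, etc.; with the pattern /(rab)/g, $1 is set to "rab" for each match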
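Two points about replace() are easy to miss: it returns a new string instead of changing the original, and $1, $2 in the replacement string refer to parenthesized groups. The sketch below illustrates both; the name-swapping example and the variable names are assumptions added for illustration.

```javascript
const original = "Some rabbits are rabid";
const replaced = original.replace(/rab/g, "tim");

console.log(replaced);  // "Some timbits are timid"
console.log(original);  // "Some rabbits are rabid" (unchanged)

// $1 and $2 stand for the first and second parenthesized groups.
const swapped = "Lee, Ann".replace(/(\w+), (\w+)/, "$2 $1");
console.log(swapped);   // "Ann Lee"
```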
match(pattern)
  - Most general pattern-matching method
  - Returns an array of results of the pattern-matching operation
  - With the g modifier, returns an array of the substrings that matched
  - Without the g modifier, the first element of the returned array has the matched substring; the other elements have the values of $1, ... obtained from the parenthesized parts of the pattern
      var str = "My 3 kings beat your 2 aces";
      var matches = str.match(/[ab]/g);
  - matches is set to ["b", "a", "a"]
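The two modes of match() return differently shaped arrays, and a failed match returns null. A minimal sketch follows, reusing the sentence from the slide; the patterns and variable names are assumptions.

```javascript
const sentence = "My 3 kings beat your 2 aces";

// With g: every matched substring, no group information.
const allDigits = sentence.match(/\d/g);
// Without g: the first match, followed by the parenthesized groups.
const firstPair = sentence.match(/(\d) (\w+)/);
// No match at all: null, not an empty array.
const nothing = sentence.match(/queen/);

console.log(allDigits);     // ["3", "2"]
console.log(firstPair[0]);  // "3 kings"
console.log(firstPair[1]);  // "3"
console.log(firstPair[2]);  // "kings"
console.log(nothing);       // null
```

Because the failure value is null, test the result before indexing into it.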
match(pattern) example
  16-4matchExample.html
    var str = "Having a take-home exam that takes 3 hours to complete is better than a 1-hour in-class exam";
    var matches = str.match(/\d/g);
  matches is set to ["3", "1"] (the elements are strings, not numbers)
Parentheses in RE
  Example: 16-5complexMatchEx.html
    var str = "I have 118 credits; but I need 120 to graduate";
    matches = str.match(/(\d+)([^\d]+)(\d+)/);
    document.write(matches, " ");
  The 1st element of matches is the whole match, the 2nd is the value of $1, the 3rd is $2, the 4th is $3, etc.
  matches array: 118 credits; but I need 120,118, credits; but I need ,120
    Element                          Source
    "118 credits; but I need 120"    match with RE
    "118"                            $1
    " credits; but I need "          $2
    "120"                            $3
Alternate patterns
  Use the alternation operator |
  Example: 16-6matchAlternatives.html
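As a rough illustration of the alternation operator (not the contents of the course example file), the sketch below matches any one of several words. The word list and variable names are assumptions.

```javascript
// Parentheses limit the alternatives so the anchors apply to all of them.
const pet = /^(cat|dog|bird)$/;

console.log(pet.test("dog"));     // true
console.log(pet.test("hotdog"));  // false (must be the whole string)

// Without anchors, alternation finds any of the words inside a longer string.
const found = "I have a cat and a dog".match(/cat|dog/g);
console.log(found);               // ["cat", "dog"]
```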
split(parameter) method of String
  - Splits a string into substrings based on a pattern
  - ":" and /:/ both work as the parameter
  Example: 16-7splitEx.html
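A short sketch of split() with a string separator and with a pattern. The sample strings are assumptions, not taken from the course example file.

```javascript
const colors = "red:green:blue";

const byString = colors.split(":");   // plain string separator
const byRegex = colors.split(/:/);    // equivalent regular expression

console.log(byString);  // ["red", "green", "blue"]
console.log(byRegex);   // ["red", "green", "blue"]

// A pattern can also cover several separators at once:
// a comma or semicolon, followed by optional whitespace.
const mixed = "red, green;blue".split(/[,;]\s*/);
console.log(mixed);     // ["red", "green", "blue"]
```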
Program Structure
  Example: 16-3check_phone.html
  - Limitations?
  - How can you make it more flexible?
  - Can you generalize it to check multiple fields?
Uniform Program Structure for multiple tests
  - Use regex_name.test(string_to_be_tested) to test each field
  - If test() returns false, compile an error message
  See 16-8Structure.html
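One possible way to organize such a uniform structure is a table of rules that is looped over, collecting a message for each field whose test() fails. This is only a sketch and not the contents of 16-8Structure.html; the field names, patterns, messages and the validate function are all hypothetical.

```javascript
// Hypothetical rules: one pattern and one error message per field.
const rules = [
  { field: "name",  pattern: /^[A-Z][a-z]+, [A-Z][a-z]+$/, message: "Name must have the form Last, First" },
  { field: "phone", pattern: /^\(\d{3}\) \d{3}-\d{4}$/,    message: "Phone must have the form (XXX) XXX-XXXX" },
  { field: "zip",   pattern: /^\d{5}$/,                    message: "Zip code must be 5 digits" }
];

// Tests every field against its pattern and returns the list of error messages.
function validate(values) {
  const errors = [];
  for (const rule of rules) {
    const value = values[rule.field] || "";  // a missing field is treated as empty
    if (!rule.pattern.test(value)) {
      errors.push(rule.message);
    }
  }
  return errors;
}

const errors = validate({ name: "Lee, Ann", phone: "555-1234", zip: "54601" });
console.log(errors);  // ["Phone must have the form (XXX) XXX-XXXX"]
```

Adding a new field then means adding one entry to the table rather than writing another function. The patterns deliberately omit the g flag, since a global RegExp remembers its position between test() calls.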
Examples of curly braces { }
  16-9-curly_braces.html
Table: Regular Expression Codes
  See "Regular Expression Codes.doc"