Regular Expressions 'RegEx'.

1 Regular Expressions 'RegEx'

2 Learning Objectives By the end of this lecture, you should be able to:
Define what is meant by a regular expression
Create regular expression statements
Compare regular expressions with strings and decide whether they match

3 Finding patterns in strings
We have already discussed how to find a specific substring inside a string. For example, you could search a string containing a list of names for the substring 'Lisa'. But suppose that instead of searching for a specific substring, we wanted to find a pattern of characters? For example, suppose we wanted to check a zip code to make sure that it contained 5 and only 5 characters, and that all of them were digits (i.e. no letters)? This is where an invaluable programming tool called 'regular expressions' can be applied. Other examples of things you can check for using regular expressions include:
Checking a phone number to make sure it is in the form of three digits, a dash, three more digits, another dash, and then four digits
Checking to make sure that an email address has at least one character followed by an @ sign, followed by at least one dot (period) character
Checking a date to make sure it is entered in the form of two digits, followed by a '/', followed by two more digits, followed by another '/', followed by four digits
In fact, there is much more power and flexibility in regular expressions than the above examples suggest.

4 Creating a regular expression
Begin by creating a "regular expression literal". This is a pattern of regex characters placed inside forward slashes. The forward slashes indicate a regular expression, in the same way that quotation marks indicate a string. In JavaScript, we use a method called search() to attempt to match our "regex literal" with a string. We invoke the method on the string, and pass our regex literal as the argument to search():
some_string.search(regex_literal);
Example:
"have a nice day".search(/ice/);
The search() function returns the index at which the expression was found. If the expression is not found, search() returns -1. In the above example, search() returns 8, since 'ice' was found at index 8 in the string.
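The two outcomes described above can be tried directly. This is a minimal sketch; the variable names and the second pattern, /rice/, are our own choices for illustration and do not come from the slide.

```javascript
// A regex literal sits between forward slashes, much as a string sits between quotes.
const greeting = "have a nice day";

// search() returns the index at which the first match starts...
const foundAt = greeting.search(/ice/);    // 8

// ...or -1 when the pattern does not occur anywhere in the string.
const notFound = greeting.search(/rice/);  // -1

console.log(foundAt, notFound);
```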

5 Pop-Quiz
What will be stored in found_position in each of the following?
var found_position;
var quote = "To be or not to be, that is the question.";
found_position = quote.search(/To be/);   Answer: 0
found_position = quote.search(/to be/);   Answer: 13
found_position = quote.search(/To Be/);   Answer: -1
found_position = quote.search(/be/);   Answer: 3 (the function returns the index of the first occurrence of a match)

6 search() versus indexOf()
In the previous example, we used the search() function to match. We observed that search() returns an integer corresponding to the location of the match. If no match is found, then search() returns -1. Doesn't this sound exactly like the indexOf() function that we use with strings? The two are indeed similar. However, whereas indexOf() only allows us to search for specific text, search() is far more powerful, as it also allows us to search for regular expression patterns.
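A small side-by-side comparison may help here. This sketch reuses the quote from the pop quiz; the variable names are our own, and \W (any character that is not a letter, digit, or underscore) is used only as one example of a pattern that indexOf() cannot express.

```javascript
const quote = "To be or not to be, that is the question.";

// For fixed text, indexOf() and search() report the same position.
const byIndexOf = quote.indexOf("not");  // 9
const bySearch = quote.search(/not/);    // 9

// Only search() can look for a pattern rather than specific text.
// Here: the first character that is not a letter, digit, or underscore.
const firstNonWord = quote.search(/\W/); // 2 (the space after "To")

console.log(byIndexOf, bySearch, firstNonWord);
```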

7 Common Pattern-Matching Symbols
In an earlier regular expression, our literal was: /to be/ However, we could have simply used indexOf() to do this and not gone to the trouble of using a regular expression. The true power of regex lies in our ability to match patterns. To do so, we need a sort of 'code', a combination of letters and symbol characters that allows us to define a pattern. Here is a table showing some of the most common pattern-matching characters:

8 Common Pattern-Matching Symbols
Regular expressions match one character at a time. Example: Suppose you wanted to make sure that the first character in a certain string was a 'Q'. To do so, you would need to put the character 'Q' in your regex literal. However, that wouldn't be enough. Since our goal is to make sure that the 'Q' is the first character in our string, we would also need the '^' (caret) character. So in this case, our literal would be:
/^Q/
As always, the literal goes inside the forward slash characters. The caret says that whatever comes next must be the first item in the string. Suppose we now wanted to make sure that the first character in our string was a digit (i.e. 0-9). In this case, our literal would be:
/^\d/
The \d will match any character that is a digit.
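To see the caret at work, we can compare the same pattern with and without it. The sample strings below are hypothetical and chosen only for illustration.

```javascript
// /^Q/ matches only when 'Q' is the very first character.
const startsWithQ = "Quiz".search(/^Q/);        // 0
const qNotFirst = "Is it Q?".search(/^Q/);      // -1 (there is a Q, but not at the start)
const qAnywhere = "Is it Q?".search(/Q/);       // 6 (without the caret, any position will do)

// /^\d/ matches only when the first character is a digit.
const startsWithDigit = "7 days".search(/^\d/); // 0
const digitNotFirst = "day 7".search(/^\d/);    // -1

console.log(startsWithQ, qNotFirst, qAnywhere, startsWithDigit, digitNotFirst);
```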

9 What value would be stored in found_position in the following examples?
var found_position;
var url = "
found_position = url.search(/^www/);   Answer: 0
found_position = url.search(/www$/);   Answer: -1 (this would only match if 'www' was at the end of the string)
found_position = url.search(/edu$/);   Answer: 11
found_position = url.search(/$edu/);   Answer: -1 (the $ must appear after the 'edu')
found_position = url.search(/\d/);   Answer: -1 (no digit is present anywhere inside the string)
found_position = url.search(/\W/);   Answer: 3 (\W matches anything that is NOT a letter, digit, or underscore)
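These answers can be reproduced with a stand-in address. The value "www.school.edu" below is purely hypothetical: it is not taken from the slide and was chosen only because it is 14 characters long, with its first dot at index 3 and 'edu' at index 11, which is consistent with the answers above.

```javascript
// Hypothetical stand-in value, not the address used in the original slide.
const url = "www.school.edu";

const startsWithWww = url.search(/^www/); // 0
const endsWithWww = url.search(/www$/);   // -1, 'www' is not at the end
const endsWithEdu = url.search(/edu$/);   // 11
const dollarFirst = url.search(/$edu/);   // -1, nothing can follow the end of the string
const anyDigit = url.search(/\d/);        // -1, the string contains no digits
const firstNonWord = url.search(/\W/);    // 3, the first '.'

console.log(startsWithWww, endsWithWww, endsWithEdu, dollarFirst, anyDigit, firstNonWord);
```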

10 Example: 5-digit zip code
Now let's check a string to see if it matches a 5-digit U.S. zip code. We might begin by checking the string to see if it at least matches one digit (i.e. one number). From our chart we can see that \d matches any digit 0 through 9. Of course, we want to match five digits. However, this is easily solved by simply repeating the \d five times: \d\d\d\d\d So:
var zipCode = "60614";
var regExZIP = /\d\d\d\d\d/;
var result = zipCode.search(regExZIP); // result will hold the value 0

11 Example: 5-digit zip code contd
Still, there is a problem. Can you see it? While this expression would indeed match 60614, it would also match any longer string that contains five consecutive digits, such as "abcdefg60614hijklmn". What we need is a way to indicate that the five digits should be both the beginning and the end of our string. In this case, we want to have exactly 5 characters. A period matches any single character (other than a line break). By placing 5 periods in our literal, we will only match strings that have a minimum of 5 characters in them. So in this case:
/...../
We then stipulate that there must be exactly 5 characters. A clever way to do this is to stipulate that the string must begin with 5 characters and also end with those same 5 characters:
/^.....$/
Finally, we stipulate that all of those characters must be digits:
/^\d\d\d\d\d$/
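The difference the two anchors make can be checked with a small helper. This is a sketch only: the function name isFiveDigitZip and the extra test strings are our own and do not appear in the slides.

```javascript
// Without anchors, five digits anywhere in the string are enough for a match.
const unanchored = "abcdefg60614hijklmn".search(/\d\d\d\d\d/); // 7

// Hypothetical helper: true only when the whole string is exactly five digits.
// Because of the leading ^, a successful search() can only ever return 0.
function isFiveDigitZip(text) {
  return text.search(/^\d\d\d\d\d$/) === 0;
}

console.log(unanchored);                            // 7
console.log(isFiveDigitZip("60614"));               // true
console.log(isFiveDigitZip("abcdefg60614hijklmn")); // false
console.log(isFiveDigitZip("606145"));              // false, six digits
console.log(isFiveDigitZip("6061a"));               // false, contains a letter
```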

12 We've only scratched the surface
Because working with strings is such a major part of today's computing and data-science world, regular expressions have evolved into a somewhat detailed topic. We have only scratched the surface. There are countless additional techniques, shortcuts, and levels of complexity that can still be explored, and entire books have been written on the topic. For example, in the zip code example, rather than writing out \d five times, you could write \d{5}, which accomplishes the same thing: /^\d{5}$/ As another example, you might want to expand on the previous example with a regular expression that will accept either the typical 5-digit zip code, or a zip code in the widely used format #####-####. Because they are so widely used, there are many pre-written regular expressions that you can easily find online and use or modify for your needs.
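One possible way to accept both zip code formats is sketched below. It relies on two features that this lecture has not covered, so treat them as assumptions on our part: parentheses, which group part of a pattern, and a trailing ?, which makes the preceding item optional. The function name isZipCode is also our own.

```javascript
// \d{5} is shorthand for \d\d\d\d\d.
const shorthand = "60614".search(/^\d{5}$/); // 0

// Hypothetical helper: accepts ##### or #####-####.
// (-\d{4})? means "optionally, a dash followed by exactly four digits".
function isZipCode(text) {
  return text.search(/^\d{5}(-\d{4})?$/) !== -1;
}

console.log(shorthand);               // 0
console.log(isZipCode("60614"));      // true
console.log(isZipCode("60614-1234")); // true
console.log(isZipCode("60614-12"));   // false, the extension must have four digits
console.log(isZipCode("606141234"));  // false, the dash is missing
```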

13 Example zip_code_checker.htm

