Python Regular Expressions: Easy text processing


1 Python Regular Expressions: Easy text processing

2 Regular Expression
• A way of identifying certain string patterns
• Formally, an RE is one of:
  • a single letter, or lambda (the empty string)
  • RE1 RE2 (the concatenation of two REs)
  • (RE1 or RE2)
  • (RE)* (zero or more repetitions of RE)
• Why do you think they're called regular expressions?
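The three building blocks of the formal definition map directly onto Python's syntax. The sketch below is only an illustration: the patterns and test strings are made up, and re.fullmatch is used so that the whole string has to fit the pattern.

```python
import re

# Concatenation: the RE "a" followed by the RE "b"
concat = re.fullmatch(r'ab', 'ab')

# Alternation: "RE or RE" is written with | in Python
either = re.fullmatch(r'a|b', 'b')

# Star: zero or more repetitions, so the empty string matches too
star_empty = re.fullmatch(r'(ab)*', '')
star_many = re.fullmatch(r'(ab)*', 'ababab')
star_bad = re.fullmatch(r'(ab)*', 'aba')   # incomplete repetition

print(bool(concat), bool(either), bool(star_empty), bool(star_many), star_bad)
```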

3 Python regex
• Use the re module: import re
• The special characters: . ^ $ * + ? { } [ ] \ | ( )
• We'll learn them one at a time…
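Because these characters are special, matching one of them literally requires a backslash in front of it. A minimal sketch, using made-up strings, of what happens with and without escaping; re.escape is the standard-library helper that adds the backslashes for you.

```python
import re

# Unescaped, the dot matches any character, so '3514' is accepted
unescaped = re.search(r'3.14', '3514')

# Escaped, the dot only matches a literal '.'
escaped = re.search(r'3\.14', '3514')

# re.escape builds a pattern that matches the given text literally
literal = re.search(re.escape('3.14'), 'pi is about 3.14')

print(unescaped.group(0), escaped, literal.group(0))
```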

4 Character classes
• [abc] means a or b or c
• [a-c] is the same thing
• [a-z] = any lowercase letter
• [^579] = any character except 5, 7, or 9
• To choose between whole strings, use |: Shannon|Duvall
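A quick way to see what a character class accepts is to list every match in a sample string. The sample strings here are invented for illustration.

```python
import re

in_class = re.findall(r'[a-c]', 'abcdef')        # only a, b, or c
negated = re.findall(r'[^579]', '4567890')       # everything except 5, 7, 9
names = re.findall(r'Shannon|Duvall', 'Shannon Lynn Duvall')

print(in_class)   # ['a', 'b', 'c']
print(negated)    # ['4', '6', '8', '0']
print(names)      # ['Shannon', 'Duvall']
```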

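5 Metacharacters
• \d any digit [0-9]
• \D any non-digit [^0-9]
• \s any whitespace character (space, tab, newline, and so forth)
• \S any non-whitespace character
• \w any alphanumeric character or underscore [a-zA-Z0-9_]
• \W any character that \w does not match
• \b a word boundary (it matches a position, not a character)
• . any character except newline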
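The shorthand classes are easiest to compare side by side on the same input. This is a small sketch with made-up sample text, not an exhaustive tour.

```python
import re

text = "pi = 3.14"

digits = re.findall(r'\d', text)       # single digits
chunks = re.findall(r'\S+', text)      # runs of non-whitespace
words = re.findall(r'\w+', text)       # runs of word characters

# \b keeps 'cat' from matching inside 'concat' or 'cats'
whole_words = re.findall(r'\bcat\b', 'cat concat cats cat')

# . does not match the newline that follows the second p
dot = re.findall(r'p.', 'pi\np\n')

print(digits, chunks, words, whole_words, dot)
```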

6 Repeat
• * means 0 or more: ma*d matches md, mad, and maaaaad
• + means 1 or more: ma+d matches mad and maaaaad but not md
• ? means 0 or 1: ma?d matches md and mad only
• {x,y} means between x and y repetitions, inclusive: ma{1,3}d matches mad, maad, and maaad
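The four repetition operators can be checked against one word list. The helper name accepted is hypothetical; it simply keeps the words that a pattern matches in full.

```python
import re

def accepted(pattern, candidates):
    # Keep only the candidates that the pattern matches from start to end
    return [w for w in candidates if re.fullmatch(pattern, w)]

candidates = ['md', 'mad', 'maad', 'maaad', 'maaaaad']

star = accepted(r'ma*d', candidates)
plus = accepted(r'ma+d', candidates)
optional = accepted(r'ma?d', candidates)
bounded = accepted(r'ma{1,3}d', candidates)

print(star)
print(plus)
print(optional)
print(bounded)
```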

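7 Repeating groups
• [ab]* matches any run of a's and b's: a, b, bbb, abba
• (ab)* matches repetitions of the whole group: ab, abab, ababab
• Because * allows zero repetitions, both also match the empty string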
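The difference between repeating a class and repeating a group shows up as soon as the letters appear in a different order. A minimal sketch with invented test strings:

```python
import re

samples = ['', 'a', 'bbb', 'abba', 'ab', 'abab', 'aba', 'ba']

# [ab]* repeats a choice between a and b, in any order
class_star = [s for s in samples if re.fullmatch(r'[ab]*', s)]

# (ab)* repeats the exact two-character sequence ab
group_star = [s for s in samples if re.fullmatch(r'(ab)*', s)]

print(class_star)
print(group_star)
```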

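8 More metacharacters
• ^ outside of a character class means the beginning of the string, or the beginning of each line when the re.MULTILINE flag is set
• $ matches the end of the string (or just before a final newline), or the end of each line when re.MULTILINE is set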
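In Python, whether ^ and $ work per line or per string depends on the re.MULTILINE flag. The sketch below uses a made-up two-line string to show both behaviours.

```python
import re

text = "first line\nsecond line"

# Default: ^ and $ anchor to the whole string
start_default = re.findall(r'^\w+', text)
end_default = re.findall(r'\w+$', text)

# re.MULTILINE: ^ and $ anchor to every line
start_multi = re.findall(r'^\w+', text, re.MULTILINE)
end_multi = re.findall(r'\w+$', text, re.MULTILINE)

print(start_default, end_default)
print(start_multi, end_multi)
```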

9 What can I do with them? Search
• re.search(pattern, string, flags)
• pattern is the regex
• string is what you are searching in
• flags are special modifiers, optional
• This returns either None (which is falsy) or a Match object
• When specifying the regex, use the r prefix to denote a "raw string"

10 Search Example
    import re
    line = "Cats are smarter than dogs"
    if re.search(r'.*are.*than.*', line):
        print("yes")
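The Match object carries more than a yes/no answer, and the optional flags argument changes how the pattern is applied. A small sketch on the same sentence; re.IGNORECASE is one of the standard flags, chosen here only as an example.

```python
import re

line = "Cats are smarter than dogs"

# search scans the whole string, so no leading or trailing .* is required
m = re.search(r'are', line)
found_text = m.group(0)
found_span = m.span()          # (start, end) positions of the match

# Without a flag the lowercase pattern fails; with IGNORECASE it succeeds
case_sensitive = re.search(r'cats', line)
case_insensitive = re.search(r'cats', line, re.IGNORECASE)

print(found_text, found_span)
print(case_sensitive)
print(case_insensitive.group(0))
```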

11 Groups
• Using () in a regex creates a group that can be referenced later.
• The string that matches the entire regex is said to be group 0.
• Other groups are numbered, starting at 1.

12 Grouping example
    import re
    m = re.search(r'(\w+) (\w+)', "Shannon Lynn Duvall")
    m.group(0)   # 'Shannon Lynn'
    m.group(1)   # 'Shannon'
    m.group(2)   # 'Lynn'

13 Grouping Example
• \1 is a backreference: it matches the same text that group 1 matched.
• Would it match?
    m = re.search(r'(\w+) \1', "Shannon Shannon")
• With the space taken out:
    m = re.search(r'(\w+)\1', "Shannon Shannon")
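Running both searches answers the question. The first matches the whole string. The second still matches, but only the doubled letter inside the first word, because the search is free to pick any shorter group that is immediately repeated.

```python
import re

with_space = re.search(r'(\w+) \1', "Shannon Shannon")
without_space = re.search(r'(\w+)\1', "Shannon Shannon")

print(with_space.group(0))        # the full repeated name
print(without_space.group(0))     # the doubled letter 'nn'
print(without_space.group(1))     # group 1 is the single letter 'n'
print(without_space.span())
```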

14 Nested groups
• Groups are numbered from the outside in: count the opening parentheses from left to right.
    m = re.search(r'(a(b)c)d', "abcd")
    m.group(0)   # 'abcd'
    m.group(1)   # 'abc'
    m.group(2)   # 'b'

15 sub: search and replace
• re.sub(regex, putIn, string, …) returns a copy of string in which every match of regex is replaced by putIn
    phone = "1-800-555-9090"
    newPhone = re.sub(r'\D', "", phone)
• What is newPhone?
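Running the example shows that every non-digit is removed. The second call is an extra illustration, not from the slides: re.sub also accepts a count argument that limits how many replacements are made.

```python
import re

phone = "1-800-555-9090"

# Replace every non-digit with the empty string
newPhone = re.sub(r'\D', "", phone)

# count=1 replaces only the first match
firstOnly = re.sub(r'\D', "", phone, count=1)

print(newPhone)    # 18005559090
print(firstOnly)   # 1800-555-9090
```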

16 findall
• Search for all matches and return them as a list
    song = "12 drummers drumming, 11 pipers piping, 10 lords a leaping"
    nums = re.findall(r'\d+', song)
• nums is now ['12', '11', '10']

17 split
• Split a string, using the matches of a regex as the delimiters.
    verses = re.split(r'\d+', song)
• verses is ['', ' drummers drumming, ', ' pipers piping, ', ' lords a leaping']

18 split with groups
• Sometimes you want the delimiter to show up in the list. Use a group: the text matched by the group will be returned in the list.
    verses = re.split(r'(\d+)', song)
• verses is: ['', '12', ' drummers drumming, ', '11', ' pipers piping, ', '10', ' lords a leaping']
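One use for keeping the delimiters is pairing each number with the text that follows it. This is a sketch of one possible approach; the slicing and cleanup steps are assumptions added for illustration.

```python
import re

song = "12 drummers drumming, 11 pipers piping, 10 lords a leaping"

parts = re.split(r'(\d+)', song)

# parts alternates: '', number, text, number, text, ...
# Odd positions hold the numbers, the following even positions hold the text
pairs = [(int(n), verse.strip(' ,'))
         for n, verse in zip(parts[1::2], parts[2::2])]

print(pairs)
```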

19 Examples
• You have a string that represents a poker hand:
  • a, k, q, j for ace, king, queen, jack
  • 1-9 for numbers 1-9
  • 0 for 10

20 How would you:
• Make sure a string is a valid hand?
• Check for a pair of sevens?
• Check for any pair?
• Check for 3 of a kind?
• Check for a full house?
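One possible set of answers, offered as a sketch and not as the presenter's solution. It assumes a hand is exactly five characters with no suits, and that the cards may appear in any order. All function names are hypothetical.

```python
import re

def is_valid(hand):
    # Assumption: exactly five cards, one character each
    return re.fullmatch(r'[akqj0-9]{5}', hand) is not None

def has_pair_of_sevens(hand):
    # Two 7s with anything (or nothing) in between
    return re.search(r'7.*7', hand) is not None

def has_any_pair(hand):
    # (.) captures one card; \1 requires the same card again later
    return re.search(r'(.).*\1', hand) is not None

def has_three_of_a_kind(hand):
    return re.search(r'(.).*\1.*\1', hand) is not None

def is_full_house(hand):
    # Sorting puts equal cards side by side, leaving xxxyy or xxyyy
    ordered = ''.join(sorted(hand))
    return re.fullmatch(r'(.)\1\1(.)\2|(.)\3(.)\4\4', ordered) is not None

print(is_valid("a7k70"), has_pair_of_sevens("a7k70"), is_full_house("7a7a7"))
```

Note that the pair and three-of-a-kind checks mean "at least" that many equal cards, and the full-house pattern would also accept five identical characters, which a single deck cannot produce.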

