1 DIG 3134 – Lecture 10: Regular Expressions and preg_match in PHP and Validating Inputs Michael Moshell University of Central Florida Internet Software.


1 DIG 3134 – Lecture 10: Regular Expressions and preg_match in PHP, and Validating Inputs. Michael Moshell, University of Central Florida. Internet Software Design

2 Part 1: Regular Expressions
A "grammar" for validating input; useful for many kinds of pattern recognition.
The basic built-in function is called 'preg_match'. It takes two or three arguments:
- the pattern, like "/cat/" (the slashes are delimiters that mark where the pattern starts and ends)
- the test string, like "catastrophe"
- an (optional) array variable, which we can ignore for now
It returns 1 (which acts as TRUE) if the pattern matches the test string, and 0 (which acts as FALSE) if it does not.
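A minimal sketch of the call described above. preg_match returns the integer 1 for a match, 0 for no match, and false on error, so a strict comparison is one way to get a true Boolean. The helper name has_match is made up for this illustration and is not part of PHP.

```php
<?php
// Hypothetical helper: wraps preg_match so the result is a strict boolean.
function has_match(string $pattern, string $subject): bool
{
    // preg_match returns 1 (match), 0 (no match) or false (error).
    return preg_match($pattern, $subject) === 1;
}

var_dump(preg_match("/cat/", "catastrophe")); // int(1)
var_dump(preg_match("/cat/", "dogma"));       // int(0)
var_dump(has_match("/cat/", "catastrophe"));  // bool(true)
```

Inside an if ( ) test the plain preg_match call works fine, because 1 acts like TRUE and 0 acts like FALSE.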

3 Regular Expressions
Note: There is also an older, POSIX-based regular expression system in PHP called ereg. I used it for years, but it is now DEPRECATED**. So we will use preg_match, which uses Perl-compatible regular expressions, in this course from now on.
** BONUS for those who attended today's lecture: Definition of DEPRECATED.

5 Regular Expressions
    $instring = "catastrophe";
    if (preg_match("/cat/", $instring)) {
        print "I found a cat!";
    } else {
        print "No cat here.";
    }
Output: I found a cat!

7 Regular Expressions
Wild cards: a period (.) matches any single character.
    $instring = "cotastrophe";
    if (preg_match("/c.t/", $instring)) {
        print "I found a c.t!";
    } else {
        print "No c.t here.";
    }
Output: I found a c.t!

9 Regular Expressions
Wild cards: .* matches any string of characters (including the empty string!)
    $instring = "cotastrophe";
    if (preg_match("/c.*t/", $instring)) {
        print "I found a c.*t!";
    } else {
        print "No c.*t here.";
    }
Output: I found a c.*t!

11 Regular Expressions
Wild cards: .* matches any string of characters, however long (including the empty string!)
    $instring = "cflippingmonstroustastrophe";
    if (preg_match("/c.*t/", $instring)) {
        print "I found a c.*t!";
    } else {
        print "No c.*t here.";
    }
Output: I found a c.*t!

12 Quantification
Multiple copies of something:
- a+ means ONE OR MORE a’s
- a* means ZERO OR MORE a’s
- a? means ZERO OR ONE a
- a{33} means exactly 33 instances of a
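The quantifiers above can be tried out with a short table-driven sketch. Each pattern here is wrapped in the ^ and $ anchors (covered in the "Starting and Ending" slide) so that the whole test string has to fit; the test strings themselves are made up for illustration.

```php
<?php
// Each row: pattern, test string, expected preg_match result.
$tests = [
    ['/^a+$/',   'aaa', 1], // one or more a's
    ['/^a+$/',   '',    0], // + needs at least one a
    ['/^a*$/',   '',    1], // * accepts zero a's
    ['/^ba?$/',  'b',   1], // ? accepts a missing a
    ['/^ba?$/',  'baa', 0], // ...but not two of them
    ['/^a{3}$/', 'aaa', 1], // exactly three a's
    ['/^a{3}$/', 'aa',  0], // two is not enough
];

foreach ($tests as [$pattern, $subject, $expected]) {
    $got = preg_match($pattern, $subject);
    print "$pattern on '$subject': $got (expected $expected)\n";
}
```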

13 Escaping
Backslash means "don't interpret this":
- \. is just a dot
- \* is just an asterisk

15 The concept
    $t = "/a{3}\.b{1,4}/";
    $s = "aaa.bbb";
Would $s be accepted or not? preg_match($t,$s): true or false?
TRUE, because $s matches the pattern string $t: three a's, one dot, and between one and four b characters.

17 The concept
    $t = "/a{3}\.b{1,4}/";
    $s = "aaa.bbbbb";
Would $s be accepted or not? preg_match($t,$s): true or false?
TRUE, because there ARE indeed 3 a's, a dot, and 4 b's. (Yes, there are MORE than 4 b's, but so what?) preg_match only has to find the pattern somewhere inside $s. As soon as a match is found, preg_match returns TRUE without examining the rest of the string $s.
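To see exactly which part of $s was matched, the optional third argument can be used. This is a small sketch; the anchored pattern on the last line is an addition for contrast (anchors are covered in the "Starting and Ending" slide).

```php
<?php
$t = "/a{3}\.b{1,4}/";
$s = "aaa.bbbbb";

$found = preg_match($t, $s, $matches);
print "$found\n";       // 1: a match exists inside $s
print "$matches[0]\n";  // aaa.bbbb (the fifth b is simply left over)

// Anchoring both ends forces the WHOLE string to fit the pattern.
print preg_match("/^a{3}\.b{1,4}$/", $s) . "\n"; // 0
```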

18 Grouping
Multiple copies of something:
- (abc)+ means ONE OR MORE copies of the string abc
- (abc)* means ZERO OR MORE copies of the string abc, like abcabcabc
SETS:
- [0-9] matches any single digit
- [A-Z] matches any single uppercase letter
- [AZ] matches A or Z
- [AZ]? matches the empty string, A or Z
- a{3}\.b{1,4} matches 3 a’s, a dot, and one to 4 b’s
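A short sketch that exercises the groups and sets listed above. The patterns are anchored with ^ and $ so the whole test string must fit, and the test strings are invented for illustration.

```php
<?php
// Each row: pattern, test string, expected preg_match result.
$checks = [
    ['/^(abc)+$/', 'abcabcabc', 1], // three copies of the group abc
    ['/^(abc)+$/', 'abcab',     0], // the last copy is incomplete
    ['/^[0-9]$/',  '7',         1], // one digit
    ['/^[A-Z]$/',  'q',         0], // lowercase is outside A-Z
    ['/^[AZ]$/',   'Z',         1], // A or Z only
    ['/^[AZ]$/',   'M',         0], // M is between them, but not in the set
    ['/^[AZ]?$/',  '',          1], // ? also allows the empty string
];

foreach ($checks as [$pattern, $subject, $expected]) {
    $got = preg_match($pattern, $subject);
    print "$pattern on '$subject': $got (expected $expected)\n";
}
```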

19 Starting and Ending
preg_match("/cat/","abunchofcats") is TRUE, but
preg_match("/^cat/","abunchofcats") is FALSE, because ^ means "the match must start at the first character of the string".
preg_match("/cats$/","abunchofcats") is TRUE, but
preg_match("/cats$/","mycatsarelazy") is FALSE.
So, ^ marks the head and $ marks the tail, for preg_match.

20 Sets - Examples
- [A-E]{3} matches AAA, ABA, ADD, ... EEE
- [PQX]{2,4} matches PP, PQ, PX, ... up to XXXX

21 Practice
1) Write a RE that recognizes any string that begins with "sale".
Here's an example for you to look at, to help you remember:
    preg_match("/^cat/",$teststring)

23 Practice
1) Write a RE that recognizes any string that begins with "sale".
Answer: preg_match("/^sale/",$teststring)
2) Write a RE that recognizes a string that begins with "smith" followed by a two digit integer, like smith23 or smith99.
Here's an example from your recent past: a{3}\.b{1,4}

24 Practice
1) Write a RE that recognizes any string that begins with "sale".
Answer: preg_match("/^sale/",$teststring)
2) Write a RE that recognizes a string that begins with "smith" followed by a two digit integer, like smith23 or smith99.
Answer: preg_match("/^smith[0-9]{2}/",$teststring)

25 Sucking Data from a String
Now it gets REALLY useful. Let's say we have an input like smith23 or jones99; it could be ANY name and ANY number. We want to split it into the name and the number. Here's how:
    preg_match("/^[a-zA-Z]+[0-9]+/", $in)
This would recognize the pattern, right? (Including roBert, etc.) Now put parentheses around each part you want to keep, and pass an array variable as the third argument:
    if (preg_match("/^([a-zA-Z]+)([0-9]+)/", $in, $reg)) {
        $name = $reg[1];
        $number = $reg[2];
        $wholething = $reg[0];
        // etc
    }
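The capture technique can be wrapped in a small function, sketched here. The function name split_name_number is hypothetical, and the trailing $ anchor is an extra assumption so that inputs with leftover characters after the number are rejected.

```php
<?php
// Hypothetical helper: splits "smith23" into ["smith", "23"], or returns null.
function split_name_number(string $in): ?array
{
    if (preg_match('/^([a-zA-Z]+)([0-9]+)$/', $in, $reg)) {
        // $reg[0] is the whole match; $reg[1] and $reg[2] are the two groups.
        return [$reg[1], $reg[2]];
    }
    return null;
}

print_r(split_name_number("smith23"));   // the array ["smith", "23"]
var_dump(split_name_number("23smith"));  // NULL
```

Note that the captured number is still a string ("23"); cast it with (int) if arithmetic is needed.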

28 Practice Problems
1) Write a function that
- prints "Good zip code" if the input is of the form 77889 (five digits);
- prints "Great zip code" if the input is of the form 33445-9999 (five digits, a dash and four digits);
- prints "Is that foreign?" for all other possible values.
    function ziptest($input)
    {
        if (preg_match("/^[0-9]{5}\-[0-9]{4}$/", $input))
            print "Great zip code";
        else if (preg_match("/^[0-9]{5}$/", $input))
            print "Good zip code";
        else
            print "Is that foreign?";
    } #end ziptest

30 Practice Problems
2) Write a function that returns the five digit part of any valid zipcode presented, as described above; and returns "invalid" if no zipcode was found in either of the two legal formats.
    function zipchopper($input)
    {
        if (preg_match("/^([0-9]{5})\-([0-9]{4})$/", $input, $reg))
            return $reg[1];
        else if (preg_match("/^([0-9]{5})$/", $input, $reg))
            return $reg[1];
        else
            return "invalid";
    } #end zipchopper
    print zipchopper("32766-0041");  // this returns 32766
    print zipchopper("32766");       // this returns 32766
    print zipchopper("32766-");      // this returns "invalid"
    print zipchopper("3276");        // this returns "invalid"
    print zipchopper("snortwoggle"); // this returns "invalid"

31 Here's a MIDTERM EXAM PRACTICE PROBLEM
Write a function that classifies any zipcode presented, using the two legal formats described above. If we refer to the input as X:
- if a valid input of type 1 is found (the 5 plus 4), the function returns "X is type 1 valid";
- if a valid input of type 2 is found (just 5 digits), the function returns "X is type 2 valid";
- if no zipcode was found in either of the two legal formats, the function returns "X is invalid".
    print zipchopper("32766-0041");  // this returns "32766-0041 is type 1 valid"
    print zipchopper("32766");       // this returns "32766 is type 2 valid"
    print zipchopper("32766-");      // this returns "32766- is invalid"
    print zipchopper("3276");        // this returns "3276 is invalid"
    print zipchopper("snortwoggle"); // this returns "snortwoggle is invalid"
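One possible solution sketch for the practice problem, offered as an illustration and not as the official answer key. It reuses the two anchored patterns from the earlier zip code slides; the type hints are an added assumption about how the function is called.

```php
<?php
// Sketch: classify $x as a type 1 (5+4), type 2 (5 digit) or invalid zipcode.
function zipchopper(string $x): string
{
    if (preg_match('/^[0-9]{5}-[0-9]{4}$/', $x)) {
        return "$x is type 1 valid";
    }
    if (preg_match('/^[0-9]{5}$/', $x)) {
        return "$x is type 2 valid";
    }
    return "$x is invalid";
}

print zipchopper("32766-0041") . "\n"; // 32766-0041 is type 1 valid
print zipchopper("32766-") . "\n";     // 32766- is invalid
```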

32 Part 2: Validation
Validation means checking to see if inputs are correct and providing corrective feedback if they are not.
1. MISSING: Required inputs that are not provided
2. OUT OF RANGE: Input is outside the specified upper/lower limits
3. FORMAT: Input data is not formatted correctly
4. PATTERN: Input doesn't match a secret pattern
5. MATCH LIST: Input doesn't match a stored list
6. INCONSISTENT: Inputs don't fit together
And there are many more ways to be wrong...

33 Missing Data
    $report = ""; // start with no errors
    $firstname = $_POST['firstname'];
    if (!$firstname) {
        $report .= " Please provide your first name. ";
    }
    // do the same for last name, all other required inputs.
    if ($report) {
        print "Please fix these errors: $report ";
        exit; // ends execution here. (Should print the closing HTML tags first, etc.)
    } else {
        // process the inputs normally
    }

34 Out of range
    $response = ""; // start with no errors
    $age = $_POST['age'];
    if ($age < 0) {
        $response .= " Are you really $age years old? ";
    }
    // do the same for $age>100, and any other inputs.
    if ($response) { // anything here acts like TRUE; an empty variable acts like FALSE.
        print "Please fix these errors: $response ";
    } else {
        // process the inputs normally
    }
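The range test can be combined with a "is it even a number?" test, as in this sketch. The function name check_age and the first error message are assumptions made for illustration; the limits 0 and 100 come from the slide.

```php
<?php
// Hypothetical helper: returns an error message, or "" if $age is acceptable.
function check_age(string $age, int $min = 0, int $max = 100): string
{
    // ctype_digit accepts only strings made entirely of digits,
    // so "", "-5", "12.5" and "abc" are all rejected here.
    if (!ctype_digit($age)) {
        return "Please enter your age as a whole number.";
    }
    $n = (int) $age;
    if ($n < $min || $n > $max) {
        return "Are you really $age years old?";
    }
    return "";
}

print check_age("150") . "\n"; // Are you really 150 years old?
```

Form inputs always arrive as strings, which is why the sketch checks the string before converting it to an integer.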

35 Wrong Format
Social Security Numbers: 333-44-6789
    if (!preg_match("/^[0-9]{3}\-[0-9]{2}\-[0-9]{4}$/", $tryssn)) {
        // do your error thang...
    }
(The $ at the end rejects inputs with extra characters after the last four digits.)
But sometimes you can tolerate bad formats: if it's got nine digits, maybe it's right.
    $cleanssn = preg_replace('/[^0-9]/', '', $tryssn);
Any char not in [0-9] is replaced by an empty string (that is, two quotes with no space).

36 A brief commercial interruption
    $output = preg_replace($pattern, $replacement, $input);
is a MOST AWESOME FUNCTION. Read all about it: it can do magic if you use arrays for the $pattern or the $replacement or both.
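A small sketch of the array form mentioned above. When both arguments are arrays, each pattern is paired with the replacement at the same position and the pairs are applied one after another. The input string and the two patterns are invented for illustration.

```php
<?php
// Step 1 collapses runs of whitespace into one space;
// step 2 strips a single leading or trailing space.
$patterns     = ['/\s+/', '/^ | $/'];
$replacements = [' ',     ''];

$input  = "  Michael    Moshell ";
$output = preg_replace($patterns, $replacements, $input);

print "[$output]\n"; // [Michael Moshell]
```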

37 Wrong Format
Social Security Numbers: 333-44-6789
    if (!preg_match("/^[0-9]{3}\-[0-9]{2}\-[0-9]{4}$/", $tryssn)) {
        // do your error thang...
    }
But sometimes you can tolerate bad formats: if it's got nine digits, maybe it's right.
    $cleanssn = preg_replace('/[^0-9]/', '', $tryssn);
Unfortunately, ^ inside the [ ] means NOT, but ^ outside the [ ] means 'at the beginning'.
Any char not in [0-9] is replaced by an empty string (that is, two quotes with no space).

38 Wrong Format
Social Security Numbers: 333-44-6789
    if (!preg_match("/^[0-9]{3}\-[0-9]{2}\-[0-9]{4}$/", $tryssn)) {
        // do yo error thang...
    }
But sometimes you can tolerate bad formats: if it's got nine digits, maybe it's right.
    $cleanssn = preg_replace('/[^0-9]/', '', $tryssn);
    if (strlen($cleanssn) != 9) {
        // do yo error thing again...
    }
Any char not in [0-9] is replaced by an empty string (that is, two quotes with no space).
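The "tolerant" approach can be packaged as one function, sketched below. The function name clean_ssn and the choice to return null on failure are assumptions for illustration.

```php
<?php
// Hypothetical helper: returns the nine digits of an SSN, or null if
// the input does not contain exactly nine digits.
function clean_ssn(string $tryssn): ?string
{
    // Remove every character that is NOT a digit.
    $cleanssn = preg_replace('/[^0-9]/', '', $tryssn);
    return strlen($cleanssn) === 9 ? $cleanssn : null;
}

var_dump(clean_ssn("333-44-6789")); // string(9) "333446789"
var_dump(clean_ssn("333 44 678"));  // NULL
```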

39 Pattern
Enter your promotion code:
    $testcode = $_POST['testcode'];
    if (!preg_match("/^Woody[0-9]{4}$/", $testcode)) {
        $reply = "Sorry, that code was not accepted.";
    } else {
        // do your acceptance thing.. give them the discount.
    }
(The $ at the end rejects codes with extra characters after the four digits.)

40 Match List, e.g. a raffle
This is just like the password problem.
VERSION 1: short lists
    $try = $_POST['tryinput'];
    $good['Samclub'] = 100;
    $good['Samfriend'] = 50;
    $good['Samwitch'] = 25;
    $prize = $good[$try] ?? 0; // 0 if $try is not in the list
    if ($prize) {
        $result = "Congrats! You have won \$$prize.";
    } else {
        // do whatever you want with this loser
    }

41 Match List, e.g. a raffle
VERSION 2: Long (and changing) lists. Use a DB!
    // Note: the mysql_* functions were removed in PHP 7; current PHP uses mysqli or PDO.
    $try = $_POST['tryinput'];
    $q = "SELECT * FROM checklist WHERE checkval='$try' ";
    $result = mysql_query($q, $connection);
    $row = $result ? mysql_fetch_array($result) : false;
    if ($row) { // false when the query failed or no row matched
        $prize = $row[1];
        $result = "Congrats! You have won \$$prize.";
    } else {
        // do whatever you want with this loser
    }

42 >> MAJOR WARNING <<
This example contains a NO-NO.
    $try = $_POST['tryinput'];
    $q = "SELECT * FROM checklist WHERE checkval='$try' ";
    $result = mysql_query($q, $connection);
NOTE: DO NOT TRY THIS AT HOME, because it can be the victim of an SQL INJECTION ATTACK (see Security Lecture).

43 >> MAJOR WARNING <<
If you gotta try it sooner, here's the prophylactic:
    $try = $_POST['tryinput'];
    $try = mysql_real_escape_string($try);
    $q = "SELECT * FROM checklist WHERE checkval='$try' ";
    $result = mysql_query($q, $connection);
Escaping the input protects this query from an SQL INJECTION ATTACK, because $try is placed inside quotes in the SQL.
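A different technique from the one on the slide, shown only as a sketch: a prepared statement with PDO, where the user input is passed separately from the SQL text. The assumptions here are loud ones: an in-memory SQLite table stands in for the lecture's MySQL "checklist" table, the column name prize and the function name find_prize are invented, and the PDO SQLite driver must be installed.

```php
<?php
// Assumption: in-memory SQLite replaces the lecture's MySQL database.
$db = new PDO("sqlite::memory:");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE checklist (checkval TEXT, prize INTEGER)");
$db->exec("INSERT INTO checklist VALUES ('Samclub', 100)");
$db->exec("INSERT INTO checklist VALUES ('Samfriend', 50)");

// Hypothetical helper: looks up the prize for $try, or returns null.
function find_prize(PDO $db, string $try): ?int
{
    // The ? placeholder keeps the user input out of the SQL text entirely.
    $stmt = $db->prepare("SELECT prize FROM checklist WHERE checkval = ?");
    $stmt->execute([$try]);
    $prize = $stmt->fetchColumn(); // false when no row matched
    return $prize === false ? null : (int) $prize;
}

var_dump(find_prize($db, "Samclub"));      // int(100)
var_dump(find_prize($db, "x' OR '1'='1")); // NULL: treated as plain text
```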

44 Consistency
The form has a checkbox labeled "I am a member of the Meatball Fraternity" and a text field labeled "Member Number".
    $report = ""; // start with no errors
    $checkin = $_POST['checkin'];
    $memnum = $_POST['memnum'];
    if ($checkin && !$memnum) {
        $report .= 'Please provide your member number.';
    }
    if (!$checkin && $memnum) {
        $report .= 'You did not check the membership box.';
    }
    if ($checkin && $memnum) $ismember = 1;
    if ($report) {
        // do the error thing
    }

45 A basic decision about validation
OPTION 1: One error ends the scan, vs.
OPTION 2: Report all the errors on the current page.
To do the 'all errors' technique, accumulate your messages in a $results variable, then print the messages all at once.
Where we are not going in this course: "Instant validation" with Javascript and AJAX. It's spiffy but it is hard to maintain and extend.
- Javascript: does all the work in the client.
- AJAX: the Javascript gets help from the server.
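A minimal sketch of the 'all errors' technique. It collects messages in an array instead of a string, which is a small variation on the accumulate-then-print idea; the function name validate_all, the field names and the messages are assumptions for illustration.

```php
<?php
// Hypothetical helper: checks every field and returns ALL error messages.
function validate_all(array $post): array
{
    $errors = [];
    if (empty($post['firstname'])) {
        $errors[] = "Please provide your first name.";
    }
    $age = (string) ($post['age'] ?? '');
    if (!ctype_digit($age) || (int) $age > 100) {
        $errors[] = "Please provide an age between 0 and 100.";
    }
    $zip = (string) ($post['zip'] ?? '');
    if (!preg_match('/^[0-9]{5}(-[0-9]{4})?$/', $zip)) {
        $errors[] = "Please provide a valid zip code.";
    }
    return $errors;
}

$errors = validate_all(['firstname' => '', 'age' => '200', 'zip' => '3276']);
if ($errors) {
    print "Please fix these errors:\n" . implode("\n", $errors) . "\n";
}
```

For OPTION 1, the same function could be used by printing only the first message, $errors[0], and stopping there.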

46 What should you do about validation?
*** Any input on Project 2, 3 or 4 that can be validated, SHOULD BE. (Brag about it, so I'll test it!!)
*** There will be questions about validation on the midterm exam.
SO... practice with Regular Expressions, TEST them using WAMP or MAMP, and DEBUG them if they don't work.

