Fall 2006 CSE 467/567 1 RE review (Perl syntax) single-character disjunction: [aeiou] ranges: [0-9] negation: [^aeiou] conjunction: /cat/ matching zero.


1 RE review (Perl syntax)
– single-character disjunction: [aeiou]
– ranges: [0-9]
– negation: [^aeiou]
– conjunction: /cat/
– matching zero or one: /cats?/
– Kleene * and +: /[ab]+/ matches ‘a’, ‘b’, ‘aa’, ‘ab’, ‘ba’, ‘bb’, etc.
– wildcard: /c.t/ matches “cat”, “cbt”, “cct”, …
– anchors: ^, $, \b, \B
/projects/CSE467/Resources/Code/Perl
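
The constructs reviewed above can be tried out with a short script. This is only a sketch: the test strings and the `@checks` table are made up for illustration and do not come from the course code directory.

```perl
use strict;
use warnings;

# Each entry: [pattern, test string, whether a match is expected].
# All strings here are illustrative assumptions.
my @checks = (
    [ qr/[aeiou]/,  "sky",         0 ],  # no vowel to match
    [ qr/[0-9]/,    "CSE467",      1 ],  # contains a digit
    [ qr/[^aeiou]/, "aei",         0 ],  # every character is a vowel
    [ qr/cats?/,    "cat",         1 ],  # the 's' is optional
    [ qr/^[ab]+$/,  "abba",        1 ],  # one or more of 'a' or 'b'
    [ qr/c.t/,      "cbt",         1 ],  # '.' matches any one character
    [ qr/\bcat\b/,  "concatenate", 0 ],  # 'cat' is not a whole word here
    [ qr/\Bcat\B/,  "concatenate", 1 ],  # 'cat' sits inside a word
);

my $failures = 0;
for my $check (@checks) {
    my ($re, $text, $expected) = @$check;
    my $got = ($text =~ $re) ? 1 : 0;
    $failures++ if $got != $expected;
    printf "%-22s %-12s %s\n", $re, $text, $got ? "match" : "no match";
}
print "unexpected results: $failures\n";
```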

2 Conjunction
Two regular expressions are conjoined by juxtaposition (placing the expressions side by side). Examples:
– /a/ matches ‘a’
– /m/ matches ‘m’
– /am/ matches ‘am’ but not ‘a’ or ‘m’ alone
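
A minimal sketch of conjunction in Perl, assuming the two sub-patterns are held in variables (`$first`, `$second` and the test strings are illustrative names, not from the slides). Interpolating them side by side builds the conjoined pattern.

```perl
use strict;
use warnings;

my $first  = qr/a/;
my $second = qr/m/;
my $both   = qr/$first$second/;   # juxtaposition: 'a' immediately followed by 'm'

my %matched;
for my $text ("am", "a", "m", "ham") {
    $matched{$text} = ($text =~ $both) ? 1 : 0;
    print "$text: ", ($matched{$text} ? "match" : "no match"), "\n";
}
```

Note that ‘ham’ matches as well: without anchors, a pattern only has to match somewhere inside the string.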

3 Disjunction
We have already seen disjunction of characters using the square bracket notation. General disjunction is expressed using the vertical bar (|), also called the pipe symbol. This form of disjunction allows us to match any one of several alternative patterns, not just single characters as with the [ ] form.
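
The difference between the two forms of disjunction can be sketched as follows. The word list is an assumption chosen for illustration.

```perl
use strict;
use warnings;

my @words = ("hotdog", "catalog", "bird", "cot");

# [ ] chooses among single characters; | chooses among whole patterns.
my @bracket = grep { /c[ao]t/ }  @words;   # contains 'cat' or 'cot'
my @pipe    = grep { /cat|dog/ } @words;   # contains 'cat' or 'dog'

print "c[ao]t:  @bracket\n";
print "cat|dog: @pipe\n";
```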

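4 Grouping
Parentheses, ‘(’ and ‘)’, are used to group subpatterns of a larger pattern. Ex: /[Gg](ee|oo)se/ matches ‘geese’ and ‘goose’, with either case of ‘g’. Because | binds more loosely than juxtaposition, the alternatives must sit inside a single group: /[Gg](ee)|(oo)se/ would instead mean “[Gg]ee or oose”.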
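
Where the parentheses go changes what the pattern means, because | has lower precedence than juxtaposition. A small sketch comparing two placements (the variable names and test strings are illustrative assumptions):

```perl
use strict;
use warnings;

my $ungrouped = qr/[Gg](ee)|(oo)se/;   # reads as: "[Gg]ee" OR "oose"
my $grouped   = qr/[Gg](ee|oo)se/;     # reads as: [Gg], then "ee" or "oo", then "se"

my (%ungrouped_hit, %grouped_hit);
for my $text ("goose", "Geese", "Gee", "oose") {
    $ungrouped_hit{$text} = ($text =~ $ungrouped) ? 1 : 0;
    $grouped_hit{$text}   = ($text =~ $grouped)   ? 1 : 0;
    printf "%-6s ungrouped=%d grouped=%d\n",
        $text, $ungrouped_hit{$text}, $grouped_hit{$text};
}
```

Only the grouped form rejects ‘Gee’ and ‘oose’ while still accepting ‘goose’ and ‘Geese’.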

5 Replacement
In addition to matching, we can do replacements when a match is found. Example: to replace the British spelling ‘colour’ with the American spelling ‘color’, we can write: s/colour/color/
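
A short sketch of the substitution applied to a string. The sample sentence is an assumption, and the /g modifier, which the slide does not use, is shown alongside because plain s/// replaces only the first match.

```perl
use strict;
use warnings;

my $text = "The colour of the colourful flag";

# Without /g, only the first occurrence is replaced.
my $first_only = $text;
$first_only =~ s/colour/color/;

# With /g, every occurrence is replaced.
my $all = $text;
$all =~ s/colour/color/g;

print "$first_only\n";
print "$all\n";
```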

6 Registers – saving matches
To save the match from part of a pattern for reuse later on, Perl provides registers. Registers are named \#, where # is the number of the register; register n holds the text matched by the nth parenthesized group.
Ex.
DE DO DO DO DE DA DA DA
IS ALL I WANT TO SAY TO YOU
/(D[AEO].)*/ will match the first line
/(D[AEO])(.D[AEO])\2\2\s\1(.D[AEO])\3\3/ matches it more specifically
This pattern also matches strings like DA DE DE DE DA DO DO DO
\s matches a whitespace character
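
The register pattern can be checked with a few lines of Perl. This sketch assumes the pattern is written without literal spaces between its parts, since a space inside a Perl pattern matches a space in the text unless the /x modifier is used. The ^ and $ anchors and the third test string are additions for illustration.

```perl
use strict;
use warnings;

# Groups 2 and 3 each capture a separator character plus a syllable,
# so \2 and \3 repeat the separator along with the syllable.
my $re = qr/^(D[AEO])(.D[AEO])\2\2\s\1(.D[AEO])\3\3$/;

my %result;
for my $line ("DE DO DO DO DE DA DA DA",
              "DA DE DE DE DA DO DO DO",
              "DE DO DO DO DA DA DA DA") {   # fifth syllable differs from the first
    if ($line =~ $re) {
        $result{$line} = "1='$1' 2='$2' 3='$3'";
    } else {
        $result{$line} = "no match";
    }
    print "$line: $result{$line}\n";
}
```

Inside the pattern the registers are written \1, \2, \3; after a successful match Perl makes the same saved text available as $1, $2, $3.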

7 For more information
PERL Regular Expression TUTorial – http://perldoc.perl.org/perlretut.html
PERL Regular Expression reference page – http://perldoc.perl.org/perlre.html

8 Eliza
– Published by Weizenbaum in 1966
– Modelled a Rogerian therapist
– Had no intelligence – worked by pattern matching and replacement
– Had some people convinced that it really understood!
– Demo at http://chayden.net/eliza/Eliza.shtml
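
To give a feel for how far plain pattern matching can go, here is a toy responder in the Eliza style. It is only a sketch: the `respond` function and its three rules are hypothetical illustrations, not Weizenbaum's script and not the code behind the demo linked above.

```perl
use strict;
use warnings;

# Hypothetical rules, tried in order; the first one that matches wins.
sub respond {
    my ($input) = @_;
    my $text = uc $input;            # normalize case
    $text =~ s/[.!?]+\s*$//;         # drop trailing punctuation

    return "I AM SORRY TO HEAR YOU ARE $1"
        if $text =~ /\bI(?:'M| AM) (DEPRESSED|SAD)\b/;
    return "IN WHAT WAY"
        if $text =~ /\bALL\b/;
    return "CAN YOU THINK OF A SPECIFIC EXAMPLE"
        if $text =~ /\bALWAYS\b/;
    return "PLEASE GO ON";           # fallback when no rule applies
}

for my $line ("Men are all alike.",
              "They're always bugging us about something.",
              "I'm depressed much of the time.") {
    print "> $line\n", respond($line), "\n";
}
```

The reply to the third input reuses the word saved in register 1, which is the same mechanism as the registers on slide 6.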

9 Wordcount program
The Unix wordcount program (wc) counts lines, words and characters.
Determining counts & probabilities of words has many applications:
– augmentative communication
– context-sensitive spelling error correction
– speech recognition
– hand-writing recognition

10 Counting words in a corpus (preview)
    #!/usr/bin/perl
    # FROM Perl BOOK, PAGE 39
    $/ = "";        # Enable paragraph mode.
    # $* = 1;       # ENABLE multi-line patterns. Left commented out: $* is not
                    # supported in current Perl, and no pattern below uses ^ or $.

    # Now read each paragraph and split into words. Record each
    # instance of a word in the %wordcount associative array.
    $total = 0;
    while (<>) {
        s/-\n//g;       # Dehyphenate hyphenations (across lines)
        s/ //g;         # Remove
        tr/A-Z/a-z/;    # Canonicalize to lowercase.
        @words = split(/\W*\s+\W*/, $_);
        foreach $word (@words) {
            $wordcount{$word}++;    # Increment the entry.
            $total++;               # Count every word token.
        }
    }

    # Now print out all the entries in the %wordcount array
    foreach $word (sort keys(%wordcount)) {
        # %% prints a literal percent sign.
        printf "(%8.6f%%) %20s occurs %3d time(s)\n",
            (100 * $wordcount{$word}/$total), $word, $wordcount{$word};
    }
    # The number of distinct words is the number of keys, not $total.
    printf "Total number of distinct words is %d.\n", scalar(keys %wordcount);

