Fall 2007 CS 225
Miscellaneous Topics
Cloning
Patterns
Recursion and Grammars

The Shallow Copy Problem

Cloning
The purpose of cloning in object-oriented programming is analogous to cloning in biology
  – Create an independent copy of an object
Initially, both objects will store the same information
Since they are different objects, you can change one object without affecting the other

With a shallow copy, e1 and e2 share a single address object, so the statement e1.setAddressLine1("Room 224"); creates a new String object that is referenced by both e1.address.line1 and e2.address.line1

The Object.clone method
Java provides the Object.clone method to help solve the shallow copy problem
The copy made by Object.clone is itself a shallow copy: the current object's data fields are copied as they are, so references still point to shared components
To make a deep copy, you must create cloned copies of all components by invoking their respective clone methods

Cloning
After e1.setAddressLine1("Room 224"); only e1.address.line1 references the new String object.
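
The difference between the two copies can be sketched in Java. The Employee and Address classes below are hypothetical stand-ins modeled on the e1/e2 example (the slides do not show their source), and shallowCopy is an illustrative helper, not part of any library.

```java
// Hypothetical classes modeled on the slides' e1/e2 example.
class Address implements Cloneable {
    String line1;

    Address(String line1) { this.line1 = line1; }

    @Override
    public Address clone() {
        try {
            // The only field is an immutable String, so a field copy is enough here.
            return (Address) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("Address implements Cloneable", e);
        }
    }
}

class Employee implements Cloneable {
    String name;
    Address address;

    Employee(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    void setAddressLine1(String line1) { address.line1 = line1; }

    // Shallow copy (illustrative helper): the copy shares the same Address object.
    Employee shallowCopy() {
        return new Employee(name, address);
    }

    // Deep copy: Object.clone copies the fields, then the component is cloned too.
    @Override
    public Employee clone() {
        try {
            Employee copy = (Employee) super.clone();
            copy.address = address.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("Employee implements Cloneable", e);
        }
    }
}

class CloneDemo {
    public static void main(String[] args) {
        Employee e1 = new Employee("Pat", new Address("Room 100"));
        Employee shallow = e1.shallowCopy();
        Employee deep = e1.clone();

        e1.setAddressLine1("Room 224");

        System.out.println(shallow.address.line1); // Room 224 (shares e1's Address)
        System.out.println(deep.address.line1);    // Room 100 (has its own Address)
    }
}
```

Only the deep copy is unaffected by the change to e1, because it owns a separate Address object.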

Regular Expressions
First described by Stephen Kleene
Used for pattern matching
  – Unix utilities like grep and awk
  – built into many scripting languages (e.g. Perl)
  – libraries exist for other languages (Pattern and Matcher classes in Java)
No standard notation
  – Many languages use Perl Compatible Regular Expressions
Useful for describing things like identifiers and numbers for a programming language

Regular Expression Components
Atoms - the characters that can be combined to make the pattern being described
Concatenation - a sequence of atoms
Alternation - a choice between several patterns
Kleene closure (*) - 0 or more occurrences
Positive closure (+) - 1 or more occurrences
Nothing (the empty string)

Patterns and Matching
A pattern is generally enclosed between a matched pair of characters, most commonly slashes
  – /pattern/
Languages that support pattern matching may have a match operator
Language   Match operator   No-match operator
Perl       =~, m//          !~
AWK        ~                !~

Metacharacters
Characters that have a special meaning within a pattern
|     OR
?     0 or 1 occurrences
+     1 or more occurrences
*     0 or more occurrences
( )   used to group characters
[ ]   used to enclose a character class
$     matches end of string
^     matches beginning of string
\     escape character
.     any single character

Simple Examples
A single character: /a/
  – Matches any string that contains the letter a
A sequence of characters
  – /ab/ matches any string that contains the letter a followed immediately by the letter b
  – /bird/ matches any string that contains the word bird
  – /Regular/ matches any string that contains the word Regular (matches are case-sensitive by default)

More Examples
Any character: a.
  – a followed by any character
A choice of two characters: a|b
  – matches a, b, ac, ab, bc but not cd, ef
Optional repeated character: ab*
  – matches a, ab, abb, abbbb, abracadabra
Optional repeated sequence: a(bc)*
  – matches a, abc, abcbc
At least one of a sequence: ab+
  – matches ab, abb, abbbb, abracadabra

Character Classes
You can put a set of characters inside square brackets to create a character class
  – [abc] means any one of a, b or c
A ^ as the first character means any character that isn't in the set
  – [^abc] means any character except a, b or c
You can also specify ranges of characters (based on ASCII codes)
  – [0-9] is any digit
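
The example patterns can be tried out in Java. This is a small sketch using java.util.regex; the contains helper is a hypothetical name introduced here to mimic the "string contains a match" behavior of /pattern/, which corresponds to Matcher.find.

```java
import java.util.regex.Pattern;

class ContainsDemo {
    // True when some substring of text matches regex, like /regex/ in the examples.
    // (contains is an illustrative helper, not a library method.)
    static boolean contains(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(contains("ab*", "abracadabra")); // true
        System.out.println(contains("a(bc)*", "abcbc"));    // true
        System.out.println(contains("ab+", "abb"));         // true
        System.out.println(contains("ab+", "a"));           // false: needs at least one b
        System.out.println(contains("a|b", "cd"));          // false: neither a nor b
        System.out.println(contains("[^abc]", "abc"));      // false: every character is in the set
        System.out.println(contains("[0-9]", "Room 224"));  // true
    }
}
```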

Regular Expressions for String Manipulation
split(regexp, string) tokenizes a string, using matches of regexp as the separators
s/regexp/replacement/ substitutes replacement for the first match of regexp
  – g at the end means substitute for all occurrences
Expression memory allows you to remember what matched the parts of the pattern in parentheses
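
The operations above are written in Perl-style notation. A rough Java equivalent, offered as a sketch, uses the String methods split, replaceFirst and replaceAll; in a Java replacement string, $1 and $2 refer to the parenthesized groups. The sample strings are made up for illustration.

```java
import java.util.Arrays;

class StringOpsDemo {
    public static void main(String[] args) {
        // split: tokenize on runs of whitespace
        String[] tokens = "the quick  brown fox".split("\\s+");
        System.out.println(Arrays.toString(tokens)); // [the, quick, brown, fox]

        // replaceFirst: like s/an/AN/ (first occurrence only)
        System.out.println("banana".replaceFirst("an", "AN")); // bANana

        // replaceAll: like s/an/AN/g (all occurrences)
        System.out.println("banana".replaceAll("an", "AN")); // bANANa

        // expression memory: swap the two remembered groups
        System.out.println("Riley, David".replaceAll("(\\w+), (\\w+)", "$2 $1")); // David Riley
    }
}
```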

Regular Expressions in Java
Java has classes for using regular expressions
  – The String class has a matches method whose parameter is a regular expression
  – The java.util.regex package has classes that can be used for pattern matching operations
      Pattern represents a compiled regular expression
      Matcher performs pattern matching operations on an input string
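
A short sketch of both approaches follows. Note that String.matches succeeds only when the whole string matches, while Matcher.find looks for matches anywhere in the input. The identifier pattern and the identifiers helper are assumptions chosen for illustration (a letter followed by letters or digits).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JavaRegexDemo {
    // Assumed identifier syntax: a letter followed by letters or digits.
    static final Pattern IDENT = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    // Collects every identifier found in text (illustrative helper).
    static List<String> identifiers(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = IDENT.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // String.matches tests the entire string against the pattern.
        System.out.println("bird".matches("bird"));      // true
        System.out.println("blackbird".matches("bird")); // false

        // Matcher.find scans for successive matches inside the input.
        System.out.println(identifiers("x1 = y22 + 7")); // [x1, y22]
    }
}
```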

Recursion and Grammars

Grammars and Recursion
A grammar is a formal description of a language
  – For natural languages, this is hard
  – In order to be able to use a program to translate a program, programming languages need to be relatively simple
Regular expressions can be used to specify the simplest grammars
BNF is a notation that was invented to describe programming languages

Backus-Naur Form
Generally referred to as BNF
Most widely known method for describing programming language syntax
A BNF description of a language consists of a set of rules that can be used to generate statements in the language

BNF Rules
The left hand side of a rule is a non-terminal: something that is built from smaller pieces
  – there may be several rules for a single non-terminal
The right hand side of a rule consists of terminals and other non-terminals in the order they need to occur
  – Terminals are typically keywords and symbols
A grammar is a collection of rules

Sample Grammar
<S> -> <A><B>
<A> -> aa<A>
<A> -> a
<B> -> b
<S> is known as the start symbol

Derivations
BNF is a generative device
  – Use a grammar to generate sentences that belong to the language the grammar describes
A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)

Sample Derivations
<S> -> <B> -> b
<S> -> <A><B> -> a<B> -> ab
<S> -> <A><B> -> aa<A><B> -> aaaa<A><B> -> aaaaa<B> -> aaaaab<B> -> aaaaabb<B> -> aaaaabbb

Extended BNF
Optional parts are placed in brackets [ ]
  -> ident [( )]
Alternative parts of RHSs are placed inside parentheses and separated by vertical bars
  → (+|-) const
Repetitions (0 or more) are placed inside braces { }
  → letter {letter|digit}

Sample Grammar
<S> -> <A><B> | <B>
<A> -> aa<A> | a
<B> -> b<B> | b

What does this have to do with recursion?
One of the common techniques for parsing a program (checking its syntax and putting it into a form that can be translated into an executable format) uses the BNF rules to implement a set of recursive methods

Recursive-Descent Parsing
There is a method for each nonterminal in the grammar
  – If there are several rules, the method needs to determine which to use
  – The method checks that each element in the rule is present
EBNF is ideally suited for being the basis for a recursive-descent parser, because EBNF minimizes the number of nonterminals

Recursive-Descent Methods
For a single rule:
  – For each terminal symbol in the RHS, compare it with the next input token; if they match, continue, else there is an error
  – For each nonterminal symbol in the RHS, call its associated parsing subprogram

Recursive-Descent Methods
A nonterminal that has more than one RHS requires an initial process to determine which RHS it is to parse
  – The correct RHS is chosen on the basis of the next token of input
  – The next token is compared with the first token that can be generated by each RHS until a match is found
  – If no match is found, it is a syntax error

Example
Our sample grammar would have three methods
  – S()
  – A()
  – B()
Algorithm for S()
  call A()
  call B()

Algorithm for A()
  check for an a
    if present
      check for a second a
        if present, call A()
    else fail
After A() returns, S() calls B()

Algorithm for B()
  check for a b
    if present
      check for a second b
        if present, call B()
    else fail
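
The three algorithms can be turned into a small recursive-descent recognizer. This is a sketch only: it assumes the grammar S -> A B, A -> aa A | a, B -> b B | b (the nonterminal names follow the S(), A(), B() methods above), treats each character as a token, and uses hypothetical names (ABParser, accepts, nextIs, expect) that are not from the slides.

```java
class ABParser {
    private final String input;
    private int pos = 0; // index of the next unread token (character)

    ABParser(String input) { this.input = input; }

    // True if the whole input is a sentence of the assumed grammar.
    static boolean accepts(String text) {
        ABParser p = new ABParser(text);
        try {
            p.S();
            return p.pos == text.length(); // no tokens may be left over
        } catch (IllegalStateException syntaxError) {
            return false;
        }
    }

    // Looks at the next token without consuming it.
    private boolean nextIs(char c) {
        return pos < input.length() && input.charAt(pos) == c;
    }

    // Consumes the next token if it is c; otherwise reports a syntax error.
    private void expect(char c) {
        if (!nextIs(c)) {
            throw new IllegalStateException("expected '" + c + "' at position " + pos);
        }
        pos++;
    }

    // S -> A B
    private void S() {
        A();
        B();
    }

    // A -> aa A | a
    private void A() {
        expect('a');
        if (nextIs('a')) { // a second a selects the rule aa A
            pos++;
            A();
        }
    }

    // B -> b B | b
    private void B() {
        expect('b');
        if (nextIs('b')) { // another b selects the rule b B
            B();
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("ab"));       // true
        System.out.println(accepts("aaaaabbb")); // true
        System.out.println(accepts("aab"));      // false: A needs an odd number of a's
    }
}
```

Each method decides which right-hand side to use by peeking at the next token, which is why a single token of lookahead is enough for this grammar.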

Finite State Machines
Material from The Object of Data Abstraction and Structures Using Java by David Riley

An entire collection of useful table-driven algorithms makes use of a theoretical concept known as a finite state machine (FSM).
Example Algorithm
Input a stream of “bits” (‘0’ or ‘1’ characters) from a text stream. Output to the standard output stream, according to the following rules:
1) Write an ‘X’ for every odd-numbered bit (i.e., 1st, 3rd, 5th, etc.)
2) Write a ‘Z’ for every even-numbered bit with value of ‘0’.
3) Write an ‘N’ for every even-numbered bit with value of ‘1’.
Sample input stream / resulting output: 0 X ZXNXNXZXN
Question: How could we draw a picture of this algorithm?
The Object of Data Abstraction and Structure, David D. Riley © Addison Wesley pub.

A graphical model for the preceding algorithm is shown below.
[Diagram: two states, After Even (the start state) and After Odd. Arcs from After Even to After Odd are labeled 0 / X and 1 / X; arcs from After Odd to After Even are labeled 0 / Z and 1 / N.]
Notation
Each circle is a separate state.
Each arc represents a potential transition from one state to another.
The label a / b denotes an input symbol of a and output of b.
One state has an incoming arc without a state on the opposite end. This state is called the start state.

[Animation: the machine is traced on the sample input stream. It alternates between After Even and After Odd, and the resulting output grows by one symbol per input bit: X, XZ, XZX, XZXN, XZXNX, XZXNXN, XZXNXNX, XZXNXNXZ, XZXNXNXZX, XZXNXNXZXN.]

An FSM (Finite State Machine) is a 6-tuple. Name the parts below.
1) A set of states: { AfterEven, AfterOdd }
2) A start state: AfterEven
3) A set of input symbols: { 0, 1 }
4) A set of output symbols: { N, X, Z }
5) A next state function (State × Input → State):
   NextState(AfterEven, 0) → AfterOdd
   NextState(AfterEven, 1) → AfterOdd
   NextState(AfterOdd, 0) → AfterEven
   NextState(AfterOdd, 1) → AfterEven
6) An output function (State × Input → Output):
   Output(AfterEven, 0) → X
   Output(AfterEven, 1) → X
   Output(AfterOdd, 0) → Z
   Output(AfterOdd, 1) → N
Actions other than output are also permitted in some FSMs.
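
The six parts map directly onto a table-driven implementation. The following Java sketch encodes the AfterEven/AfterOdd machine as two lookup tables; the class name ParityFsm, the run method, and the sample input in main are assumptions made for illustration.

```java
class ParityFsm {
    // 1) Set of states, encoded as row indexes into the tables.
    static final int AFTER_EVEN = 0;
    static final int AFTER_ODD = 1;

    // 5) Next state function: NEXT_STATE[state][input bit]
    static final int[][] NEXT_STATE = {
        { AFTER_ODD,  AFTER_ODD  }, // from AfterEven on 0, 1
        { AFTER_EVEN, AFTER_EVEN }, // from AfterOdd on 0, 1
    };

    // 6) Output function: OUTPUT[state][input bit]
    static final char[][] OUTPUT = {
        { 'X', 'X' }, // in AfterEven the next bit is odd-numbered
        { 'Z', 'N' }, // in AfterOdd the next bit is even-numbered
    };

    // Runs the machine over a string of '0'/'1' characters and returns the output.
    static String run(String bits) {
        StringBuilder out = new StringBuilder();
        int state = AFTER_EVEN; // 2) start state
        for (char c : bits.toCharArray()) {
            if (c != '0' && c != '1') { // 3) input symbols are 0 and 1 only
                throw new IllegalArgumentException("not a bit: " + c);
            }
            int symbol = c - '0';
            out.append(OUTPUT[state][symbol]);
            state = NEXT_STATE[state][symbol];
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A made-up input; the even-numbered bits are 0, 1, 1, 0, 1.
        System.out.println(run("0001010001")); // XZXNXNXZXN
    }
}
```

Changing the machine's behavior only requires editing the tables; the loop in run stays the same.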

[Diagram: an FSM with unlabeled states and arcs labeled 0 / 0, 1 / 1, 0 / 2, 1 / 3, 0 / 0, 0 / 2, 1 / 1]
How many states?
Which is the start state?
What is the input alphabet?
What is the output alphabet?
Select meaningful state labels.

Spring 2010 CS
A 4-Step Strategy
1) Select the states.
2) Identify the start state.
3) Complete the Next State function.
4) Complete the Output function.