
1 Course Overview
PART I: overview material
  1 Introduction
  2 Language processors (tombstone diagrams, bootstrapping)
  3 Architecture of a compiler
PART II: inside a compiler
  4 Syntax analysis
  5 Contextual analysis
  6 Runtime organization
  7 Code generation
PART III: conclusion
  8 Interpretation
  9 Review
Supplementary material: Theoretical foundations (Finite-state machines)

2 Finite State Machines (aka Finite Automata)
An FSM is similar to a compiler in that:
– A compiler recognizes legal programs in some (source) language.
– A finite-state machine recognizes legal strings in some language.
Example: Pascal Identifiers
– sequences of one or more letters or digits, starting with a letter:
[Diagram: states S and A, with edges labeled "letter" and "letter | digit".]

3 Finite State Machines viewed as Graphs
[Diagram legend: a state; the start state; an accepting state; a transition labeled a.]

4 Finite State Machines
A transition s1 --a--> s2 is read: in state s1, on input “a”, go to state s2.
If end of input:
– If in accepting state => accept
– Otherwise => reject
If no transition is possible (got stuck) => reject

5 Language defined by FSM
The language defined by an FSM is the set of strings accepted by the FSM.
– In the language of the Pascal-identifier FSM: x, mp2, XyZzy, position27.
– Not in the language of that FSM: 123, a?, 13apples.
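The accept/reject rules can be tried out on the Pascal-identifier machine with a few lines of Python. This is only a sketch: the function name `accepts_identifier` is made up for illustration, the state names S and A follow the slide, and "letter" is assumed to mean an ASCII letter.

```python
def accepts_identifier(text: str) -> bool:
    """Simulate the two-state identifier FSM (S = start, A = accepting)."""
    state = "S"
    for ch in text:
        is_letter = ch.isascii() and ch.isalpha()   # assumption: ASCII letters only
        is_digit = ch.isascii() and ch.isdigit()
        if state == "S" and is_letter:
            state = "A"            # first character must be a letter
        elif state == "A" and (is_letter or is_digit):
            state = "A"            # later characters: letter or digit
        else:
            return False           # no transition possible: got stuck
    return state == "A"            # end of input: accept only in A


for s in ["x", "mp2", "XyZzy", "position27", "123", "a?", "13apples"]:
    print(s, accepts_identifier(s))
```

The first four strings are accepted and the last three are rejected, matching the lists on the slide.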

6 Example: Integer Literals
FSM that accepts integer literals with an optional + or - sign:
[Diagram: states S, A, and B, with edges labeled +, -, and digit.]

7 Formal Definition
Each finite state machine is a 5-tuple (Σ, Q, δ, q, F) that consists of:
– An input alphabet Σ
– A set of states Q
– A start state q
– A set of accepting states (or final states) F ⊆ Q
– A state transition function δ: Q × Σ → Q that encodes transitions state_i --input--> state_j

8 State-Transition Function for the integer-literal example:
δ(S, +) = A
δ(S, -) = A
δ(S, digit) = B
δ(A, digit) = B
δ(B, digit) = B
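One way to make the 5-tuple concrete is to write each component as a Python value. The sketch below encodes the integer-literal machine. The variable names are illustrative, and it assumes that B is the only accepting state, which the transcript does not state explicitly but which the description "integer literals with an optional sign" requires.

```python
DIGITS = "0123456789"

SIGMA = set("+-") | set(DIGITS)      # input alphabet
Q = {"S", "A", "B"}                  # set of states
START = "S"                          # start state q
F = {"B"}                            # accepting states (assumed: only B)

# Transition function as a dictionary; a missing key means "stuck".
DELTA = {("S", "+"): "A", ("S", "-"): "A"}
for d in DIGITS:                     # "digit" stands for ten separate symbols
    DELTA[("S", d)] = "B"
    DELTA[("A", d)] = "B"
    DELTA[("B", d)] = "B"


def accepts(text: str) -> bool:
    state = START
    for ch in text:
        if (state, ch) not in DELTA:
            return False             # got stuck
        state = DELTA[(state, ch)]
    return state in F                # accept only in a final state


print(accepts("+752"), accepts("42"), accepts("+"), accepts("7-2"))
```

Because the dictionary has no entry for pairs such as (B, +), this δ is a partial function; the missing entries play the role of "no transition possible".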

9 FSM Examples
Accepts strings over alphabet {0,1} that end in 1
[Diagram: states A and B, with edges labeled 0 and 1.]

10 FSM Examples
Accepts strings over alphabet {a,b} that begin and end with same symbol
[Diagram: five states numbered 1 to 5, with five edges labeled a and five edges labeled b.]

11 FSM Examples
Accepts strings over {0,1,2} such that sum of digits is a multiple of 3
[Diagram: states 0, 1, and 2, a Start marker, and edges labeled 0, 1, and 2.]
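A compact way to see why three states suffice: the state can simply be the running digit sum modulo 3. The sketch below assumes state 0 is both the start state and the only accepting state (so the empty string, with sum 0, is accepted); the function name is illustrative.

```python
def accepts_multiple_of_three(text: str) -> bool:
    state = 0                              # remainder of the digit sum so far
    for ch in text:
        if ch not in ("0", "1", "2"):
            return False                   # symbol outside the alphabet {0,1,2}
        state = (state + int(ch)) % 3      # delta(state, digit)
    return state == 0                      # accept when the sum is divisible by 3


print(accepts_multiple_of_three("12"))     # digit sum 3
print(accepts_multiple_of_three("22"))     # digit sum 4
```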

12 FSM Examples
Accepts strings over {0,1} that have an odd number of ones
[Diagram: states Even and Odd, with two edges labeled 0 and two edges labeled 1.]

13 FSM Examples
Accepts strings over {0,1} that contain the substring 001
[Diagram: states including '0', '00', and '001', with edges labeled 0, 1, and 0,1.]
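The substring machine can be reconstructed as a transition table and checked against Python's own substring test. This is a plausible reading of the diagram, not a copy of it: the state names '0', '00', and '001' come from the slide, while the name "start" and the exact edges are assumptions consistent with the edge labels that survive.

```python
# Each state records the longest useful suffix seen so far.
DELTA = {
    ("start", "0"): "0",   ("start", "1"): "start",
    ("0", "0"): "00",      ("0", "1"): "start",
    ("00", "0"): "00",     ("00", "1"): "001",
    ("001", "0"): "001",   ("001", "1"): "001",   # once found, stay accepting
}


def contains_001(text: str) -> bool:
    state = "start"
    for ch in text:
        if (state, ch) not in DELTA:
            return False                   # symbol outside {0,1}
        state = DELTA[(state, ch)]
    return state == "001"


print(contains_001("10110001"), contains_001("0101"))
```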

14 Examples
Design an FSM to recognize strings with an equal number of ones and zeros.
– Not possible (a finite number of states cannot track an unbounded difference between the two counts)
Design an FSM to recognize strings with an equal number of substrings "01" and "10".
– Perhaps surprisingly, this is possible

15 FSM Examples
Accepts strings with an equal number of substrings "01" and "10"
[Diagram: five edges labeled 0 and five edges labeled 1.]
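The reason this works is that in a binary string the substrings "01" and "10" alternate, so their counts are equal exactly when the string is empty or starts and ends with the same symbol. A five-state machine only has to remember the first and the most recent symbol. The sketch below is one such machine with made-up state names; it is not necessarily the layout drawn on the slide, although it also has ten edges.

```python
# State "x..y" means: first symbol was x, most recent symbol is y.
DELTA = {
    ("start", "0"): "0..0", ("start", "1"): "1..1",
    ("0..0", "0"): "0..0",  ("0..0", "1"): "0..1",
    ("0..1", "0"): "0..0",  ("0..1", "1"): "0..1",
    ("1..1", "1"): "1..1",  ("1..1", "0"): "1..0",
    ("1..0", "1"): "1..1",  ("1..0", "0"): "1..0",
}
ACCEPTING = {"start", "0..0", "1..1"}      # empty string, or first == last


def accepts_equal_01_10(text: str) -> bool:
    state = "start"
    for ch in text:
        if (state, ch) not in DELTA:
            return False                   # symbol outside {0,1}
        state = DELTA[(state, ch)]
    return state in ACCEPTING


# Compare with direct counting on a few strings.
for s in ["", "0", "010", "0110", "01", "1100"]:
    print(repr(s), accepts_equal_01_10(s), s.count("01") == s.count("10"))
```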

16 TEST YOURSELF
Question 1: Draw a finite-state machine that accepts Java identifiers
– one or more letters, digits, or underscores, starting with a letter or an underscore.
Question 2: Draw a finite-state machine that accepts only Java identifiers that do not end with an underscore.

17 TEST YOURSELF
[Diagram: states q0, q1, q2, and q3, with four edges labeled 0 and four edges labeled 1.]
Question 3: What strings does this FSM accept? Describe the set of accepted strings in English.

18 Two kinds of Finite State Machines
Deterministic (DFSM):
– No state has more than one outgoing edge with the same label. [All previous FSMs were DFSMs.]
Non-deterministic (NFSM):
– States may have more than one outgoing edge with the same label.
– Edges may be labeled with ε (epsilon), the empty string. [Note that some books use a different symbol.]
– The automaton can make an ε (epsilon) transition without consuming the current input character.

19 Example of NFSM
integer-literal example:
[Diagram: states S, A, and B, with edges labeled +, -, ε, and digit.]

20 Example of NFSM
Accepts strings over {0,1} that contain the substring 001
[Diagram: states including '0', '00', and '001', with edges labeled 0, 1, and 0,1.]

21 Non-deterministic finite state machines (NFSM)
– sometimes simpler than DFSM
– can be in multiple states at the same time
An NFSM accepts a string if
– there exists a sequence of moves
– starting in the start state,
– ending in a final state,
– that consumes the entire string.
Examples:
– Consider the integer-literal NFSM on input "+752"
– Consider the second NFSM on input "10110001"
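"Being in multiple states at the same time" can be simulated directly by tracking the set of states the machine could be in. The sketch below does this for both example NFSMs. The edge lists are assumptions reconstructed from the surviving diagram labels: the integer-literal machine is taken to have an ε edge from S to A (making the sign optional), and the start state of the substring machine is given the made-up name "any". The empty string `""` is used here as the label for ε edges.

```python
EPS = ""   # our own convention for labeling epsilon edges


def epsilon_closure(states, delta):
    """All states reachable from `states` using only epsilon edges."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfsm_accepts(delta, start, finals, text):
    current = epsilon_closure({start}, delta)
    for ch in text:
        moved = set()
        for q in current:
            moved |= delta.get((q, ch), set())
        current = epsilon_closure(moved, delta)
        if not current:
            return False                   # every possible run got stuck
    return bool(current & finals)          # some run ends in a final state


# Integer-literal NFSM (assumed edges: S -+-> A, S ---> A, S -eps-> A).
INT_NFSM = {("S", "+"): {"A"}, ("S", "-"): {"A"}, ("S", EPS): {"A"}}
for d in "0123456789":
    INT_NFSM[("A", d)] = {"B"}
    INT_NFSM[("B", d)] = {"B"}

# Substring-001 NFSM: "guess" where the substring begins.
SUB_NFSM = {
    ("any", "0"): {"any", "0"}, ("any", "1"): {"any"},
    ("0", "0"): {"00"},
    ("00", "1"): {"001"},
    ("001", "0"): {"001"}, ("001", "1"): {"001"},
}

print(nfsm_accepts(INT_NFSM, "S", {"B"}, "+752"))
print(nfsm_accepts(SUB_NFSM, "any", {"001"}, "10110001"))
```

Both calls report acceptance: "+752" uses the + edge, and "10110001" contains 001 starting at its sixth character.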

22 Equivalence of DFSM and NFSM
Theorem:
– For each non-deterministic finite state machine N, we can construct a deterministic finite state machine D such that N and D accept the same language.
– [proof omitted]
Theorem:
– Every deterministic finite state machine can be regarded as a non-deterministic finite state machine that just doesn’t use the extra non-deterministic capabilities.
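The slides omit the proof of the first theorem. The usual textbook argument is the subset (powerset) construction, in which each state of D is a set of states of N. The sketch below shows that construction under stated assumptions: it is not taken from the slides, all names are illustrative, ε edges are labeled with the empty string, and the example NFSM is a reconstruction of the substring-001 machine.

```python
EPS = ""   # our own convention for labeling epsilon edges


def closure(states, delta):
    """Epsilon closure, returned as a hashable frozenset."""
    result, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in result:
                result.add(r)
                stack.append(r)
    return frozenset(result)


def subset_construction(delta, start, finals, alphabet):
    """Build a DFSM whose states are sets of NFSM states."""
    d_start = closure({start}, delta)
    d_delta, d_finals = {}, set()
    todo, seen = [d_start], {d_start}
    while todo:
        subset = todo.pop()
        if subset & finals:
            d_finals.add(subset)           # contains an NFSM final state
        for ch in alphabet:
            moved = set()
            for q in subset:
                moved |= delta.get((q, ch), set())
            target = closure(moved, delta)
            if not target:
                continue                   # empty entry: the DFSM gets stuck
            d_delta[(subset, ch)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return d_delta, d_start, d_finals


def dfsm_accepts(d_delta, d_start, d_finals, text):
    state = d_start
    for ch in text:
        state = d_delta.get((state, ch))
        if state is None:
            return False
    return state in d_finals


SUB_NFSM = {
    ("any", "0"): {"any", "0"}, ("any", "1"): {"any"},
    ("0", "0"): {"00"},
    ("00", "1"): {"001"},
    ("001", "0"): {"001"}, ("001", "1"): {"001"},
}
D = subset_construction(SUB_NFSM, "any", {"001"}, "01")
print(dfsm_accepts(*D, "10110001"), dfsm_accepts(*D, "0101"))
```

Only subsets reachable from the start set are generated, so the resulting machine is usually far smaller than the full powerset of N's states.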

23 How to Implement an FSM
A table-driven approach:
Table:
– one row for each state in the machine, and
– one column for each possible character.
Table[j][k]
– which state to go to from state j on input character k,
– an empty entry corresponds to the machine getting stuck.

24 The table-driven program for a DFSM
state = S                          // S is the start state
repeat {
    k = next character from the input
    if (k == EOF) then             // end of input
        if (state is a final state) then accept
        else reject
    // accept and reject both stop the program
    state = T[state][k]            // T is the transition table
    if (state == empty) then reject    // got stuck
}
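The same loop might look as follows in Python, using the integer-literal machine as the table. This is a sketch, not the course's implementation: to keep the table small it uses one column per character class (+, -, digit) rather than one per character, and the names `column`, `run`, and `EMPTY` are made up for illustration.

```python
S, A, B = 0, 1, 2          # row numbers for the three states
FINAL = {B}                # assumed: B is the only final state
EMPTY = None               # marks an empty table entry


def column(ch):
    """Map an input character to its table column, or None if unknown."""
    if ch == "+":
        return 0
    if ch == "-":
        return 1
    if ch in "0123456789":
        return 2
    return None


T = [
    # +      -      digit
    [A,     A,     B],     # row S
    [EMPTY, EMPTY, B],     # row A
    [EMPTY, EMPTY, B],     # row B
]


def run(text: str) -> bool:
    state = S
    for ch in text:                    # the loop ends at end of input
        k = column(ch)
        if k is None:
            return False               # character outside the alphabet
        state = T[state][k]
        if state is EMPTY:
            return False               # got stuck
    return state in FINAL              # end of input: accept or reject


print(run("+752"), run("0"), run("+"), run("1+"))
```

Swapping in a different `T`, `FINAL`, and `column` runs a different DFSM without touching the loop, which is the main appeal of the table-driven approach.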

