Department of Software & Media Technology


1 Scanning, or Lexical Analysis
Regular Grammars
Non-terminals (arbitrary names)
Terminals (characters)
Productions limited to the following forms:
  Non-terminal ::= terminal
  Non-terminal ::= terminal Non-terminal
Treat a character class (e.g. digit) as a terminal
Regular grammars cannot count: a size limit on identifiers or literals can only be spelled out length by length, with one non-terminal per position
They cannot express proper nesting (parentheses)
8 January 2004 Department of Software & Media Technology

2 Regular Grammars
Grammar for real literals with no exponent:
  digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  REALVAL ::= digit REALVAL1
  REALVAL1 ::= digit REALVAL1 (arbitrary size)
  REALVAL1 ::= . INTEGERVAL
  INTEGERVAL ::= digit INTEGERVAL (arbitrary size)
  INTEGERVAL ::= digit
Start symbol is ?
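The grammar can be read almost directly as a recognizer, with one function per non-terminal. The sketch below is an illustration, not part of the slides. It assumes that the digit production for REALVAL1 recurses on REALVAL1 (so any number of digits may precede the point) and that REALVAL is the start symbol. The function names are hypothetical.

```python
DIGITS = set("0123456789")

def integerval(s, i):
    # INTEGERVAL ::= digit INTEGERVAL | digit
    if i < len(s) and s[i] in DIGITS:
        # Last character: use INTEGERVAL ::= digit; otherwise recurse.
        return i + 1 == len(s) or integerval(s, i + 1)
    return False

def realval1(s, i):
    # REALVAL1 ::= digit REALVAL1 | . INTEGERVAL
    if i < len(s) and s[i] in DIGITS:
        return realval1(s, i + 1)
    if i < len(s) and s[i] == ".":
        return integerval(s, i + 1)
    return False

def realval(s):
    # REALVAL ::= digit REALVAL1 (assumed start symbol)
    return len(s) > 0 and s[0] in DIGITS and realval1(s, 1)
```

Under these assumptions, realval("12.5") holds, while "12", "1." and ".5" are rejected because the grammar demands at least one digit on each side of the point.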

3 Regular Expressions
REs are defined by an alphabet (terminal symbols) and three operations:
  Alternation: RE1 | RE2
  Concatenation: RE1 RE2
  Repetition: RE* (zero or more REs)
The languages described by REs are exactly the languages generated by regular grammars
Regular expressions are more convenient for some applications
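As a small illustration of the three operations, the sketch below builds a pattern for unsigned numbers using Python's standard re module. The names DIGIT, INTEGER, REAL, NUMBER and is_number are chosen for this example only and do not come from the slides.

```python
import re

DIGIT = r"[0-9]"                    # a character class: 0 | 1 | ... | 9
INTEGER = DIGIT + DIGIT + "*"       # concatenation, then repetition: digit digit*
REAL = INTEGER + r"\." + INTEGER    # digit digit* . digit digit*
NUMBER = "(?:" + REAL + ")|(?:" + INTEGER + ")"   # alternation: REAL | INTEGER

def is_number(text):
    # fullmatch succeeds only if the whole string belongs to the language.
    return re.fullmatch(NUMBER, text) is not None
```

Here is_number("42") and is_number("3.14") hold, while "3." and ".5" are rejected.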

4 Finite State Machines or Finite Automata (FSM or FA)
A language defined by a grammar is a (possibly infinite) set of strings
An automaton is a computation that determines whether a given string belongs to a specified language
A finite state machine (FSM) is an automaton that recognizes regular languages (those described by regular expressions)
Simplest automaton: its only memory is a single number (the state)

5 Specifying a Finite State Machine (FSM)
A set of labeled states, with directed arcs between states, each arc labeled with a character
One or more states may be terminal (accepting)
Start is a distinguished state
The automaton makes a transition from state S1 to S2 if and only if the arc from S1 to S2 is labeled with the next character in the input
A token is legal if the automaton stops in a terminal state
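One possible encoding of such a specification is a dictionary that maps (state, character) pairs to the next state. The sketch below is an assumption made for illustration: the machine shown (binary numerals without leading zeros) and all state names are hypothetical, not taken from the slides.

```python
def accepts(arcs, start, accepting, text):
    """Run a deterministic FA; a missing arc means the input is rejected."""
    state = start
    for ch in text:
        if (state, ch) not in arcs:
            return False            # no arc labeled with the next character
        state = arcs[(state, ch)]
    return state in accepting       # legal only if we stop in a terminal state

# Hypothetical machine: "0", or "1" followed by any number of 0s and 1s.
arcs = {("start", "0"): "zero", ("start", "1"): "in_num"}
for bit in "01":
    arcs[("in_num", bit)] = "in_num"
accepting = {"zero", "in_num"}
```

With this table, accepts(arcs, "start", accepting, "101") holds, while "01" is rejected because the state zero has no outgoing arcs.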

6 FA from Grammar
One state for each non-terminal
A rule of the form Nt1 ::= terminal generates a transition from the state for Nt1 to a final state, on an arc labeled by the terminal
A rule of the form Nt1 ::= terminal Nt2 generates a transition from the state for Nt1 to the state for Nt2, on an arc labeled by the terminal
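The two rules can be sketched as a small translation routine. Everything below is illustrative: the triple encoding of productions, the extra state name FINAL and the helper names are assumptions, and the sample productions are the real-literal grammar with REALVAL1 assumed to recurse on itself. Because a non-terminal may have two productions with the same terminal, the resulting machine can be non-deterministic, so the simulation tracks a set of states.

```python
FINAL = "<final>"   # hypothetical name for the single added final state

def fa_from_grammar(productions):
    """productions: (nonterminal, terminal, next_nonterminal or None) triples."""
    arcs = {}
    for nt, terminal, nxt in productions:
        target = FINAL if nxt is None else nxt   # Nt ::= t goes to the final state
        arcs.setdefault((nt, terminal), set()).add(target)
    return arcs

def terminal_of(ch):
    # Treat the character class digit as one terminal.
    return "digit" if ch in "0123456789" else ch

def accepts(arcs, start, text):
    states = {start}
    for ch in text:
        label = terminal_of(ch)
        states = {t for s in states for t in arcs.get((s, label), ())}
    return FINAL in states

REAL_GRAMMAR = [
    ("REALVAL", "digit", "REALVAL1"),
    ("REALVAL1", "digit", "REALVAL1"),
    ("REALVAL1", ".", "INTEGERVAL"),
    ("INTEGERVAL", "digit", "INTEGERVAL"),
    ("INTEGERVAL", "digit", None),
]
real_fa = fa_from_grammar(REAL_GRAMMAR)
```

Here accepts(real_fa, "REALVAL", "12.50") holds and "12." is rejected. The state INTEGERVAL has two arcs labeled digit, one back to itself and one to the final state.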

7 Graphic representation of FA
[Figure: FA diagram for identifier; arc labels: letter, digit, underscore]

8 FA from RE
Each RE corresponds to a grammar
For every RE, a natural translation to an FSM exists
Alternation often leads to non-deterministic machines

9 Deterministic Finite Automata (DFA)
For all states S and all characters C, there is at most one arc from S that is labeled with C
Easier to implement
No backtracking
Conventions for DFA diagrams:
  Error transitions are not explicitly shown
  Input symbols that result in the same transition are grouped together (this set can even be given a name)
  Still not displayed: stopping conditions and actions
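The determinism condition is easy to check mechanically. The sketch below assumes arcs are given as (source, label, target) triples, where a label may be a named character class such as letter or digit. The two sample machines and their state names are hypothetical.

```python
from collections import Counter

def is_deterministic(arcs):
    """True if no state has two arcs with the same label to distinct targets."""
    # set() drops exact duplicates, so only distinct targets are counted.
    counts = Counter((src, label) for src, label, _ in set(arcs))
    return all(n == 1 for n in counts.values())

identifier_arcs = [
    ("start", "letter", "in_id"),
    ("in_id", "letter", "in_id"),
    ("in_id", "digit", "in_id"),
]
number_arcs = [
    ("start", "digit", "in_int"),
    ("start", "digit", "in_real"),   # same state, same label, other target
]
```

Here is_deterministic(identifier_arcs) holds, while number_arcs fails the check because start has two arcs labeled digit.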

10 Non-Deterministic Finite Automata (NFA)
A non-deterministic FA has at least one state with two arcs, labeled with the same character, leading to two distinct states
Example: from the start state, a digit can begin an integer literal or a real literal
Direct implementation requires backtracking
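A backtracking simulation of the int-or-real example might look like the sketch below. The state names (in_int, in_whole, point, in_frac) and the exact shape of the machine are assumptions made for illustration. Each (state, character) pair maps to a list of candidate targets that are tried in order.

```python
def accepts(arcs, accepting, state, text, pos=0):
    """Depth-first search over the choices of a non-deterministic FA."""
    if pos == len(text):
        return state in accepting
    for target in arcs.get((state, text[pos]), ()):
        if accepts(arcs, accepting, target, text, pos + 1):
            return True
    return False    # every choice failed: the caller backtracks

arcs = {}
for d in "0123456789":
    arcs[("start", d)] = ["in_int", "in_whole"]   # the non-deterministic choice
    arcs[("in_int", d)] = ["in_int"]
    arcs[("in_whole", d)] = ["in_whole"]
    arcs[("point", d)] = ["in_frac"]
    arcs[("in_frac", d)] = ["in_frac"]
arcs[("in_whole", ".")] = ["point"]
accepting = {"in_int", "in_frac"}
```

On the input "1.5" the search first guesses in_int, gets stuck on the point, then backs up and succeeds through in_whole.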

11 Lookahead & Backtracking in NFA
[Figure: state diagram with states start, in_id and finish; arc labels letter, digit and [other]; action return id]

12 Implementation of FA
[Figure: state diagram with states start, in_id and finish; arc labels letter, digit and [other]; action return id]
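A hand-coded scanner for this machine could be sketched as follows. The sketch assumes the diagram is the usual identifier automaton: start moves to in_id on a letter, in_id loops on a letter or digit, and any other character leads to finish, where the identifier is returned. The function name and the restriction to ASCII letters are choices made for this example.

```python
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)

def scan_identifier(text, pos=0):
    """Return (lexeme, next_pos), or None if no identifier starts at pos."""
    state = "start"
    i = pos
    while state != "finish":
        ch = text[i] if i < len(text) else ""   # "" stands for end of input
        if state == "start":
            if ch not in LETTERS:
                return None                     # error transition
            state = "in_id"
            i += 1
        elif ch in LETTERS or ch in DIGITS:     # state is in_id
            i += 1
        else:
            state = "finish"    # [other]: the lookahead character is not consumed
    return text[pos:i], i
```

For example, scan_identifier("x1 = 2") returns ("x1", 2). The space that ended the identifier is left in the input for the next token.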

13 From RE to DFA & RE to NFA
[Figure: state diagram with states start, in_id and finish; arc labels letter, digit and [other]; action return id]

14 NFA to DFA
There is an algorithm for converting a non-deterministic machine to a deterministic one
The result may have exponentially more states
Intuitively: new states are needed to express uncertainty about the token: int or real
Other algorithms exist for minimizing the number of states of an FSM, for showing equivalence, etc.
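The slide does not say which conversion algorithm it has in mind. One standard choice is the subset construction, sketched below for an NFA without epsilon arcs. Each DFA state is a set of NFA states, which is how the uncertainty "int or real" is represented. The sample machine and its state names are assumptions made for illustration.

```python
def nfa_to_dfa(arcs, start, accepting, alphabet):
    """arcs maps (state, char) to a set of NFA states; returns a DFA over sets."""
    start_set = frozenset([start])
    dfa_arcs = {}
    seen = {start_set}
    work = [start_set]
    while work:
        current = work.pop()
        for ch in alphabet:
            target = frozenset(t for s in current for t in arcs.get((s, ch), ()))
            if not target:
                continue                  # error transition, left implicit
            dfa_arcs[(current, ch)] = target
            if target not in seen:
                seen.add(target)
                work.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return dfa_arcs, start_set, dfa_accepting

def run_dfa(dfa_arcs, start_set, dfa_accepting, text):
    state = start_set
    for ch in text:
        state = dfa_arcs.get((state, ch))
        if state is None:
            return False
    return state in dfa_accepting

DIGITS = "0123456789"
nfa = {("in_whole", "."): {"point"}}
for d in DIGITS:
    nfa[("start", d)] = {"in_int", "in_whole"}   # int or real: not yet known
    nfa[("in_int", d)] = {"in_int"}
    nfa[("in_whole", d)] = {"in_whole"}
    nfa[("point", d)] = {"in_frac"}
    nfa[("in_frac", d)] = {"in_frac"}
dfa = nfa_to_dfa(nfa, "start", {"in_int", "in_frac"}, DIGITS + ".")
```

In this small case the DFA needs no backtracking: after the first digit it sits in the combined state {in_int, in_whole} until a point resolves the question.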

15 Example DFA

16 Another view of the same DFA

17 Yet another view of the same DFA

18 State Minimization in DFA

19 TINY DFA

20 Lex for Scanner
Lex Conventions for RE
Format of a Lex Input File

