Regular Languages, Regular Expressions, Finite-State Automata. Torbjörn Lager, Stockholm University.

1 Regular Languages, Regular Expressions, Finite-State Automata. Torbjörn Lager, Stockholm University

2 FST - Torbjörn Lager, UU
Languages
- Sets of strings
- Example: {“ac” “abc” “abbc” “abbbc” ...}

3 Finite-State Automata
- Directed graphs consisting of states, and arcs labelled with symbols. For example:
[Figure: automaton with states 0, 1 and 2 and arcs labelled a, b and c]
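A minimal sketch of such a graph as a Python transition table. The arc layout is an assumption (the diagram itself did not survive in this transcript); it is chosen so that the automaton accepts the example language from slide 2. The names TRANSITIONS, START, FINALS and accepts are illustrative only.

```python
# Assumed layout: 0 is the start state, 2 is the only final state.
# Arcs: 0 -a-> 1, 1 -b-> 1 (loop), 1 -c-> 2.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
START = 0
FINALS = {2}


def accepts(string):
    """Follow one arc per symbol; accept if the walk ends in a final state."""
    state = START
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no arc for this symbol: reject
            return False
    return state in FINALS


print(accepts("abbc"))  # True
print(accepts("ab"))    # False
```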

4 Regular expressions
- Example: [a b* c]
- Note:
  - Atomic symbols (e.g. ‘a’) denote languages (e.g. {“a”})
  - Operations (here: concatenation and Kleene-star) are operations over languages
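The note can be made concrete by treating languages as Python sets of strings. This is only a sketch: Kleene-star denotes an infinite language, so the hypothetical star helper below is cut off after a fixed number of repetitions.

```python
def concat(A, B):
    """Concatenation of languages: each string of A followed by each string of B."""
    return {x + y for x in A for y in B}


def star(A, max_repeats):
    """Kleene-star, truncated to at most max_repeats repetitions to stay finite."""
    result = {""}   # zero repetitions gives the empty string
    layer = {""}
    for _ in range(max_repeats):
        layer = concat(layer, A)
        result |= layer
    return result


a, b, c = {"a"}, {"b"}, {"c"}   # atomic symbols denote one-string languages
language = concat(concat(a, star(b, 3)), c)   # [a b* c], with at most three b's
print(sorted(language, key=len))  # ['ac', 'abc', 'abbc', 'abbbc']
```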

5 Regular expressions, regular languages and automata
- Regular expressions compile into finite-state automata
- Regular expressions denote regular languages
- Finite-state automata generate and accept regular languages

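6 Regular expressions, regular languages and automata
- The regular expression [a b* c] compiles into an automaton with states 0, 1 and 2 and arcs labelled a, b and c
- The regular expression [a b* c] denotes the language {“ac” “abc” “abbc” ...}
- The automaton generates and accepts the language {“ac” “abc” “abbc” ...}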

7 Regular expression operators
- Concatenation: A B
- Union: A | B
- Iteration (Kleene-star): A*
- Difference: A - B
- Intersection: A & B
- Grouping of expressions: [A]

8 Examples
- a b                {“ab”}
- a [b|c]            {“ab” “ac”}
- a* [b|c]*          {“” “a” “ab” ...}
- [a|b] & [b|c]      {“b”}
- [a|b] - b          {“a”}
- [a|b] - [b|a]      {}
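These examples can be checked roughly in Python. The sketch below is not the notation of the slides: it uses Python's re syntax (which has no intersection or difference operators) and emulates those two operators with set operations over all strings up to a small length. ALPHABET, MAX_LEN, UNIVERSE and lang are assumed names.

```python
import re
from itertools import product

ALPHABET = "abc"
MAX_LEN = 2
# Every string over ALPHABET that is at most MAX_LEN symbols long.
UNIVERSE = {"".join(p) for n in range(MAX_LEN + 1)
            for p in product(ALPHABET, repeat=n)}


def lang(pattern):
    """Strings of UNIVERSE that a Python regular expression matches in full."""
    return {s for s in UNIVERSE if re.fullmatch(pattern, s)}


print(sorted(lang("ab")))                      # ['ab']
print(sorted(lang("a(b|c)")))                  # ['ab', 'ac']
print(sorted(lang("(a|b)") & lang("(b|c)")))   # ['b']
print(sorted(lang("(a|b)") - lang("b")))       # ['a']
print(sorted(lang("(a|b)") - lang("(b|a)")))   # []
```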

9 Special symbols
- ?            The any symbol
- ?*           The universal language
- []           The empty-string language
- 0 (or “”)    The empty string (epsilon)

10 Regular expression operators
- Optionality: (A)
- Kleene-plus: A+
- Complement: ~A
- Containment: $A
- Restriction: A => B _ C

11 Examples
- a (b) c        {“ac” “abc”}
- a b+ c         {“abc” “abbc” “abbbc” ...}
- ~[a b c]       {“” ... “a” ... “ab” ... “abca” ...}
- $[a|b]         {“a” ... “abba” ... “abcd” ...}
- b => a _ c     {“” ... “a” ... “ccc” ... “abc” ...}
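A rough way to see what complement, containment and restriction denote is to filter a bounded universe of strings. The helpers below are hypothetical and simplified: the universe stops at length 4, and restrict only handles single-symbol arguments, which is enough for the example b => a _ c.

```python
from itertools import product

ALPHABET = "abcd"
# All strings over ALPHABET of length 0 to 4.
UNIVERSE = {"".join(p) for n in range(5) for p in product(ALPHABET, repeat=n)}


def complement(A):
    """~A: every string of the bounded universe that is not in A."""
    return UNIVERSE - A


def contains(A):
    """$A: strings that have at least one substring in A."""
    return {s for s in UNIVERSE
            if any(s[i:j] in A
                   for i in range(len(s) + 1)
                   for j in range(i, len(s) + 1))}


def restrict(sym, left, right):
    """sym => left _ right, for single symbols only: every occurrence of sym
    is immediately preceded by left and immediately followed by right."""
    return {s for s in UNIVERSE
            if all(s[i - 1:i] == left and s[i + 1:i + 2] == right
                   for i, ch in enumerate(s) if ch == sym)}


print("abca" in complement({"abc"}))     # True
print("abcd" in contains({"a", "b"}))    # True
print("abc" in restrict("b", "a", "c"))  # True
print("ab" in restrict("b", "a", "c"))   # False
```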

12 Component technologies in FST
- Word lists and lexica
- Tokenisers
- Morphological analysers
- Part-of-speech taggers
- Parsers

13 Applications of FST
- Named-entity recognition
- Information extraction
- Corpus linguistics
- Spelling and grammar checking
- Speech processing applications

14 The Xerox Finite-State Tool
- Compiles extended regular expressions into finite-state machines (automata and transducers)
- Allows the user to display, examine and modify the machines

15 Non-deterministic FSAs
- At least one state has more than one transition leading from it labelled with the same symbol
[Figure: automaton with states 0, 1 and 2 and arcs labelled a, b and b]

16 Determinization of FSAs
- Any non-deterministic FSA can be transformed into an equivalent deterministic FSA.
- Example: [Figure: a non-deterministic automaton and an equivalent deterministic automaton, each with states 0, 1 and 2 and arcs labelled a, b and b]
- Determinize for efficiency!
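One standard way to carry out this transformation is the subset construction, sketched below for automata without epsilon transitions. The example NFA is an assumption, since the slide's diagram is not recoverable here: state 1 has two arcs labelled b, which makes it non-deterministic. All names are illustrative.

```python
from collections import deque

# Hypothetical NFA: 0 -a-> 1, 1 -b-> 1, 1 -b-> 2; state 2 is final.
NFA = {
    (0, "a"): {1},
    (1, "b"): {1, 2},
}
NFA_START, NFA_FINALS = 0, {2}


def determinize(nfa, start, finals):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    symbols = {sym for (_, sym) in nfa}
    dfa_start = frozenset({start})
    dfa, seen, queue = {}, {dfa_start}, deque([dfa_start])
    while queue:
        current = queue.popleft()
        for sym in symbols:
            target = frozenset(t for s in current for t in nfa.get((s, sym), ()))
            if not target:      # no arc at all: leave the transition undefined
                continue
            dfa[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is final if it contains at least one final NFA state.
    dfa_finals = {state for state in seen if state & finals}
    return dfa, dfa_start, dfa_finals


def dfa_accepts(dfa, start, finals, string):
    """Deterministic run: exactly one candidate arc per symbol."""
    state = start
    for sym in string:
        state = dfa.get((state, sym))
        if state is None:
            return False
    return state in finals


dfa, dfa_start, dfa_finals = determinize(NFA, NFA_START, NFA_FINALS)
print(dfa_accepts(dfa, dfa_start, dfa_finals, "abbb"))  # True
print(dfa_accepts(dfa, dfa_start, dfa_finals, "a"))     # False
```

In the worst case the subset construction can produce exponentially many states, so the speed gained at lookup time may be paid for in space.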

17 Minimization of FSAs
- Any (deterministic) FSA can be transformed into an equivalent FSA that has a minimal number of states.
- Minimize for space!
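A sketch of one textbook approach, Moore-style partition refinement, follows. It is not necessarily the algorithm any particular tool uses, and it assumes a complete DFA (an arc for every state and symbol) in which every state is reachable. The example automaton is made up: it accepts a b+ and has two final states that behave identically.

```python
def minimize(states, alphabet, delta, start, finals):
    """Merge states that no input string can tell apart."""
    # Initial partition: final versus non-final states.
    block_of = {s: (s in finals) for s in states}
    while True:
        # Signature: the state's own block plus the blocks its arcs lead to.
        signature = {s: (block_of[s],) + tuple(block_of[delta[(s, a)]] for a in alphabet)
                     for s in states}
        if len(set(signature.values())) == len(set(block_of.values())):
            break               # no block was split: the partition is stable
        block_of = signature
    ids = {}
    for s in sorted(states):    # give each block a small integer name
        ids.setdefault(block_of[s], len(ids))
    new_delta = {(ids[block_of[s]], a): ids[block_of[delta[(s, a)]]]
                 for s in states for a in alphabet}
    new_finals = {ids[block_of[s]] for s in finals}
    return set(ids.values()), alphabet, new_delta, ids[block_of[start]], new_finals


def run(delta, start, finals, string):
    state = start
    for symbol in string:
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical complete DFA for a b+; state 4 is a sink, states 2 and 3 are redundant.
STATES = {0, 1, 2, 3, 4}
DELTA = {(0, "a"): 1, (0, "b"): 4, (1, "a"): 4, (1, "b"): 2,
         (2, "a"): 4, (2, "b"): 3, (3, "a"): 4, (3, "b"): 2,
         (4, "a"): 4, (4, "b"): 4}
m_states, _, m_delta, m_start, m_finals = minimize(STATES, "ab", DELTA, 0, {2, 3})
print(len(STATES), "->", len(m_states))  # 5 -> 4
```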

18 Representing word lists
- Think of a word list as a regular language
- Use the calculus of regular expressions to query and update the word list
- Determinize for speed!
- Minimize for space!
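As a small illustration, a word list can be stored as a trie, which is a deterministic automaton with one path per word. In this sketch the updates are done on a plain Python set (union to add, difference to remove) before the automaton is built; the words and the helper names build_trie and lookup are made up for the example.

```python
def build_trie(words):
    """Build a deterministic automaton (a trie) from a word list.
    States are integers and state 0 is the start state."""
    delta, finals, next_state = {}, set(), 1
    for word in words:
        state = 0
        for ch in word:
            if (state, ch) not in delta:   # add an arc only when it is missing
                delta[(state, ch)] = next_state
                next_state += 1
            state = delta[(state, ch)]
        finals.add(state)                  # the end of a word is a final state
    return delta, finals


def lookup(delta, finals, word):
    """One table lookup per character: time is linear in the word length."""
    state = 0
    for ch in word:
        state = delta.get((state, ch))
        if state is None:
            return False
    return state in finals


words = {"walk", "walks", "talk", "talks"}
words |= {"walked"}    # update: union with a new word
words -= {"talks"}     # update: difference removes a word
delta, finals = build_trie(sorted(words))
print(lookup(delta, finals, "walked"))  # True
print(lookup(delta, finals, "talks"))   # False
```

A trie shares prefixes but not suffixes, so it is deterministic without being minimal; minimization would additionally merge the common endings.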

19 Various equivalences
- (A) = A | []
- A+ = A A*
- A+ = A* - [] (provided A does not contain the empty string)
- A - B = A & ~B
- ~A = ?* - A
- $A = ?* A ?*
- ~[A | B] = ~A & ~B
- ~[A & B] = ~A | ~B
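Some of these equivalences can be spot-checked on languages truncated to a maximum string length. The sketch below is a sanity check on sample languages, not a proof, and all names in it are assumptions. It also shows that A+ = A* - [] depends on A not containing the empty string.

```python
from itertools import product

ALPHABET = "ab"
MAX_LEN = 4
UNIVERSE = {"".join(p) for n in range(MAX_LEN + 1)
            for p in product(ALPHABET, repeat=n)}


def concat(A, B):
    """Concatenation, keeping only strings that fit in the bounded universe."""
    return {x + y for x in A for y in B if len(x + y) <= MAX_LEN}


def star(A):
    """A*, restricted to strings of length at most MAX_LEN."""
    result, layer = {""}, {""}
    for _ in range(MAX_LEN):
        layer = concat(layer, A)
        result |= layer
    return result


def plus(A):
    return concat(A, star(A))          # A+ = A A*


def complement(A):
    return UNIVERSE - A


A, B = {"a", "ab"}, {"ab", "b"}
print(plus(A) == star(A) - {""})                           # True
print(A - B == A & complement(B))                          # True
print(complement(A | B) == complement(A) & complement(B))  # True
print(complement(A & B) == complement(A) | complement(B))  # True

A_opt = A | {""}                                           # (A) = A | []
print(plus(A_opt) == star(A_opt) - {""})                   # False: "" is in A_opt+
```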

20 Various equivalences
- A - A = ~[?*]
- A | ~[?*] = A
- A [] = A
- A ~[?*] = ~[?*]
- A & ?* = A
- A | ?* = ?*

21 Important theoretical results
- Kleene’s theorem (concerning FSAs)
- Closure properties of regular languages and regular relations
- Decidability

22 FSAs and regular expressions
- Kleene’s theorem: Any language recognised by an FSA is denoted by a regular expression, and any language denoted by a regular expression can be recognised by an FSA.

23 Regex to FSA to regex
[Figure: conversions among regular expressions, nondeterministic FSAs with epsilon transitions, nondeterministic FSAs and deterministic FSAs. Picture adapted from Hopcroft & Ullman 1979]

24 From regular expressions to finite-state automata
- The only really necessary operators:
  - Disjunction
  - Concatenation
  - Iteration
- Sidenote: Compare regular grammars:
  - A --> x B
  - A --> x
  (where A and B are nonterminals, and where x is a sequence of terminals)
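One well-known way to compile these three operators into an automaton is a Thompson-style construction, which glues small machines together with epsilon arcs. The sketch below assumes a machine is a (start, final, arcs) triple with None as the epsilon label; it is an illustration of the idea rather than the method of any specific tool, and each machine should be used only once when building a larger one.

```python
from itertools import count

_fresh = count()   # source of new state numbers


def symbol(ch):
    s, f = next(_fresh), next(_fresh)
    return s, f, [(s, ch, f)]


def concat(m1, m2):
    s1, f1, arcs1 = m1
    s2, f2, arcs2 = m2
    return s1, f2, arcs1 + arcs2 + [(f1, None, s2)]


def union(m1, m2):
    s1, f1, arcs1 = m1
    s2, f2, arcs2 = m2
    s, f = next(_fresh), next(_fresh)
    return s, f, arcs1 + arcs2 + [(s, None, s1), (s, None, s2),
                                  (f1, None, f), (f2, None, f)]


def star(m):
    s1, f1, arcs1 = m
    s, f = next(_fresh), next(_fresh)
    return s, f, arcs1 + [(s, None, s1), (f1, None, s1),
                          (s, None, f), (f1, None, f)]


def closure(states, arcs):
    """All states reachable from `states` through epsilon arcs alone."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in arcs:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(machine, string):
    start, final, arcs = machine
    current = closure({start}, arcs)
    for ch in string:
        moved = {dst for src, label, dst in arcs if src in current and label == ch}
        current = closure(moved, arcs)
    return final in current


abc_machine = concat(concat(symbol("a"), star(symbol("b"))), symbol("c"))  # [a b* c]
ab_star = star(union(symbol("a"), symbol("b")))                             # [a|b]*
print(accepts(abc_machine, "abbc"))  # True
print(accepts(ab_star, "abc"))       # False
```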

25 Closure properties of regular languages
- A set is said to be closed under an operation iff applying the operation to members of the set will never take us outside the set
- Example: if A and B are regular languages, then [A|B] is always regular. Therefore regular languages are closed under union.
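One constructive argument for closure under union is the product construction: run two deterministic automata in parallel and accept when either one accepts. The sketch below assumes complete DFAs given as (delta, start, finals) triples over a shared alphabet; the two example automata are invented for the illustration.

```python
def union_dfa(d1, d2, alphabet):
    """Product construction: states are pairs, final if either component is final."""
    delta1, start1, finals1 = d1
    delta2, start2, finals2 = d2
    start = (start1, start2)
    delta, finals, todo, seen = {}, set(), [start], {start}
    while todo:
        state = todo.pop()
        p, q = state
        if p in finals1 or q in finals2:
            finals.add(state)
        for a in alphabet:
            target = (delta1[(p, a)], delta2[(q, a)])
            delta[(state, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return delta, start, finals


def run(dfa, string):
    delta, state, finals = dfa
    for a in string:
        state = delta[(state, a)]
    return state in finals


# Hypothetical examples over the alphabet {a, b}.
ENDS_IN_A = ({(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}, 0, {1})
EVEN_LENGTH = ({(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}, 0, {0})
EITHER = union_dfa(ENDS_IN_A, EVEN_LENGTH, "ab")
print(run(EITHER, "a"))    # True: ends in a
print(run(EITHER, "bb"))   # True: even length
print(run(EITHER, "bab"))  # False: neither
```

Changing the acceptance test from "or" to "and" gives intersection in the same way.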

26 Decidability
- Given one automaton A:
  - Is the string S a string in L(A)?
  - Does L(A) contain any strings at all?
  - Is L(A) equivalent to ?*?
- Given two automata A1 and A2:
  - Is L(A1) a subset of L(A2)?
  - Are L(A1) and L(A2) equivalent?
  - Do L(A1) and L(A2) overlap?
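For complete deterministic automata, each of these questions reduces to a search over reachable states, or over reachable pairs of states. The sketch below assumes DFAs given as (delta, start, finals) triples over a shared alphabet, with finals as a set; the function names and the example automata are hypothetical.

```python
def reachable(delta, start, alphabet):
    seen, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            t = delta[(q, a)]
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def member(dfa, string):
    """Is the string in L(A)? The string must use symbols of the alphabet only."""
    delta, q, finals = dfa
    for a in string:
        q = delta[(q, a)]
    return q in finals


def is_empty(dfa, alphabet):
    """L(A) is empty iff no final state is reachable."""
    delta, start, finals = dfa
    return not (reachable(delta, start, alphabet) & finals)


def is_universal(dfa, alphabet):
    """L(A) = ?* iff every reachable state is final."""
    delta, start, finals = dfa
    return reachable(delta, start, alphabet) <= finals


def pairs(d1, d2, alphabet):
    """Reachable state pairs when both automata read the same input."""
    seen, todo = {(d1[1], d2[1])}, [(d1[1], d2[1])]
    while todo:
        p, q = todo.pop()
        for a in alphabet:
            t = (d1[0][(p, a)], d2[0][(q, a)])
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def is_subset(d1, d2, alphabet):
    # Fails iff some input is accepted by A1 but rejected by A2.
    return not any(p in d1[2] and q not in d2[2] for p, q in pairs(d1, d2, alphabet))


def equivalent(d1, d2, alphabet):
    return is_subset(d1, d2, alphabet) and is_subset(d2, d1, alphabet)


def overlap(d1, d2, alphabet):
    return any(p in d1[2] and q in d2[2] for p, q in pairs(d1, d2, alphabet))


ENDS_IN_A = ({(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}, 0, {1})
EVEN_LENGTH = ({(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}, 0, {0})
ALL = ({(0, "a"): 0, (0, "b"): 0}, 0, {0})
NONE = ({(0, "a"): 0, (0, "b"): 0}, 0, set())
print(is_subset(ENDS_IN_A, ALL, "ab"))        # True
print(overlap(ENDS_IN_A, EVEN_LENGTH, "ab"))  # True, e.g. "ba"
```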
