Download presentation

Published byKole Turtle Modified over 4 years ago

1
**Regular Languages Regular Expressions Finite-State Automata**

Torbjörn Lager, Stockholm University Regular Languages Regular Expressions Finite-State Automata Välkommen. Fokus på verktyg? Nej! Inte så mycket morfologi

2
**Languages Sets of strings Example: {“ac” “abc” “abbc” “abbbc” ...}**

FST - Torbjörn Lager, UU

3
**Finite-State Automata**

Directed graphs consisting of states, and arcs labelled with symbols. For example: 2 1 a c b FST - Torbjörn Lager, UU

4
**Regular expressions Example: [a b* c] Note:**

Atomic symbols (e.g. ‘a’) denote languages (e.g. {“a”}) Operations (here: concatenation and kleene-star) are operations over languages [a b* c] FST - Torbjörn Lager, UU

5
**Regular expressions, regular languages and automata**

denote compile into Regular languages Finite-state automata generate accept FST - Torbjörn Lager, UU

6
**Regular expressions, regular languages and automata**

[a b* c] denotes compiles into b {“ac” “abc” “abbc” ...} a c 1 2 generates accepts FST - Torbjörn Lager, UU

7
**Regular expression operators**

Concatenation A B Union A | B Iteration (Kleene-star) A* Difference A - B Intersection A & B Grouping of expressions [A] FST - Torbjörn Lager, UU

8
**Examples a b {“ab”} a [b|c] {“ab” “ac”} a* [b|c]* {“” “a” “ab” ... }**

[a|b] & [b|c] {“b”} [a|b] - b {“a”} [a|b] - [b|a] {} FST - Torbjörn Lager, UU

9
**Special symbols ? The any symbol ?* The universal language**

[] The empty-string language 0 (or “”) The empty string (epsilon) FST - Torbjörn Lager, UU

10
**Regular expression operators**

Optionality (A) Kleene-plus A+ Complement ~A Containment $A Restriction A => B _ C FST - Torbjörn Lager, UU

11
**Examples a (b) c {“ac” “abc”} a b+ c {“abc” “abbc” “abbbc” ...}**

~[a b c] {“” ... “a” ... “ab” ... “abca” ..} $[a|b] {“a” ... “abba” ... “abcd” ...} b => a _ c {“” ... “a” ... “ccc” ... “abc” ..} FST - Torbjörn Lager, UU

12
**Component technologies in FST**

Word lists and lexica Tokenisers Morphological analysers Part-of-speech taggers Parsers detta är sånt som vi ska göra under laborationer FST - Torbjörn Lager, UU

13
**Applications of FST Named-entity recognition Information extraction**

Corpus linguistics Spelling- and grammar checking Speech processing applications FST - Torbjörn Lager, UU

14
**The Xerox Finite-State Tool**

Compiles extended regular expressions into finite-state machines (automata and transducers) Allows the user to display, examine and modify the machines FST - Torbjörn Lager, UU

15
**Non-deterministic FSAs**

At least one state has more than one transition leading from it labelled with the same symbol b a b 1 2 FST - Torbjörn Lager, UU

16
**Determinization of FSAs**

Any non-deterministic FSA can be transformed into an equivalent deterministic FSA. Example: Determinize for efficiency! b b a b a b 1 2 1 2 FST - Torbjörn Lager, UU

17
Minimization of FSAs Any (deterministic) FSA can be transformed into an equivalent FSA that has a minimal number of states. Minimize for space! Storleken på automaten växer i värsta fall linjärt med storleken på språket FST - Torbjörn Lager, UU

18
**Representing word lists**

Think of a word list as a regular language Use the calculus of regular expressions to query and update the wordlist Determinize for speed! Minimize for space! FST - Torbjörn Lager, UU

19
**Various equivalences (A) = A|[] A+ = A A* A+ = A* - [] A - B = A & ~B**

De två sista är De-morgans lagar! FST - Torbjörn Lager, UU

20
**Various equivalences A - A = ~[?*] A | ~[?*] = A A [] = A**

FST - Torbjörn Lager, UU

21
**Important theoretical results**

Kleene’s theorem (concerning FSAs) Closure properties of regular languages and regular relations Decidability FST - Torbjörn Lager, UU

22
**FSAs and regular expressions**

Kleene’s theorem: Any language recognised by an FSA is denoted by a regular expression and any language denoted by a regular expression can be recognised by a FSA. FST - Torbjörn Lager, UU

23
**Nondeterministic FSA with epsilon transitions**

Regex to FSA to regex Nondeterministic FSA Deterministic FSA Nondeterministic FSA with epsilon transitions Regular expressions Picture adapted from Hopcroft & Ullman 1979 FST - Torbjörn Lager, UU

24
**From regular expressions to finite-state automata**

The only really necessary operators: Disjunction Concatenation Iteration Sidenote: Compare regular grammars: A --> x B A --> x (where A and B are nonterminals, and where x is a sequence of terminals) FST - Torbjörn Lager, UU

25
**Closure properties of regular languages**

A set is said to be closed under an operation iff applying the operation to members of the set will never take us outside the set Example: if A and B are regular languages, then [A|B] is always regular. Therefore regular languages are closed under union. FST - Torbjörn Lager, UU

26
**Decidability Given one automaton A: Given two automata A1 and A2:**

Is the string S a string in L(A) ? Does L(A) contain any strings at all ? Is L(A) equivalent to ?* ? Given two automata A1 and A2: Is L(A1) a subset of L(A2) ? Are L(A1) and L(A2) equivalent ? Do L(A1) and L(A2) overlap ? FST - Torbjörn Lager, UU

27
FST - Torbjörn Lager, UU

Similar presentations

© 2019 SlidePlayer.com Inc.

All rights reserved.

To make this website work, we log user data and share it with processors. To use this website, you must agree to our Privacy Policy, including cookie policy.

Ads by Google