Presentation is loading. Please wait.

Presentation is loading. Please wait.

LING 438/538 Computational Linguistics Sandiway Fong Lecture 5: 9/5.

Similar presentations


Presentation on theme: "LING 438/538 Computational Linguistics Sandiway Fong Lecture 5: 9/5."— Presentation transcript:

1 LING 438/538 Computational Linguistics Sandiway Fong Lecture 5: 9/5

2 Administrivia Homework 1 –due today by midnight your answers should come in one file your name should appear at the top acceptable formats: –.txt (plain text),.doc (Microsoft Word),.pdf (Adobe) –I will send out acknowledgment of receipt of homework tonight –I posted the following on the listserv see also Prolog Computation Summary slide for LING 388 Lecture 4 http://dingo.sbs.arizona. edu/~sandiway/ling388p /lecture4.pdf

3 So far... A quick introduction to Prolog –a new style of writing definitions that are also computer programs Concepts –database DB (assert facts, closed world assumption) –Prolog “if” rule (:-) –recursion (base case, recursive case) –atoms, variables, lists (comma-delimited, head/tail) –negation (“failure to prove”, no negative facts in DB) –computation tree (matching the DB, finite and infinite)

4 Today’s Topic Grammar rules –builtin Prolog support for writing grammars

5 Road Map This week –grammar rules –regular grammars Reading assignment –(looking ahead) –textbook –Chapter 2: Regular Expressions and Automata –pg. 22–56

6 Chomsky Hierarchy division of grammar into subclasses partitioned by “ generative power/capacity ” Type-0 General rewrite rules –Turing-complete, powerful enough to encode any computer program –can simulate a Turing machine –anything that ’ s “ computable ” can be simulated using a Turing machine –Type-1 Context-sensitive rules –weaker, but still very power – a n b n c n Type-2 Context-free rules weaker still a n b n Pushdown Automata (PDA) –Type-3 Regular grammar rules –very restricted – Regular Expressions a + b + – Finite State Automata (FSA) Turing machine: artist’s conception from Wikipedia tape read/write head finite state machine

7 Chomsky Hierarchy Regular Grammars FSA Regular Expressions DCG = Type-0 Type-3 Type-2 Type-1

8 Prolog Grammar Rule System known as “Definite Clause Grammars” (DCG) –based on type-2 (context-free grammars) –but with extensions –powerful enough to encode the hierarchy all the way up to type-0 –we’ll start with the bottom of the hierarchy i.e. the least powerful regular grammars (type-3)

9 DefiniteClause Grammars (DCG) some facts –built-in feature not an accident: Prolog was originally designed to (also) support natural language processing –notation resembles grammar rules used by linguists and computer scientists most importantly –uses the same Prolog computation rule –(you already know from homework 1) –match the database –from 1st rule on down...

10 DefiniteClause Grammars (DCG) background –a “typical” formal grammar contains 4 things – a set of non-terminal symbols (N) –appear on the left-hand-side (LHS) of production rules –these symbols will be expanded by the rules a set of terminal symbols (T) –appear on the right-hand-side (RHS) of production rules –consequently, these symbols cannot be expanded production rules (P) of the form –LHS  RHS –In regular and CF grammars, LHS must be a single non-terminal symbol –RHS: a sequence of terminal and non-terminal symbols: possibly with restrictions, e.g. for regular grammars a designated start symbol (S) –a non-terminal to start the derivation

11 DefiniteClause Grammars (DCG) Background a “typical” formal grammar contains 4 things –a set of non-terminal symbols (N) –a set of terminal symbols (T) –production rules (P) of the form LHS  RHS –a designated start symbol (S) Example S  aB B  aB B  bC B  b C  bC C  b Notes: Start symbol: S Non-terminals: {S,B,C} (uppercase letters) Terminals: {a,b} (lowercase letters)

12 DefiniteClause Grammars (DCG) Example –Formal grammarProlog format –S  aB s --> [a],b. –B  aB b --> [a],b. –B  bC b --> [b],c. –B  b b --> [b]. –C  bC c --> [b],c. –C  b c --> [b]. Notes: –Start symbol: S –Non-terminals: {S,B,C} (uppercase letters) –Terminals: {a,b} (lowercase letters) Prolog format: both terminals and non- terminal symbols begin with lowercase letters –Variables begin with an uppercase letter (or underscore) --> is the rewrite symbol terminals are enclosed in square brackets (list notation) nonterminals don ’ t have square brackets surrounding them the comma (,: and) represents the concatenation symbol a period (. ) is required at the end of every DCG rule

13 DefiniteClause Grammars (DCG) What language does our grammar generate? by writing the grammar in Prolog, we have a ready-made recognizer program –no need to write a separate parser (in this case) Example queries –?- s([a,a,b,b,b],[]). Yes –?- s([a,b,a],[]). No Note: –Query uses the start symbol s with two arguments: –(1) sequence (as a list) to be recognized and –(2) the empty list [] One or more a’s followed by one or more b’s

14 DefiniteClause Grammars (DCG) (simplified) computation tree –?- s([a,a,b,b,b],[]). (rule 1) ?- b([a,b,b,b],[]). (rule 2) –?- b([b,b,b],[]). (rule 3) »?- c([b,b],[]). (rule 5) » ?- c([b],[]). (rule 6) » Yes –?- s([a,b,a],[]). (rule 1) ?- b([b,a],[]). (rule 3 ) –?- c([a],[]). –No Prolog’s computation rule allows it to handle ambiguity –; looks for more possible derivations 1. s --> [a],b. 2. b --> [a],b. 3. b --> [b],c. 4. b --> [b]. 5. c --> [b],c. 6. c --> [b].

15 DefiniteClause Grammars (DCG) Language: Sheeptalk –ba! –baa! –baaa! –ba...a! not in language –b! –bab! –!

16 DefiniteClause Grammars (DCG) language: Sheeptalk –ba! –baa! –baaa! –ba...a! grammar –s --> [b], a, [‘!’]. –a --> [a]. (base case) –a --> a, [a]. (recursive case) BTW, you can’t assert grammar rules ?- assert(a --> [b]). Yes ?- listing. :- dynamic (-->)/2. a-->[b]. need to load them in from a file!

17 DefiniteClause Grammars (DCG) computation rule also allows for infinite loops –just like with regular Prolog rules example (sheeptalk) –s --> [b], a, [‘!’]. –a --> a, [a]. (left recursive rule) –a --> [a]. (simplified) computation tree –?- s([b,a,a,’!’],[]). ?- a([a,a,’!’],L). ?- L - [] = [‘!’]. (pseudo-code, ‘ - ’ is list difference,TBE) –?- a([a,a,’!’],L). (left recursive rule) ? - a([a,a,’!’],L’). (left recursive rule) –?- a([a,a,’!’],L”). (left recursive rule) »… try grammar with 3rd rule coming before 2nd rule or substituting the right recursive counterpart of rule 2 a --> [a], a.

18 Definite Clause Grammars (DCG) other features –right-hand-side can be empty e.g. a --> []. useful for grammar writing: encode empty categories e.g. NP  ε NP  λ np --> []. –no designated start symbol e.g. subgrammar c from the first example generates the language: one or more b’s other features –add arguments to non-terminals e.g. s(N) --> [a], b(1,N). can pass arguments up and down the derivation construct parse trees check feature structures –insert regular Prolog queries into the middle of a grammar rule using { } e.g. b(M,N) --> [a], {addOne(M,P)}, b(P,N). allows us to insert code to count the number of a’s here or compute anything we like … general rewrite rule (type-0) power

19 Definite Clause Grammars (DCG) the relation between DCG rules and Prolog is one of –“syntactic sugar” –notation is for convenience of the grammar rule writer only actually, DCG rules are translated automatically into regular Prolog rules when loaded into the database computation rule works on the translated rules directly –not the DCG rules this is (unfortunately) non-transparent –using the debugger, you will see the translated rules rather than the easy-to-read source DCG rules

20 Definite Clause Grammars (DCG) each DCG rule corresponds to a Prolog rule –named using the left-hand-side of the DCG rule example: –s --> [a], b. –becomes –s(L1,L3) :- ‘C’(L1,a,L2), b(L2,L3). –where L1, L2, L3 are lists S spans the difference between L1 and L3 [a] spans the difference between L1 and L2 b spans the difference between L2 and L3 [a],b spans the difference between L1 and L3 (same as S ) ‘C’(L1,a,L2) refers to a SWI-Prolog built-in named ‘C’ that holds if a is the difference between L1 and L2


Download ppt "LING 438/538 Computational Linguistics Sandiway Fong Lecture 5: 9/5."

Similar presentations


Ads by Google