Published by Tyrone Wilkinson. Modified over 9 years ago.
1
Natural Language Processing CS480/580
2
Levels of Linguistic Analysis
– Phonology: recognizing speech sounds
– Morphology: analysis of word forms (e.g., adding s to make a plural)
– Syntax: sentence structure
– Semantics: meaning
– Pragmatics: the relation of language to context
3
Tokenization
A string is broken into words, punctuation is removed, and the key information is represented as a sequence of words, or tokens. E.g., “How are you today?” is converted to [how, are, you, today].
4
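Tokenize.pl

% lower_case(+Code, -Lower): convert an uppercase ASCII letter code to lowercase.
lower_case(A, B) :- A >= 65, A =< 90, !, B is A + 32.
lower_case(A, A).

% tokenize(+Codes, -Tokens): split a list of character codes into atoms.
tokenize([], []) :- !.
tokenize(A, [B|E]) :- grab_word(A, C, D), name(B, C), tokenize(D, E).

% punctuation_mark(+Code): any ASCII code that is not a letter or a digit.
punctuation_mark(A) :- A =< 47.
punctuation_mark(A) :- A >= 58, A =< 64.
punctuation_mark(A) :- A >= 91, A =< 96.
punctuation_mark(A) :- A >= 123.

% grab_word(+Codes, -Word, -Rest): read up to the next space (code 32),
% dropping punctuation and lowercasing letters.
grab_word([32|A], [], A) :- !.
grab_word([], [], []).
grab_word([A|B], C, D) :- punctuation_mark(A), !, grab_word(B, C, D).
grab_word([D|A], [E|B], C) :- grab_word(A, B, C), lower_case(D, E).

Example queries (the string argument must be a list of character codes; in SWI-Prolog 7 or later, write it in backquotes or set the double_quotes flag to codes):

?- tokenize("This is CS480/580 course", X).
X = [this, is, cs480580, course].

?- name(john, X).
X = [106, 111, 104, 110].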
5
Template System
– Templates are stored sentence patterns.
– Each template is accompanied by a translation schema. E.g., [X, is, a, Y] is translated to Y(X).

process([X, is, a, Y]) :- Fact =.. [Y, X], assert(Fact).
process([is, X, a, Y]) :- Query =.. [Y, X], call(Query).
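To make the translation schema concrete, here is a minimal sketch of how =.. (“univ”) builds a term from a list, and how the resulting fact can be stored and then queried. The predicate name demo_template/0 and the example words cs480 and course are illustrative assumptions, not part of the slides.

```prolog
% Hypothetical demo. course/1 is declared dynamic so that it can be
% asserted at run time.
:- dynamic course/1.

demo_template :-
    % Statement template [X, is, a, Y] with X = cs480 and Y = course.
    Fact =.. [course, cs480],      % Fact is now the term course(cs480)
    assertz(Fact),                 % store it, as process/1 does with assert/1
    % Question template [is, X, a, Y] with the same X and Y.
    Query =.. [course, cs480],     % Query is the term course(cs480)
    call(Query).                   % succeeds because the fact was stored
```

Calling demo_template succeeds, and course(cs480) remains in the database afterwards.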
6
Template.pl

grab_word([32|A], [], A) :- !.
grab_word([], [], []).
grab_word([A|B], C, D) :- punctuation_mark(A), !, grab_word(B, C, D).
grab_word([D|A], [E|B], C) :- grab_word(A, B, C), lower_case(D, E).

punctuation_mark(A) :- A =< 47.
punctuation_mark(A) :- A >= 58, A =< 64.
punctuation_mark(A) :- A >= 91, A =< 96.
punctuation_mark(A) :- A >= 123.

lower_case(A, B) :- A >= 65, A =< 90, !, B is A + 32.
lower_case(A, A).

% write_str(+Codes): print a list of character codes.
write_str([A|B]) :- put(A), write_str(B).
write_str([]).

% read_str_aux(+Code, -Codes): stop at end of file (-1) or end of line (10, 13).
read_str_aux(-1, []) :- !.
read_str_aux(10, []) :- !.
read_str_aux(13, []) :- !.
read_str_aux(A, [A|B]) :- read_str(B).

% Prompt, read one line, tokenize it, and process the tokens.
do_one_sentence :- write('>'), read_str(A), tokenize(A, B), process(B).

% note(+Clause): store a fact or rule.
note(A) :- asserta(A), write('OK'), nl.

read_atom(A) :- read_str(B), name(A, B).

start :-
    write('TEMPLATE.PL at your service.'), nl,
    write('Terminate by pressing Break.'), nl,
    repeat, do_one_sentence, fail.

% check(+Query): answer a yes/no question.
check(A) :- call(A), !, write('Yes.'), nl.
check(_) :- write('Not as far as I know.'), nl.

read_num(A) :- read_str(B), name(A, B).
7
remove_s(A, C) :- name(A, B), remove_s_list(B, D), name(C, D).

read_str(B) :- get0(A), read_str_aux(A, B).

remove_s_list([115], []).
remove_s_list([A|B], [A|C]) :- remove_s_list(B, C).

process([B, is, a, A]) :- !, C =.. [A, B], note(C).
process([A, is, an, B]) :- !, process([A, is, a, B]).
process([is, B, a, A]) :- !, C =.. [A, B], check(C).
process([is, A, an, B]) :- !, process([is, A, a, B]).
process([A, are, B]) :- !,
    remove_s(A, D), remove_s(B, C),
    F =.. [C, E], G =.. [D, E],
    note((F :- G)).
process([does, B, A]) :- !, C =.. [A, B], check(C).
process([A, B]) :-
    \+ remove_s(A, _), remove_s(B, C), !,
    D =.. [C, A], note(D).
process([A, B]) :-
    remove_s(A, C), \+ remove_s(B, _), !,
    E =.. [B, D], F =.. [C, D],
    note((E :- F)).
process(_) :- write('I do not understand.'), nl.

tokenize([], []) :- !.
tokenize(A, [B|E]) :- grab_word(A, C, D), name(B, C), tokenize(D, E).

Sample session:

?- start.
TEMPLATE.PL at your service.
Terminate by pressing Break.
>CS480 is a course.
OK
>is CS480 a course?
Yes.
>is cs471 a course?
Not as far as I know.
>cs471 is a course.
OK
>is cs471 a course?
Yes.
8
Generative Grammars
Templates are inadequate to describe human language: in the last example, the only sentences accepted were those matching a stored pattern such as “X is a Y”. Real sentences can be embedded in one another without limit:
– John arrived
– Max said John arrived
– Bill claimed Max said John arrived
– Mary thought Bill claimed Max said John arrived
Chomsky’s suggestion: treat syntax as a problem in set theory, and express an infinite set of sentences with a finite description.
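As a rough illustration of a finite description of an infinite set, the embedded sentences above can be covered by two recursive rules and a small lexicon. The sketch uses Prolog's grammar rule notation (-->); the rule names s, np, and v are assumptions chosen for this example.

```prolog
% Hypothetical mini-grammar: two rules plus a small lexicon describe
% unboundedly many sentences, because an s can contain another s.
s --> np, [arrived].
s --> np, v, s.

np --> [john] ; [max] ; [bill] ; [mary].
v  --> [said] ; [claimed] ; [thought].
```

For example, the query phrase(s, [mary, thought, bill, claimed, max, said, john, arrived]) succeeds, and so would any deeper embedding built the same way.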
9
Context Free Grammars
Phrase Structure Rules
– S → NP VP
– NP → Det N
– N → N PP
– N → N N
– PP → P NP
– VP → IV
– VP → TV NP
– VP → DV NP NP
Lexical Entries
– N → book, cow, course, …
– P → in, on, with, …
– Det → the, every, …
– IV → ran, hid, …
– TV → likes, hit, …
– DV → gave, showed
[Image: Noam Chomsky]
10
Context-Free Derivations
S ⇒ NP VP ⇒ Det N VP ⇒ the N VP ⇒ the kid VP ⇒ the kid IV ⇒ the kid ran
Penn TreeBank bracketing notation (Lisp-like)
– (S (NP (Det the) (N kid)) (VP (IV ran)))
Theorem: A sequence has a derivation if and only if it has a parse tree.
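The derivation above can be reproduced mechanically. The following is a minimal sketch, assuming the grammar is stored as rule/2 facts (with lexical entries treated as ordinary rules) and that the leftmost nonterminal is rewritten at each step. The predicate names rule/2, step/2, and derive/2 are hypothetical, and only the rules needed for “the kid ran” are included.

```prolog
:- use_module(library(lists)).   % append/3

% rule(LHS, RHS): symbols are lowercase because capitalized names
% would be Prolog variables.
rule(s,   [np, vp]).
rule(np,  [det, n]).
rule(vp,  [iv]).
rule(det, [the]).
rule(n,   [kid]).
rule(iv,  [ran]).

% step(+Form, -Next): rewrite the leftmost nonterminal of Form.
step([X|Xs], Next) :-
    rule(X, RHS),
    append(RHS, Xs, Next).
step([X|Xs], [X|Next]) :-
    \+ rule(X, _),               % X is a terminal, so skip over it
    step(Xs, Next).

% derive(+Form, -Steps): list every sentential form until no
% nonterminal is left.
derive(Form, [Form]) :-
    \+ step(Form, _).
derive(Form, [Form|Rest]) :-
    step(Form, Next),
    derive(Next, Rest).
```

The query derive([s], Steps) yields [[s], [np, vp], [det, n, vp], [the, n, vp], [the, kid, vp], [the, kid, iv], [the, kid, ran]], which matches the derivation on the slide.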
11
“Standard” Parse Tree Notation
12
A simple Parser

sentence(A, C) :- noun_phrase(A, B), verb_phrase(B, C).

noun_phrase(A, C) :- determiner(A, B), noun(B, C).

verb_phrase(A, C) :- verb(A, B), noun_phrase(B, C).
verb_phrase(A, C) :- verb(A, B), sentence(B, C).

determiner([the|A], A).
determiner([a|A], A).

noun([dog|A], A).
noun([cat|A], A).
noun([boy|A], A).
noun([girl|A], A).

verb([chased|A], A).
verb([saw|A], A).
verb([said|A], A).
verb([believed|A], A).

?- sentence([the, cat, saw, the, dog], []).
true.
?- sentence([the, dog, saw, the, dog], []).
true.
?- sentence([a, dog, chased, the, cat], []).
true.
?- sentence([that, dog, chased, the, cat], []).
false.
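The parser above only answers true or false. As a sketch of how the same style could also return a parse tree like the bracketed one shown earlier, each predicate below takes an extra first argument that holds its subtree. This extra argument, the functor names s, np, vp, det, n, and v, and the reduced lexicon are assumptions of the sketch, not part of the slide's parser.

```prolog
% Same difference-list parser, with a tree-building first argument.
sentence(s(NP, VP), A, C) :-
    noun_phrase(NP, A, B),
    verb_phrase(VP, B, C).

noun_phrase(np(D, N), A, C) :-
    determiner(D, A, B),
    noun(N, B, C).

verb_phrase(vp(V, NP), A, C) :-
    verb(V, A, B),
    noun_phrase(NP, B, C).

% Reduced lexicon: each entry records the word it consumed.
determiner(det(the), [the|A], A).
determiner(det(a),   [a|A],   A).

noun(n(dog), [dog|A], A).
noun(n(cat), [cat|A], A).

verb(v(saw),    [saw|A],    A).
verb(v(chased), [chased|A], A).
```

The query sentence(T, [the, cat, saw, the, dog], []) binds T to s(np(det(the), n(cat)), vp(v(saw), np(det(the), n(dog)))).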
13
Definite Clause Grammar (DCG)
This is a Prolog notation that provides an easy way to write grammar rules. E.g.,
– sentence --> noun_phrase, verb_phrase.
This is equivalent to the rule:
– sentence(X,Z) :- noun_phrase(X,Y), verb_phrase(Y,Z).
Terminal symbols are written as lists, e.g., noun --> [dog]. Alternatives can be joined with a semicolon, as in noun --> [dog]; [cat]; [boy]; [girl]. A terminal can also span several words, as in verb --> [gives, up]. where “gives up” is a single verb.
Because the translation adds two list arguments to each rule, a query against the sentence rule calls sentence/2, e.g., sentence([the, dog, chased, the, cat],[]).
Try sentence([A,B,C,D,E],[]) or sentence([the, A, B, C, cat|E],[]).
Non-terminal symbols can also take arguments, e.g., sentence(N) --> noun_phrase(N), verb_phrase(N).
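To illustrate the translation for a terminal symbol, the sketch below puts a DCG rule next to a hand-written clause with the same behavior. The names verb_dcg and verb_plain are hypothetical and exist only so the two versions can be loaded side by side; the exact clause a given Prolog system generates may differ in form.

```prolog
% DCG form: a terminal that spans two words.
verb_dcg --> [gives, up].

% Roughly what the rule above means after translation: consume the
% words gives, up from the front of the list and return the rest.
verb_plain([gives, up|Rest], Rest).
```

Both verb_dcg([gives, up, now], R) and verb_plain([gives, up, now], R) bind R to [now].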
14
Parser2.pl based on DCG

sentence --> noun_phrase, verb_phrase.
noun_phrase --> determiner, noun.
verb_phrase --> verb, noun_phrase.
verb_phrase --> verb, sentence.
determiner --> [the].
determiner --> [a].
noun --> [dog]; [cat]; [boy]; [girl].
% Each verb is listed once; repeating saw, said, and believed in separate
% rules would only produce duplicate solutions.
verb --> [chased]; [saw]; [said]; [believed].
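A DCG like this one is usually queried through phrase/2, which supplies the two hidden list arguments. The sketch below repeats the grammar, with each verb listed once so that every sentence is counted once, and adds a hypothetical helper count_five_word_sentences/1 that uses the grammar as a generator.

```prolog
:- use_module(library(aggregate)).   % aggregate_all/3

sentence --> noun_phrase, verb_phrase.
noun_phrase --> determiner, noun.
verb_phrase --> verb, noun_phrase.
verb_phrase --> verb, sentence.
determiner --> [the].
determiner --> [a].
noun --> [dog] ; [cat] ; [boy] ; [girl].
verb --> [chased] ; [saw] ; [said] ; [believed].

% count_five_word_sentences(-N): fixing the list length keeps the
% recursive verb_phrase rule from generating forever.
count_five_word_sentences(N) :-
    aggregate_all(count, phrase(sentence, [_, _, _, _, _]), N).
```

Here phrase(sentence, [the, boy, said, the, cat, saw, the, dog]) succeeds through the embedded-sentence rule, and count_five_word_sentences(N) gives N = 256 (2 determiners × 4 nouns × 4 verbs × 2 determiners × 4 nouns).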
15
Grammatical Features
How to handle agreement in tense and number between the noun and the verb? The grammar below handles number by passing a feature N through the rules.

sentence(N) --> noun_phrase(N), verb_phrase(N).
noun_phrase(N) --> determiner(N), noun(N).
verb_phrase(N) --> verb(N), noun_phrase(_).
% The embedded sentence has its own number, independent of N.
verb_phrase(N) --> verb(N), sentence(_).
determiner(singular) --> [a].
determiner(_) --> [the].
determiner(plural) --> [].
noun(singular) --> [dog]; [cat]; [boy]; [girl].
noun(plural) --> [dogs]; [cats]; [boys]; [girls].
verb(singular) --> [chases]; [sees]; [says]; [believes].
verb(plural) --> [chase]; [see]; [say]; [believe].
16
?- sentence(plural, [the, dogs, A, B, C], []).
A = chase, B = a, C = dog ;
A = chase, B = a, C = cat ;
A = chase, B = a, C = boy ;
A = chase, B = a, C = girl ;
A = chase, B = the, C = dog
17
Morphology How to generate plural nouns from singular? How to generate third person singular verbs from plural verbs? Mostly by adding: s
18
sentence(N) --> noun_phrase(N), verb_phrase(N).
noun_phrase(N) --> determiner(N), noun(N).
verb_phrase(N) --> verb(N), noun_phrase(_).
verb_phrase(N) --> verb(N), sentence(_).
determiner(singular) --> [a].
determiner(_) --> [the].
determiner(plural) --> [].
noun(N) --> [X], { morph(noun(N),X) }.
verb(N) --> [X], { morph(verb(N),X) }.

morph(noun(singular),dog).          % Singular nouns
morph(noun(singular),cat).
morph(noun(singular),boy).
morph(noun(singular),girl).
morph(noun(singular),child).
morph(noun(plural),children).       % Irregular plural nouns
morph(noun(plural),X) :-            % Rule for regular plural nouns
    remove_s(X,Y),
    morph(noun(singular),Y).

morph(verb(plural),chase).          % Plural verbs
morph(verb(plural),see).
morph(verb(plural),say).
morph(verb(plural),believe).
morph(verb(singular),X) :-          % Rule for singular verbs
    remove_s(X,Y),
    morph(verb(plural),Y).

% remove_s(+X,-X1) [lifted from TEMPLATE.PL]
% removes final S from X giving X1,
% or fails if X does not end in S.
remove_s(X,X1) :-
    name(X,XList),
    remove_s_list(XList,X1List),
    name(X1,X1List).

% 115 is the character code for s; writing "s" here works only when
% double-quoted text is read as a list of codes.
remove_s_list([115],[]).
remove_s_list([Head|Tail],[Head|NewTail]) :-
    remove_s_list(Tail,NewTail).
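The morphology rules can be tried on their own, without the grammar. The following is a minimal sketch that keeps the slide's remove_s/2 and a subset of its lexicon; the choice of which entries to keep is an assumption made to shorten the example.

```prolog
% Subset of the slide's lexicon.
morph(noun(singular), dog).
morph(noun(singular), child).
morph(noun(plural), children).            % irregular plural
morph(noun(plural), X) :-                 % regular plural: strip the s
    remove_s(X, Y),
    morph(noun(singular), Y).

morph(verb(plural), chase).
morph(verb(singular), X) :-               % third person singular: strip the s
    remove_s(X, Y),
    morph(verb(plural), Y).

% remove_s(+X, -X1): X1 is X without its final s; fails if X does not end in s.
remove_s(X, X1) :-
    name(X, XList),
    remove_s_list(XList, X1List),
    name(X1, X1List).

remove_s_list([115], []).                 % 115 is the code for s
remove_s_list([Head|Tail], [Head|NewTail]) :-
    remove_s_list(Tail, NewTail).
```

Queries such as morph(noun(plural), dogs) and morph(verb(singular), chases) succeed, while morph(noun(plural), dog) fails. Two limitations are worth keeping in mind: the word must already be known, since remove_s/2 calls name/2 on it, so these rules recognize plural forms rather than generate them; and the regular rule also accepts childs, because child is listed as a singular noun.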