Syntactic analysis using Context Free Grammars. Analysis of language Morphological analysis – Chairs, Part Of Speech (POS) tagging – The/DT man/NN left/VBD.

1 Syntactic analysis using Context Free Grammars

2 Analysis of language Morphological analysis – Chairs, Part Of Speech (POS) tagging – The/DT man/NN left/VBD the/DT room/NN – The/DT red/ADJ block/NN on/PREP the/DT blue/ADJ cylinder/NN was/AUX moved/VBD onto/PREP the/DT brown/ADJ table/NN Any further analysis?

3 Analysis of language Part Of Speech (POS) tagging – The/DT man/NN left/VBD the/DT room/NN – The/DT red/ADJ block/NN on/PREP the/DT blue/ADJ cylinder/NN was/AUX moved/VBD onto/PREP the/DT brown/ADJ table/NN Any further analysis? – chunks, clauses, syntax, semantics, word senses etc… Today’s lecture on analyzing syntax

4 What is Syntax? Study of structure of language – how words can connect to each other Specifically, goal is to relate surface form (i.e. the sentence) to semantics (the meaning) Representational device is tree structure

5 Structure in Strings Proposal 1 Some words: the a small nice big very boy girl sees likes Some good sentences: – (the) boy (likes a girl) – (the small) girl (likes the big girl) – (a very small nice) boy (sees a very nice boy) Some bad sentences: – *(the) boy (the girl) – *(small) boy (likes the nice girl)

6 Structure in Strings Proposal 2 Some words: the a small nice big very boy girl sees likes Some good sentences: – (the boy) likes (a girl) – (the small girl) likes (the big girl) – (a very small nice boy) sees (a very nice boy) Some bad sentences: – *(the boy) (the girl) – *(small boy) likes (the nice girl)

7 More Structure in Strings Proposal 2 -- ctd Some words: the a small nice big very boy girl sees likes Some good sentences: – ((the) boy) likes ((a) girl) – ((the) (small) girl) likes ((the) (big) girl) – ((a) ((very) small) (nice) boy) sees ((a) ((very) nice) girl) Some bad sentences: – *((the) boy) ((the) girl) – *((small) boy) likes ((the) (nice) girl)

8 From Substrings to Trees (((the) boy) likes ((a) girl)) [Figure: the same bracketing drawn as a tree over the words "the boy likes a girl"]
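The step from a bracketed string to a tree can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `parse_brackets` and the choice of nested lists as the tree representation are assumptions made for the example.

```python
def parse_brackets(text):
    """Turn a parenthesised string into nested Python lists (one list per bracket pair)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]                      # stack[0] collects the top-level items
    for tok in tokens:
        if tok == "(":
            stack.append([])          # open a new constituent
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()        # close the constituent ...
            stack[-1].append(done)    # ... and attach it to its parent
        else:
            stack[-1].append(tok)     # a plain word
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0][0] if len(stack[0]) == 1 else stack[0]


tree = parse_brackets("(((the) boy) likes ((a) girl))")
print(tree)
```

Each pair of parentheses becomes one list, so the nesting of the lists mirrors the nesting of the constituents.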

27 Context-Free Grammars Terminals – This would be the lexicon/vocabulary Non-Terminals – The constituents in a language, such as noun phrase, verb phrase, prepositional phrase and sentence Rules – Rules consist of a single non-terminal on the left and any number of terminals and non-terminals on the right. – Describe the allowed structure of the constituents – Express the ways in which symbols of the language can be grouped or ordered together

29 Phrase Structure Tree (((the/Det) boy/N) likes/V ((a/Det) girl/N)) [Figure: phrase-structure tree over "the boy likes a girl" with node labels S, NP and DetP] nonterminal symbols = constituents; terminal symbols = words

30 Context? The notion of context in CFGs is not the same as the ordinary meaning of the word context in language. All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself – A -> B C – Means that I can rewrite an A as a B followed by a C regardless of the context in which A is found

31 CFG: Example Many possible CFGs for English, here is an example (fragment): – S -> NP VP – VP -> V NP – NP -> DetP N | AdjP NP – AdjP -> Adj | Adv AdjP – N -> boy | girl – V -> sees | likes – Adj -> big | small – Adv -> very – DetP -> a | the the very small boy likes a girl
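One way to write this grammar fragment down as data is a Python dictionary from each non-terminal to its alternative right-hand sides. This is a sketch for illustration only; the name `GRAMMAR` and the dictionary layout are assumptions, not something the slides prescribe.

```python
# Each key is a non-terminal; each value lists its alternative right-hand sides.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "VP":   [["V", "NP"]],
    "NP":   [["DetP", "N"], ["AdjP", "NP"]],
    "AdjP": [["Adj"], ["Adv", "AdjP"]],
    "N":    [["boy"], ["girl"]],
    "V":    [["sees"], ["likes"]],
    "Adj":  [["big"], ["small"]],
    "Adv":  [["very"]],
    "DetP": [["a"], ["the"]],
}

# Non-terminals are exactly the symbols that have rules of their own.
nonterminals = set(GRAMMAR)

# Terminals are the right-hand-side symbols that never appear on a left-hand side.
terminals = {sym for alts in GRAMMAR.values() for rhs in alts for sym in rhs} - nonterminals

print(sorted(nonterminals))
print(sorted(terminals))
```

With this layout the lexicon (terminals) falls out of the rules themselves rather than being listed separately.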

32 Derivations in a CFG Grammar: S -> NP VP; VP -> V NP; NP -> DetP N | AdjP NP; AdjP -> Adj | Adv AdjP; N -> boy | girl; V -> sees | likes; Adj -> big | small; Adv -> very; DetP -> a | the Sentential form: S Tree: S

33 Derivations in a CFG (same grammar) Sentential form: NP VP Tree: [NP VP] S

34 Derivations in a CFG (same grammar) Sentential form: DetP N VP Tree: [[DetP N] NP VP] S

35 Derivations in a CFG (same grammar) Sentential form: the boy VP Tree: [[[the] DetP [boy] N] NP VP] S

36 Derivations in a CFG (same grammar) Sentential form: the boy likes NP Tree: [[[the] DetP [boy] N] NP [[likes] V NP] VP] S

37 Derivations in a CFG (same grammar) Sentential form: the boy likes a girl Tree: [[[the] DetP [boy] N] NP [[likes] V [[a] DetP [girl] N] NP] VP] S
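The derivation on slides 32 to 37 can be replayed mechanically. The sketch below is an illustration only: the function `leftmost_derivation` and the list `choices` are hypothetical names, and it applies one rule per step (always rewriting the leftmost non-terminal), whereas the slides sometimes show two rewrites at once.

```python
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "VP":   [["V", "NP"]],
    "NP":   [["DetP", "N"], ["AdjP", "NP"]],
    "AdjP": [["Adj"], ["Adv", "AdjP"]],
    "N":    [["boy"], ["girl"]],
    "V":    [["sees"], ["likes"]],
    "Adj":  [["big"], ["small"]],
    "Adv":  [["very"]],
    "DetP": [["a"], ["the"]],
}


def leftmost_derivation(choices, grammar, start="S"):
    """Apply each (lhs, rhs) rule to the leftmost non-terminal; return all sentential forms."""
    form = [start]
    steps = [list(form)]
    for lhs, rhs in choices:
        # Position of the leftmost symbol that still has rules (a non-terminal).
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None or form[idx] != lhs or list(rhs) not in grammar[lhs]:
            raise ValueError(f"cannot apply {lhs} -> {' '.join(rhs)}")
        form[idx:idx + 1] = rhs       # rewrite the non-terminal in place
        steps.append(list(form))
    return steps


choices = [
    ("S", ["NP", "VP"]), ("NP", ["DetP", "N"]), ("DetP", ["the"]), ("N", ["boy"]),
    ("VP", ["V", "NP"]), ("V", ["likes"]), ("NP", ["DetP", "N"]),
    ("DetP", ["a"]), ("N", ["girl"]),
]
steps = leftmost_derivation(choices, GRAMMAR)
for form in steps:
    print(" ".join(form))
```

Each printed line is one sentential form, starting from S and ending with the sentence itself.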

38 Simple lexicon

39 Simple grammar

40 Generativity We can view these rules as either analysis or synthesis machines – Generate strings in the language – Reject strings not in the language – Impose structures (trees) on strings in the language
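Both views, analysis and synthesis, can be sketched with the grammar fragment from slide 31. The code below is a minimal illustration under stated assumptions: `ends`, `accepts` and `generate` are hypothetical helper names, the recogniser is a naive top-down search (adequate here because the fragment has no left-recursive rules), and generation is cut off by an arbitrary depth bound.

```python
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "VP":   [["V", "NP"]],
    "NP":   [["DetP", "N"], ["AdjP", "NP"]],
    "AdjP": [["Adj"], ["Adv", "AdjP"]],
    "N":    [["boy"], ["girl"]],
    "V":    [["sees"], ["likes"]],
    "Adj":  [["big"], ["small"]],
    "Adv":  [["very"]],
    "DetP": [["a"], ["the"]],
}


def ends(symbol, tokens, start):
    """Yield every position at which `symbol` can end if it begins at `start`."""
    if symbol not in GRAMMAR:                       # terminal: must match the next word
        if start < len(tokens) and tokens[start] == symbol:
            yield start + 1
        return
    for rhs in GRAMMAR[symbol]:                     # try each alternative in turn
        positions = {start}
        for part in rhs:
            positions = {e for p in positions for e in ends(part, tokens, p)}
        yield from positions


def accepts(sentence):
    """Analysis: True if the whole sentence can be derived from S."""
    tokens = sentence.split()
    return len(tokens) in set(ends("S", tokens, 0))


def generate(symbol, depth):
    """Synthesis: yield token lists derivable from `symbol` within `depth` nested rules."""
    if symbol not in GRAMMAR:
        yield [symbol]
        return
    if depth == 0:
        return
    for rhs in GRAMMAR[symbol]:
        partials = [[]]
        for part in rhs:
            partials = [p + s for p in partials for s in generate(part, depth - 1)]
        yield from partials


print(accepts("the boy likes a girl"))      # a string in the language
print(accepts("the boy the girl"))          # a string outside it
sentences = [" ".join(tokens) for tokens in generate("S", 4)]
print(sentences[:3])
```

The same rule table drives both functions, which is the sense in which the rules are neutral between generating and rejecting strings.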

41 A CFG defines a formal language – Sentences (strings of words) that can be derived by the grammar are in the formal language defined by the grammar – Sentences that cannot be derived from the grammar are not in the language, i.e. they are ungrammatical

42 Derivations A derivation is a sequence of rules applied to a string that accounts for that string – Covers all the elements in the string – Covers only the elements in the string

43 Recursion We’ll have to deal with rules such as the following where the non-terminal on the left also appears somewhere on the right (directly). – NP -> NP PP [[The flight] [to Boston]] – VP -> VP PP [[departed Miami] [at noon]]

44 Recursion Of course, this is what makes syntax interesting – flights from Denver – Flights from Denver to Miami – Flights from Denver to Miami in February – Flights from Denver to Miami in February on a Friday – Flights from Denver to Miami in February on a Friday under $300 – Flights from Denver to Miami in February on a Friday under $300 with lunch
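The effect of a recursive rule such as NP -> NP PP can be sketched by applying it once per prepositional phrase. This is an illustration only: the helper names are hypothetical, the tuple-based tree format is an assumption, and attaching every PP to the whole preceding NP is just one of several possible analyses.

```python
def attach_pps(head, pps):
    """Apply NP -> NP PP once for each prepositional phrase, nesting to the left."""
    tree = ("NP", head)
    for pp in pps:
        tree = ("NP", tree, ("PP", pp))   # the old NP becomes the left child of a new NP
    return tree


def yield_of(tree):
    """Read the words back off the leaves of the tree, left to right."""
    if isinstance(tree, str):
        return tree
    return " ".join(yield_of(child) for child in tree[1:])


def np_depth(tree):
    """Count the NP nodes stacked along the left edge of the tree."""
    if isinstance(tree, str) or tree[0] != "NP":
        return 0
    return 1 + np_depth(tree[1])


tree = attach_pps("flights", ["from Denver", "to Miami", "in February"])
print(yield_of(tree))
print(np_depth(tree))
```

Nothing in the rule limits how many times it applies, so the list of PPs can be extended indefinitely.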

45 The Point If you have a rule like – VP -> V NP – It only cares that the thing after the verb is an NP. It doesn’t have to know about the internal affairs of that NP

46 The Point VP -> V NP I hate – flights from Denver – Flights from Denver to Miami – Flights from Denver to Miami in February – Flights from Denver to Miami in February on a Friday – Flights from Denver to Miami in February on a Friday under $300 – Flights from Denver to Miami in February on a Friday under $300 with lunch

47 Potential Problems in CFG Agreement Subcategorization Movement

48 Agreement By agreement, we have in mind constraints that hold among various constituents that take part in a rule or set of rules. For example, in English, determiners and the head nouns in NPs have to agree in their number. – This flight – Those flights – *This flights – *Those flight

49 Problem Our earlier NP rules are clearly deficient since they don’t capture this constraint – NP -> Det Nominal accepts, and assigns correct structures to, grammatical examples (this flight) – But it’s also happy with incorrect examples (*these flight) – Such a rule is said to overgenerate.
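The overgeneration can be made concrete with a toy check. The sketch below is an assumption-laden illustration: the two small lexicons, the "sg"/"pl" feature values and the function names are invented for the example, and adding a number check is only one possible way to repair the rule.

```python
# Hypothetical mini-lexicon: each word is tagged with its grammatical number.
DET = {"this": "sg", "these": "pl", "those": "pl"}
NOUN = {"flight": "sg", "flights": "pl"}


def np_plain(det, noun):
    """NP -> Det Nominal with no agreement check: only the categories matter."""
    return det in DET and noun in NOUN


def np_agree(det, noun):
    """Same rule, but determiner and noun must carry the same number."""
    return np_plain(det, noun) and DET[det] == NOUN[noun]


for det, noun in [("this", "flight"), ("those", "flights"),
                  ("this", "flights"), ("these", "flight")]:
    print(det, noun, np_plain(det, noun), np_agree(det, noun))
```

The plain rule accepts all four pairs, while the version with the number check rejects the two starred ones.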

50 Verb Phrases English VPs consist of a head verb along with 0 or more following constituents which we’ll call arguments.

51 Subcategorization – Sneeze: John sneezed – Find: Please find [a flight to NY] NP – Give: Give [me] NP [a cheaper fare] NP – Help: Can you help [me] NP [with a flight] PP – Prefer: I prefer [to leave earlier] TO-VP – Told: I was told [United has a flight] S – … Bad examples: – *John sneezed the book – *I prefer United has a flight – *Give with a flight Subcat expresses the constraints that a predicate (verb for now) places on the number and type of the arguments it wants to take

52 So? So the various rules for VPs overgenerate. – They permit the presence of strings containing verbs and arguments that don’t go together – For example, given VP -> V NP, “sneezed the book” is a VP since “sneeze” is a verb and “the book” is a valid NP Subcategorization frames can fix this problem (“slow down” overgeneration). This is a modern take on the traditional notion of transitive/intransitive. Modern grammars may have hundreds of such classes.
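A subcategorization check can be sketched as a lookup table from verbs to the argument sequences they allow. This is only an illustration: the table `SUBCAT` lists just the single frame shown for each verb on the Subcategorization slide (real verbs usually allow several), and the function name `licensed` is hypothetical.

```python
# One frame per verb, taken from the examples on the Subcategorization slide.
SUBCAT = {
    "sneeze": [()],                 # John sneezed
    "find":   [("NP",)],            # find [a flight to NY]
    "give":   [("NP", "NP")],       # give [me] [a cheaper fare]
    "help":   [("NP", "PP")],       # help [me] [with a flight]
    "prefer": [("TO-VP",)],         # prefer [to leave earlier]
    "told":   [("S",)],             # told [United has a flight]
}


def licensed(verb, args):
    """True if the verb allows exactly this sequence of argument categories."""
    return tuple(args) in SUBCAT.get(verb, [])


print(licensed("sneeze", []))          # John sneezed
print(licensed("sneeze", ["NP"]))      # *John sneezed the book
print(licensed("prefer", ["S"]))       # *I prefer United has a flight
print(licensed("give", ["PP"]))        # *Give with a flight
```

A plain VP -> V NP rule has no access to this table, which is why it accepts “sneezed the book”.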

53 Subcategorization – Sneeze: John sneezed – Find: Please find [a flight to NY] NP – Give: Give [me] NP [a cheaper fare] NP – Help: Can you help [me] NP [with a flight] PP – Prefer: I prefer [to leave earlier] TO-VP – Told: I was told [United has a flight] S – …

54 Movement Core example – [[My travel agent] NP [booked [the flight] NP ] VP ] S I.e. “book” is a straightforward transitive verb. It expects a single NP within the VP as its argument, and a single NP as its subject.

55 Movement What about? – Which flight do you want me to have the travel agent book? The direct object argument to “book” isn’t appearing in the right place. It is in fact a long way from where it’s supposed to appear. And note that it’s separated from its verb by 2 other verbs.

56 Grammar equivalence and normal form Strong equivalence – two grammars are strongly equivalent if they generate the same set of strings and assign the same phrase structure to each sentence Weak equivalence – two grammars are weakly equivalent if they generate the same set of strings but do not assign the same phrase structure to each sentence Normal form – Restrict the form of productions – Chomsky Normal Form (CNF): the right-hand side of each production is either two non-terminals or a single terminal – e.g. A -> B C, A -> a – Any context-free grammar can be translated into a weakly equivalent CNF grammar – e.g. A -> B C D becomes A -> B X, X -> C D
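The rewriting of A -> B C D into A -> B X, X -> C D can be sketched as a small function. Note the limits of this illustration: it performs only the binarization step, not a full conversion to CNF (unit productions and terminals mixed into longer rules are left untouched), and the fresh names X1, X2, ... are assumed not to clash with existing symbols.

```python
def binarize(rules):
    """Split every rule whose right-hand side is longer than two symbols.

    `rules` is a list of (lhs, rhs) pairs, where rhs is a sequence of symbols.
    """
    out = []
    counter = 0
    for lhs, rhs in rules:
        rhs = tuple(rhs)
        while len(rhs) > 2:
            counter += 1
            new = f"X{counter}"                 # fresh non-terminal (assumed unused)
            out.append((lhs, (rhs[0], new)))    # keep the first symbol, defer the rest
            lhs, rhs = new, rhs[1:]
        out.append((lhs, rhs))
    return out


for lhs, rhs in binarize([("A", ("B", "C", "D"))]):
    print(lhs, "->", " ".join(rhs))
```

The new grammar generates the same strings but assigns them a different (binary-branching) tree, which is why the result is only weakly equivalent to the original.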

57 Building tree structures Draw tree structures for the following phrases – Dallas – from Denver – arriving in Washington – I need to fly between Philadelphia and Atlanta – My flight from Philadelphia to Atlanta has been cancelled

