# CS 4705 Final Review CS4705 Julia Hirschberg. Format and Coverage Covers only material from thru (i.e. beginning with Probabilistic Parsing) Same format.

CS 4705 Final Review CS4705 Julia Hirschberg

Format and Coverage
Covers only material from thru (i.e. beginning with Probabilistic Parsing)
Same format as midterm:
– Short answers: 2-3 sentences
– True/False: for false statements, provide a true correction that is not just the negation of the false statement, e.g.
    Good answer: The exam is on Dec 14. FALSE! The exam is on Dec 16.
    Bad answer: The exam is on Dec 14. FALSE! The exam is not on Dec 14.
– Exercises
– Short essays: 2 essays, 3-5 paragraphs each
The final will be only slightly longer than the midterm, although you will have the full 3h to complete it.

Probabilistic Parsing
Problems with CFGs:
– Rules unordered, many possible parses
Solutions:
– Weight the rules by their probabilities
– But rules aren’t sensitive to lexical items or subcategorization frames
– Add headwords to trees
– Add subcategorization probabilities
– Add complement/adjunct distinction
– Etc.
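As a reminder of what "weight the rules by their probabilities" buys you: in a PCFG, the probability of a parse is the product of the probabilities of the rules used to build it, so competing parses can be ranked. The sketch below is illustrative only; the grammar `RULES`, its probabilities, and the helper `tree_prob` are made up for this example and are not from the course materials.

```python
from math import prod

# Hypothetical toy PCFG: (lhs, rhs) -> probability.
# Probabilities for each left-hand side sum to 1; the numbers are invented.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Bill",)): 0.5,
    ("NP", ("pizza",)): 0.5,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
    ("V", ("likes",)): 1.0,
}

def tree_prob(tree):
    """Probability of a parse tree: product of the probabilities of its rules."""
    if isinstance(tree, str):      # a word (leaf) contributes no rule of its own
        return 1.0
    label, *children = tree
    # The right-hand side is the sequence of child labels (or words).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULES[(label, rhs)] * prod(tree_prob(c) for c in children)

# Trees are nested tuples: (label, child, child, ...).
parse = ("S", ("NP", "Bill"), ("VP", ("V", "likes"), ("NP", "pizza")))
print(tree_prob(parse))
```

Note that nothing here depends on the words themselves beyond the lexical rules, which is exactly the limitation that headwords and subcategorization probabilities are meant to address.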

Semantics
Meaning Representations
– Predicate/argument structure and FOPC
– Problems with mapping to NL (e.g. "and" ≠ ^)
Frame semantics
    Having
    Haver: S
    HadThing: Car
– Problems with reasoning from representation

Subcategorization Frames and Thematic Roles
What patterns of arguments can different verbs take?
– NP likes NP
– NP likes Inf-VP
– NP likes NP Inf-VP
What roles can arguments take?
– Agent, Patient, Theme (The ice melted), Experiencer (Bill likes pizza), Stimulus (Bill likes pizza), Goal (Bill ran to Copley Square), Recipient (Bill gave the book to Mary), Instrument (Bill ate the burrito with a plastic spork), Location (Bill sits under the tree on Wednesdays)

Selectional Restrictions
George assassinated the senator.
?The spider assassinated the fly.
*Cain assassinated Abel.
George broke the bank.

Lexical Semantics
Lexemes
Lexicon
WordNet: synsets
FrameNet: subcategorization frames/verb semantics

Word Relations
Types of word relations
– Homonymy: bank/bank
– Homophones: red/read
– Homographs: bass/bass
– Polysemy: bank/sperm bank
– Synonymy: big/large
– Hyponym/hypernym: poodle/dog
– Metonymy: (printing press)/the press
– Meronymy: (wheel)/car
– Metaphor: Nothing scares Google.

Word Sense Disambiguation
Time flies like an arrow.
Tasks: all-words vs. lexical sample
Techniques:
– Supervised, semi-supervised bootstrapping, unsupervised
– Corpora needed
– Features that are useful
– Competitions and Evaluation methods
Specific approaches:
– Naïve Bayes, Decision Lists, Dictionary-based, Selectional Restrictions
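For the Naïve Bayes approach, it may help to see the decision rule spelled out: choose the sense that maximizes the log prior plus the sum of log likelihoods of the context words, with add-one smoothing for unseen words. This is a minimal sketch under stated assumptions; the training contexts in `TRAIN` and the function names `train` and `classify` are hypothetical, and bag-of-words context is only one of the possible feature sets.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled contexts for the ambiguous word "bass".
TRAIN = [
    ("fish", "caught a huge bass in the lake"),
    ("fish", "grilled bass with lemon"),
    ("music", "plays bass guitar in a band"),
    ("music", "the bass line drives the song"),
]

def train(examples):
    """Collect sense counts, per-sense word counts, and the vocabulary."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, text in examples:
        sense_counts[sense] += 1
        for w in text.split():
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab

def classify(context, model):
    """Return argmax over senses of log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total)
        # Add-one (Laplace) smoothing over the vocabulary.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context.split():
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best

model = train(TRAIN)
print(classify("bass guitar", model))
print(classify("bass in the lake", model))
```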

Discourse Structure and Coherence
Topic segmentation
– Useful Features
– Hearst’s TextTiling – how does it work?
– Supervised methods – how do we evaluate?
Coherence relations
– Hobbs’
– Rhetorical Structure Theory – what are its problems?
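One way to rehearse the "how does it work?" question for TextTiling is to trace its core idea: compare the vocabulary of the text blocks on either side of each gap, then look for deep valleys in the resulting similarity curve. The sketch below is a heavily simplified illustration, not Hearst's full algorithm: it uses whole sentences instead of fixed-length token sequences, skips smoothing and the boundary cutoff, and the sentences in `SENTS`, the block size `k`, and all function names are assumptions made for this example.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def gap_similarities(sentences, k=2):
    """Similarity between the k sentences before and after each gap."""
    bags = [Counter(s.lower().split()) for s in sentences]
    sims = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - k):gap], Counter())
        right = sum(bags[gap:gap + k], Counter())
        sims.append(cosine(left, right))
    return sims

def depth_scores(sims):
    """Depth of each valley: climb to the nearest peak on both sides."""
    depths = []
    for i, s in enumerate(sims):
        l = i
        while l > 0 and sims[l - 1] >= sims[l]:
            l -= 1
        r = i
        while r < len(sims) - 1 and sims[r + 1] >= sims[r]:
            r += 1
        depths.append((sims[l] - s) + (sims[r] - s))
    return depths

# Hypothetical six-sentence text with a topic shift after sentence 3.
SENTS = [
    "parser uses grammar rules",
    "grammar rules have probabilities",
    "parser picks best rules",
    "summary covers news stories",
    "news stories cite many sources",
    "summary orders stories",
]

sims = gap_similarities(SENTS)
depths = depth_scores(sims)
# Gap i sits between sentence i+1 and sentence i+2 (1-based sentences).
print("deepest valley at gap index", depths.index(max(depths)))
```

The deepest valley marks the most likely topic boundary; a fuller implementation would keep every gap whose depth exceeds a cutoff derived from the distribution of depth scores.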

Reference Terminology
Referring expressions
Discourse referents
Anaphora and cataphora
Coreference
Antecedents
Pronouns
One-anaphora
Definite and indefinite NPs
Anaphoric chains

Constraints on Anaphoric Reference
Salience
Recency of mention: rule of 2 sentences
Discourse structure
Agreement
Grammatical function
Repeated mention
Parallel construction
Verb semantics/thematic roles
Pragmatics

Algorithms for Coreference Resolution
Lappin & Leass
Hobbs
Centering Theory
Supervised approaches
Evaluation

Information Extraction
Template-based IE
– Named Entity Tagging
– Sequence-based relation tagging: supervised and bootstrapping
– IE for Question Answering, e.g. biographical information (Biadsy’s ‘bouncing’ between Wikipedia and Google)

Information Retrieval
Vector-Space model
– Cosine similarity
– TF/IDF weighting
NIST competition retrieval tasks
Techniques for improvement
Metrics
– Precision, recall, F-measure
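The vector-space pieces fit together as follows: each document and the query become TF-IDF weighted vectors, documents are ranked by cosine similarity to the query, and the returned set is scored with precision, recall, and F-measure. The sketch below is one common formulation (raw term frequency times log(N/df)); other weighting variants exist, and the collection `DOCS` and every function name here are hypothetical.

```python
import math
from collections import Counter

# Hypothetical three-document collection.
DOCS = {
    "d1": "probabilistic parsing with headwords",
    "d2": "word sense disambiguation with naive bayes",
    "d3": "statistical machine translation",
}

def build_idf(docs):
    """idf(t) = log(N / df(t)), where df counts documents containing t."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(text.split()))
    return {t: math.log(n / d) for t, d in df.items()}

def tfidf(text, idf):
    """Sparse TF-IDF vector; terms unseen in the collection are dropped."""
    tf = Counter(text.split())
    return {t: c * idf[t] for t, c in tf.items() if t in idf}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank(query, docs):
    """Document names sorted by decreasing cosine similarity to the query."""
    idf = build_idf(docs)
    q = tfidf(query, idf)
    scores = {name: cosine(q, tfidf(text, idf)) for name, text in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and balanced F-measure."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

print(rank("naive bayes disambiguation", DOCS))
print(precision_recall_f1(["d1", "d2"], ["d2", "d3", "d4"]))
```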

Summarization
Types and approaches to summarization
– Indicative vs. informative
– Generative vs. extractive
– Single vs. multi-document
– Generic vs. user-focused
Useful features
Evaluation methods
Newsblaster – how does it work?
– Multi-document
– Sentence fusion and ordering
– Topic tracking

MT
Multilingual challenges
– Orthography, Lexical ambiguity, morphology, syntax
MT Approaches:
– The Pyramid
– Statistical vs. Rule-based vs. Hybrid
Evaluation metrics
– Human vs. BLEU score
– Criteria: fluency vs. accuracy
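The core of BLEU is modified n-gram precision: each candidate n-gram is credited at most as many times as it appears in any single reference, which stops a degenerate output from scoring well by repeating a common word. The sketch below covers only that component; full BLEU additionally combines the precisions for n = 1 to 4 with a geometric mean and applies a brevity penalty. The function names are assumptions made for this example.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the maximum count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, c in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], c)
    # Clip each candidate count by that maximum.
    clipped = sum(min(c, max_ref[gram]) for gram, c in cand_counts.items())
    return clipped / sum(cand_counts.values())

candidate = "the the the the the the the".split()
references = [
    "the cat is on the mat".split(),
    "there is a cat on the mat".split(),
]
# Plain unigram precision would be 7/7; clipping reduces it to 2/7.
print(modified_precision(candidate, references, 1))
```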

Dialogue
Turns and Turn-taking
Speech Acts and Dialogue Acts
Grounding
Intentional Structure: Centering
Pragmatics
– Presupposition
– Conventional Implicature
– Conversational Implicature

The Final Dec. 16, MUDD 535, 1:10-4pm Good luck!
