Parsing Unrestricted Text
Joakim Nivre

Two Notions of Parsing

- Grammar parsing: Given a grammar G and an input string x ∈ Σ*, derive some or all of the analyses y assigned to x by G.
- Text parsing: Given a text T = (x1, …, xn), derive the correct analysis yi for every sentence xi ∈ T.

Grammar Parsing

Properties of grammar parsing:
- Abstract problem: a mapping from (G, x) to y.
- Parsing implies recognition; analyses are defined only if x ∈ L(G).
- Correctness (consistency and completeness) can be proven without considering any input string x.
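
To make the abstract mapping concrete, here is a minimal sketch of grammar parsing as recognition: a CKY recognizer for a toy grammar in Chomsky normal form. The grammar and the sentences are invented for illustration; the point is that the recognizer decides membership in L(G).

    from itertools import product

    # Toy CNF grammar (invented): binary rules A -> B C and lexical rules A -> 'w'
    binary_rules = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
    lexical_rules = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}

    def cky_recognize(words):
        """Return True iff the sentence is in L(G) (recognition)."""
        n = len(words)
        # chart[i][j] = set of nonterminals deriving words[i:j]
        chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            chart[i][i + 1] = set(lexical_rules.get(w, ()))
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for B, C in product(chart[i][k], chart[k][j]):
                        chart[i][j] |= binary_rules.get((B, C), set())
        return "S" in chart[0][n]

    print(cky_recognize("the dog chased the cat".split()))  # True: x is in L(G)
    print(cky_recognize("dog the chased".split()))          # False: x is not in L(G)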

Text Parsing

Properties of text parsing:
- Not a well-defined abstract problem (the text language is not a formal language).
- Parsing does not imply recognition (recognition presupposes a formal language).
- An empirical approximation problem.
- Correctness can only be established with reference to empirical samples of the text language (statistical inference).

Two Methods for Text Parsing

- Grammar-driven text parsing: text parsing approximated by grammar parsing.
- Data-driven text parsing: text parsing approximated by statistical inference.
- The methods are not mutually exclusive: grammars can be combined with statistical inference (e.g. PCFGs, as illustrated below).
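
As a concrete instance of combining a grammar with statistical inference, the sketch below defines a small PCFG and finds the most probable parse with NLTK's ViterbiParser. This is a minimal sketch, assuming NLTK is installed; the grammar and all rule probabilities are invented for illustration.

    import nltk

    # Toy PCFG (invented probabilities); each nonterminal's rules sum to 1.
    grammar = nltk.PCFG.fromstring("""
    S -> NP VP [1.0]
    NP -> Det N [0.7] | NP PP [0.3]
    VP -> V NP [0.6] | VP PP [0.4]
    PP -> P NP [1.0]
    Det -> 'the' [1.0]
    N -> 'dog' [0.5] | 'telescope' [0.5]
    V -> 'saw' [1.0]
    P -> 'with' [1.0]
    """)

    parser = nltk.ViterbiParser(grammar)
    sent = "the dog saw the dog with the telescope".split()
    for tree in parser.parse(sent):   # yields the single most probable tree
        print(tree)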

Grammar-Driven Text Parsing

Basic assumption: the text language L can be approximated by L(G).

Potential problems (evaluation criteria):
- Robustness
- Disambiguation
- Accuracy
- Efficiency

Robustness

Basic issue: what happens if x ∉ L(G)?

Two cases:
- x ∉ L(G), x ∈ L (coverage)
- x ∉ L(G), x ∉ L (robustness)

Techniques:
- Constraint relaxation
- Partial parsing (see the sketch below)
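
A minimal sketch of the partial-parsing idea, reusing a CKY-style chart: if no complete analysis spans the sentence, return the longest non-overlapping constituents found in the chart instead of failing. The toy grammar and the greedy fallback strategy are simplified inventions for illustration.

    # Sketch: partial parsing as a robustness fallback (toy CNF grammar, invented).
    binary_rules = {("Det", "N"): "NP", ("V", "NP"): "VP", ("NP", "VP"): "S"}
    lexical_rules = {"the": "Det", "dog": "N", "cat": "N", "barks": "V"}

    def chart_parse(words):
        n = len(words)
        chart = {}  # (i, j) -> nonterminal covering words[i:j]
        for i, w in enumerate(words):
            if w in lexical_rules:
                chart[(i, i + 1)] = lexical_rules[w]
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                for k in range(i + 1, i + span):
                    pair = (chart.get((i, k)), chart.get((k, i + span)))
                    if pair in binary_rules:
                        chart[(i, i + span)] = binary_rules[pair]
        return chart

    def parse_or_chunks(words):
        chart = chart_parse(words)
        n = len(words)
        if chart.get((0, n)) == "S":
            return [("S", words)]      # complete analysis: x is in L(G)
        # Fallback: greedily keep the longest constituents, left to right.
        chunks, i = [], 0
        while i < n:
            j = max((j for j in range(n, i, -1) if (i, j) in chart), default=i + 1)
            chunks.append((chart.get((i, j), "UNK"), words[i:j]))
            i = j
        return chunks

    print(parse_or_chunks("the dog the cat barks".split()))
    # [('NP', ['the', 'dog']), ('NP', ['the', 'cat']), ('V', ['barks'])]

Unknown words fall through as "UNK" chunks, so the parser never rejects its input outright.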

Disambiguation

Basic issue: what happens when G assigns more than one analysis y to a sentence x?

Two cases:
- String ambiguity (real): the disambiguation problem.
- Grammar ambiguity (spurious): the leakage problem.

Techniques:
- Grammar specialization
- Deterministic parsing
- Eliminative parsing
- Data-driven parsing (e.g. PCFG), as in the sketch below
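
A minimal sketch of data-driven disambiguation: score each candidate analysis by the product of the probabilities of the rules in its derivation and keep the highest-scoring tree. The rule probabilities and the two candidate analyses (a classic PP-attachment ambiguity) are invented for illustration.

    import math

    # Invented PCFG rule probabilities; each LHS's probabilities sum to 1.
    rule_prob = {
        ("VP", ("V", "NP")): 0.6, ("VP", ("VP", "PP")): 0.4,
        ("NP", ("Det", "N")): 0.7, ("NP", ("NP", "PP")): 0.3,
        ("PP", ("P", "NP")): 1.0,
    }

    def log_score(tree):
        """tree = (label, [children]); a bare string is a preterminal leaf."""
        if isinstance(tree, str):
            return 0.0  # lexical probabilities are ignored in this sketch
        label, children = tree
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        return math.log(rule_prob[(label, rhs)]) + sum(log_score(c) for c in children)

    np = ("NP", ["Det", "N"])
    pp = ("PP", ["P", np])
    # Two analyses of "saw the man with the telescope":
    high = ("VP", [("VP", ["V", np]), pp])    # PP attaches to the VP
    low = ("VP", ["V", ("NP", [np, pp])])     # PP attaches to the NP
    best = max([high, low], key=log_score)
    winner = "high attachment" if best is high else "low attachment"
    print(winner, round(math.exp(log_score(high)), 4), round(math.exp(log_score(low)), 4))
    # high attachment 0.1176 0.0882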

Accuracy

Basic issue: how often can the parser deliver a single correct analysis?

Grammar-driven techniques:
- Do they yield linguistically adequate analyses?
- Adequacy is undermined by the techniques used to handle robustness and disambiguation.

Efficiency

Theoretical complexity:
- Many linguistically motivated formalisms have intractable parsing problems.
- Even polynomially parsable formalisms often have high complexity (e.g. CKY parsing of a CFG runs in O(|G| · n^3) for a sentence of length n).

Practical efficiency is also affected by:
- Grammar constants
- Techniques for handling robustness and disambiguation

Data-Driven Text Parsing

Basic assumption: the text language L can be approximated by statistical inference from text samples.

Components:
- A formal model M defining permissible representations for sentences in L.
- A sample of text Tt = (x1, …, xn) from L, with or without the correct analyses At = (y1, …, yn).
- An inductive inference scheme I defining actual analyses for the sentences of any text T = (x1, …, xn) in L, relative to M and Tt (and possibly At); a sketch of one such scheme follows.
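
As one concrete inductive inference scheme, the sketch below instantiates a PCFG from a tiny annotated sample At by relative-frequency (maximum likelihood) estimation: P(A -> β) = count(A -> β) / count(A). The two-tree "treebank" is invented for illustration.

    from collections import Counter

    # A tiny invented treebank: trees as (label, [children]); strings are words.
    treebank = [
        ("S", [("NP", [("Det", ["the"]), ("N", ["dog"])]),
               ("VP", [("V", ["barks"])])]),
        ("S", [("NP", [("Det", ["the"]), ("N", ["cat"])]),
               ("VP", [("V", ["saw"]), ("NP", [("Det", ["a"]), ("N", ["dog"])])])]),
    ]

    def rules(tree):
        """Yield (lhs, rhs) for every production used in the tree."""
        label, children = tree
        yield (label, tuple(c if isinstance(c, str) else c[0] for c in children))
        for c in children:
            if not isinstance(c, str):
                yield from rules(c)

    rule_counts = Counter(r for t in treebank for r in rules(t))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n

    rule_prob = {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}
    for rule, p in sorted(rule_prob.items()):
        print(rule, round(p, 3))   # e.g. ('VP', ('V',)) 0.5 and ('VP', ('V', 'NP')) 0.5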

Robustness

Basic issue: is M a grammar or not (cf. PCFG)?

Radical constraint relaxation: ensure that every string has at least one analysis.

Example (DOP3):
- M permits any parse tree composed from subtrees in Tt, with free insertion of (even unseen) words from x.
- Tt is annotated with context-free parse trees.
- I defines the probability P(x, y) as the sum of the probabilities of all derivations of y for x (for any x, y).
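
The sketch below illustrates a weaker but common version of this idea, not DOP3 itself: smooth the lexical model so that unseen words receive a small probability for every preterminal, which guarantees that no sentence is rejected outright. The tag set and constants are invented.

    # Sketch: open-class smoothing so every word, seen or not, gets some tag.
    lexicon = {"dog": {"N": 0.9, "V": 0.1}, "barks": {"V": 1.0}}
    tags = {"N", "V", "Det", "Adj"}
    UNK_PROB = 1e-4  # invented floor probability for unseen (word, tag) pairs

    def tag_probs(word):
        """P(tag -> word), smoothed: an unseen word is compatible with any tag."""
        seen = lexicon.get(word)
        if seen:
            return seen
        return {t: UNK_PROB for t in tags}  # never reject: every tag is possible

    print(tag_probs("dog"))      # uses observed probabilities
    print(tag_probs("flibber"))  # unseen word remains analyzable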

Disambiguation

Basic issue: how to rank the different analyses yi of x?

Structure of I:
- A parameterized stochastic model M, assigning a score S(x, yi) to each permissible analysis yi of x, relative to a set of parameters θ.
- A parsing method, i.e. a method for computing the best yi according to S(x, yi) (given θ).
- A learning method, i.e. a method for instantiating θ based on inductive inference from Tt.

Example: PCFG (see the sketch below).
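
For a PCFG, the three components line up as: M is the CFG backbone with rule probabilities θ, the learning method is relative-frequency estimation (sketched earlier), and the parsing method is probabilistic (Viterbi) CKY, a minimal version of which follows. The CNF grammar and probabilities are invented for illustration.

    import math

    # Invented CNF PCFG: (lhs, rhs) -> probability.
    gram = {
        ("S", ("NP", "VP")): 1.0,
        ("NP", ("Det", "N")): 1.0,
        ("VP", ("V", "NP")): 1.0,
        ("Det", ("the",)): 1.0,
        ("N", ("dog",)): 0.5, ("N", ("cat",)): 0.5,
        ("V", ("chased",)): 1.0,
    }

    def viterbi_cky(words):
        """Return (log probability, tree) of the best S analysis, or None."""
        n = len(words)
        best = {}  # (i, j, A) -> (logprob, tree)
        for i, w in enumerate(words):
            for (A, rhs), p in gram.items():
                if rhs == (w,):
                    best[(i, i + 1, A)] = (math.log(p), (A, [w]))
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for (A, rhs), p in gram.items():
                        if len(rhs) == 2:
                            B, C = rhs
                            if (i, k, B) in best and (k, j, C) in best:
                                lp = math.log(p) + best[(i, k, B)][0] + best[(k, j, C)][0]
                                if lp > best.get((i, j, A), (float("-inf"),))[0]:
                                    subtree = (A, [best[(i, k, B)][1], best[(k, j, C)][1]])
                                    best[(i, j, A)] = (lp, subtree)
        return best.get((0, n, "S"))

    print(viterbi_cky("the dog chased the cat".split()))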

Accuracy

Basic issue: how often can the parser deliver a single correct analysis?

Data-driven techniques:
- Do they yield an empirically adequate ranking of the alternatives?
- Accuracy is undermined by the combinatorial explosion due to radical constraint relaxation.

Efficiency

Theoretical complexity:
- Many data-driven models have intractable inference problems.
- Even polynomially parsable models often have high complexity.

Practical efficiency is also affected by:
- Model constants
- Techniques for handling robustness and disambiguation

Converging Approaches?

Text parsing is a complex optimization problem, with two optimization strategies:
- Start with good accuracy; improve robustness and disambiguation (while controlling efficiency).
- Start with good disambiguation (and robustness); improve accuracy (while controlling efficiency).

Are the strategies converging on the same solution?
- Constraint relaxation for robustness
- Data-driven models for disambiguation
- Heuristic search techniques for efficiency (e.g. beam pruning, as sketched below)
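
As an example of the heuristic search techniques mentioned above, here is a minimal sketch of beam pruning applied to one chart cell: keep only the k highest-scoring items per span, trading exactness of the search for speed. The cell contents and scores are invented.

    import heapq

    def prune_cell(items, beam_size=3):
        """Keep only the beam_size highest-scoring items in one chart cell.

        items: dict mapping a nonterminal label to its log score.
        Pruning makes chart parsing approximate but much faster in practice.
        """
        kept = heapq.nlargest(beam_size, items.items(), key=lambda kv: kv[1])
        return dict(kept)

    # Invented log scores for one span's chart cell:
    cell = {"NP": -1.2, "VP": -3.5, "S": -2.0, "PP": -6.1, "ADJP": -7.4}
    print(prune_cell(cell))   # {'NP': -1.2, 'S': -2.0, 'VP': -3.5}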