Features and Unification


Features and Unification
Grammatical categories (e.g. Non3sgAux, 3sgNP) and grammar rules (S -> NP VP) can be thought of as objects that have a complex set of properties associated with them. These properties are represented as constraints (hence "constraint-based formalisms"). Such formalisms give an efficient way to represent language phenomena, such as agreement and subcategorization, that CFGs cannot handle efficiently.

Features and Unification
For example, an NP may have a property NUMBER and a VP may have a similar property; agreement is then implemented by comparing these two properties. In that case the grammar rule S -> NP VP is extended with the constraint: only if the NUMBER of the NP is equal to the NUMBER of the VP. Feature structures and unification are the formalization of such properties (like NUMBER) and constraints.

Feature Structures
A feature structure (FS) is a method for encoding grammatical properties. It is simply a set of feature-value pairs, where features are unanalyzable atomic symbols and values are either atomic symbols or feature structures themselves. FSs are usually written as an attribute-value matrix (AVM):
  [ FEATURE_1  VALUE_1
    FEATURE_2  VALUE_2
    ...
    FEATURE_N  VALUE_N ]

Feature Structures
Feature structures for the categories NP3Sg and NP3Pl:
  [CAT NP, NUMBER SG, PERSON 3]
  [CAT NP, NUMBER PL, PERSON 3]
Some features can stay common to both (e.g. CAT and PERSON), while distinctions are made by changing others (e.g. NUMBER).

Feature Structures
The value of a feature may itself be another feature structure:
  [CAT NP, AGREEMENT [NUMBER SG, PERSON 3]]
With such a grouping we can test the equality of NUMBER and PERSON together by testing the equality of the single AGREEMENT feature.

Feature Structures
FSs can also be represented as graphs. A feature path is a list of features through an FS leading to a particular value, e.g. in the structure above the path <AGREEMENT PERSON> leads to the value 3.
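
Representing such a structure as nested Python dicts (one illustrative encoding among many), following a feature path is just a walk through the nesting:

fs = {"CAT": "NP", "AGREEMENT": {"NUMBER": "SG", "PERSON": 3}}

def follow(fs, path):
    """Follow a feature path such as ('AGREEMENT', 'PERSON') down to its value."""
    for feature in path:
        fs = fs[feature]
    return fs

print(follow(fs, ("AGREEMENT", "PERSON")))   # 3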

Reentrant Feature Structures
It is also possible for two features to share the same FS as their value. Such FSs are called reentrant structures. The features actually share the same FS as value (not just equal copies of it):
  [CAT S,
   HEAD [AGREEMENT (1) [NUMBER SG, PERSON 3],
         SUBJECT [AGREEMENT (1)]]]

Reentrant Feature Structures

Unification of Feature Structures
Unification is an operation that:
  - merges the information of two structures, and
  - rejects the merging of incompatible structures.
Simple unification (⊔ denotes unification, [ ] an unspecified value):
  [NUMBER SG] ⊔ [NUMBER SG] = [NUMBER SG]
  [NUMBER SG] ⊔ [NUMBER PL]   fails!
  [NUMBER SG] ⊔ [NUMBER [ ]] = [NUMBER SG]
  [NUMBER SG] ⊔ [PERSON 3]   = [NUMBER SG, PERSON 3]
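
If NLTK is available, these simple unifications can be reproduced with its FeatStruct class; unify() returns the merged structure, or None on failure (a sketch, not part of the original slides):

from nltk.featstruct import FeatStruct

a = FeatStruct(NUMBER='sg')
print(a.unify(FeatStruct(NUMBER='sg')))   # the same structure back
print(a.unify(FeatStruct(NUMBER='pl')))   # None -> unification fails
print(a.unify(FeatStruct()))              # an empty FS adds no information
print(a.unify(FeatStruct(PERSON=3)))      # merged structure with NUMBER and PERSON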

Unification of Feature Structures
  [AGREEMENT (1) [NUMBER SG, PERSON 3], SUBJECT [AGREEMENT (1)]]
  ⊔ [SUBJECT [AGREEMENT [NUMBER SG, PERSON 3]]]
  = [AGREEMENT (1) [NUMBER SG, PERSON 3], SUBJECT [AGREEMENT (1)]]

Unification of Feature Structures
  [AGREEMENT (1), SUBJECT [AGREEMENT (1)]]
  ⊔ [SUBJECT [AGREEMENT [NUMBER SG, PERSON 3]]]
  = [AGREEMENT (1) [NUMBER SG, PERSON 3], SUBJECT [AGREEMENT (1)]]
(Because of the reentrancy, filling in SUBJECT's AGREEMENT also fills in the shared top-level AGREEMENT.)

Unification of Feature Structures
  [AGREEMENT [NUMBER SG], SUBJECT [AGREEMENT [NUMBER SG]]]
  ⊔ [SUBJECT [AGREEMENT [NUMBER SG, PERSON 3]]]
  = [AGREEMENT [NUMBER SG], SUBJECT [AGREEMENT [NUMBER SG, PERSON 3]]]
(Without reentrancy, the top-level AGREEMENT is unaffected by the unification inside SUBJECT.)

Unification of Feature Structures
  [AGREEMENT (1) [NUMBER SG, PERSON 3], SUBJECT [AGREEMENT (1)]]
  ⊔ [AGREEMENT [NUMBER SG, PERSON 3], SUBJECT [AGREEMENT [NUMBER PL, PERSON 3]]]
  Failure!
(The shared AGREEMENT value cannot be both SG and PL.)
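
NLTK's FeatStruct can express the same reentrancy: (1) tags a shared value and ->(1) refers back to it (a sketch under the assumption that NLTK is installed):

from nltk.featstruct import FeatStruct

f = FeatStruct("[AGREEMENT=(1)[NUMBER='sg', PERSON=3], SUBJECT=[AGREEMENT->(1)]]")

ok = FeatStruct("[SUBJECT=[AGREEMENT=[NUMBER='sg', PERSON=3]]]")
print(f.unify(ok))     # succeeds; AGREEMENT and SUBJECT AGREEMENT remain shared

bad = FeatStruct("[SUBJECT=[AGREEMENT=[NUMBER='pl', PERSON=3]]]")
print(f.unify(bad))    # None: the shared AGREEMENT cannot be both sg and pl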

Subsumption
Unification is a way of merging the information of two FSs. The unified structure is equally or more specific than (i.e. carries at least as much information as) each of the input FSs. We say that a less specific feature structure subsumes an equally or more specific one (operator ⊑). Formally, a feature structure F subsumes a feature structure G (F ⊑ G) if and only if:
  - for every feature x in F, F(x) ⊑ G(x), and
  - for all paths p and q in F such that F(p) = F(q), it is also the case that G(p) = G(q).
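
A minimal sketch of the first clause of this definition, for non-reentrant structures encoded as nested dicts (the path-equality clause for reentrancy is deliberately omitted):

def subsumes(f, g):
    """True if every feature in f is present in g with a compatible value."""
    if not isinstance(f, dict):
        return f == g                    # atomic values must match exactly
    if not isinstance(g, dict):
        return False
    return all(k in g and subsumes(v, g[k]) for k, v in f.items())

vp = {"CAT": "VP"}
vp_agr = {"CAT": "VP", "AGREEMENT": {"NUMBER": "SG", "PERSON": 3}}
print(subsumes(vp, vp_agr))   # True: the less specific FS subsumes the richer one
print(subsumes(vp_agr, vp))   # False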

Subsumption
Examples (each structure subsumes the next, more specific one):
  [CAT VP]
  ⊑ [CAT VP, AGREEMENT (1)]
  ⊑ [CAT VP, AGREEMENT (1), SUBJECT [AGREEMENT (1)]]
  ⊑ [CAT VP, AGREEMENT (1) [NUMBER SG, PERSON 3], SUBJECT [AGREEMENT (1)]]

Unification
Formally, the unification of F and G is the most general feature structure H such that F ⊑ H and G ⊑ H. The unification operation is monotonic: if a feature structure satisfies some description, unifying it with another FS yields a structure that still satisfies the original description (all of the original information is retained). A direct consequence is that unification is order-independent: regardless of the order in which we unify a set of FSs, the final result is the same.

Feature Structures in the Grammar
FSs and unification provide an elegant way of expressing syntactic constraints. This is done by augmenting CFG rules with FSs for the constituents of the rule and with unification constraints on those constituents.
  Rule:        β0 -> β1 β2 ... βn
  Constraints: <βi feature path> = atomic value
               <βi feature path> = <βj feature path>
e.g.
  S -> NP VP
  <NP NUMBER> = <VP NUMBER>

Agreement
Subject-verb agreement: "This flight serves breakfast."
  S -> NP VP
  <NP AGREEMENT> = <VP AGREEMENT>
"Does this flight serve breakfast?"  "Do these flights serve breakfast?"
  S -> Aux NP VP
  <Aux AGREEMENT> = <NP AGREEMENT>
Determiner-noun agreement: "this flight", "these flights"
  NP -> Det Nominal
  <Det AGREEMENT> = <Nominal AGREEMENT>
  <NP AGREEMENT> = <Nominal AGREEMENT>
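
Assuming NLTK is installed, a toy version of these agreement constraints can be run directly with NLTK's feature-based grammar format; the sketch below uses a single NUM feature in place of the full AGREEMENT bundle, and the grammar fragment is written just for this example:

from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

# Toy feature grammar: NUM stands in for the AGREEMENT feature used above.
grammar = FeatureGrammar.fromstring("""
S -> NP[NUM=?n] VP[NUM=?n]
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
NP -> 'breakfast'
VP[NUM=?n] -> V[NUM=?n] NP
Det[NUM='sg'] -> 'this'
Det[NUM='pl'] -> 'these'
N[NUM='sg'] -> 'flight'
N[NUM='pl'] -> 'flights'
V[NUM='sg'] -> 'serves'
V[NUM='pl'] -> 'serve'
""")

parser = FeatureChartParser(grammar)
print(len(list(parser.parse("this flight serves breakfast".split()))))    # 1 parse
print(len(list(parser.parse("these flights serves breakfast".split()))))  # 0: agreement fails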

Agreement
Lexical entries:
  Aux -> do       <Aux AGREEMENT NUMBER> = PL
                  <Aux AGREEMENT PERSON> = 3
  Aux -> does     <Aux AGREEMENT NUMBER> = SG
  Verb -> serve   <Verb AGREEMENT NUMBER> = PL
  Verb -> serves  <Verb AGREEMENT NUMBER> = SG
                  <Verb AGREEMENT PERSON> = 3

Head Features
Grammatical constituents (NP, VP, ...) have features that are copied from one of their children. The child that provides the features is called the head of the phrase, and the copied features are called head features.
  VP -> Verb NP
  <VP AGREEMENT> = <Verb AGREEMENT>
  NP -> Det Nominal
  <Det AGREEMENT> = <Nominal AGREEMENT>
  <NP AGREEMENT> = <Nominal AGREEMENT>
Alternatively, this can be generalized by adding a HEAD feature:
  <VP HEAD> = <Verb HEAD>
  <Det HEAD> = <Nominal HEAD>
  <NP HEAD> = <Nominal HEAD>

Subcategorization
Subcategorization is the notion that different verbs take different patterns of arguments. By associating each verb with a SUBCAT feature we can model this behaviour.
  Verb -> serves
    <Verb HEAD AGREEMENT NUMBER> = SG
    <Verb HEAD SUBCAT> = TRANS
  VP -> Verb        <VP HEAD> = <Verb HEAD>,  <VP HEAD SUBCAT> = INTRANS
  VP -> Verb NP     <VP HEAD> = <Verb HEAD>,  <VP HEAD SUBCAT> = TRANS
  VP -> Verb NP NP  <VP HEAD> = <Verb HEAD>,  <VP HEAD SUBCAT> = DITRANS

Subcategorization
Another approach is to let each verb explicitly specify its arguments as a list:
  Verb -> serves
    <Verb HEAD AGREEMENT NUMBER> = SG
    <Verb HEAD SUBCAT FIRST CAT> = NP
    <Verb HEAD SUBCAT SECOND> = END
  Verb -> want
    <Verb HEAD SUBCAT FIRST CAT> = VP
    <Verb HEAD SUBCAT FIRST FORM> = INFINITIVE
  VP -> Verb NP
    <VP HEAD> = <Verb HEAD>
    <VP HEAD SUBCAT FIRST CAT> = <NP CAT>
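
An illustrative (hypothetical) encoding of the list-valued SUBCAT approach as plain Python data, with a small check that the observed complements match what the verb expects:

lexicon = {
    "serves": {"HEAD": {"AGREEMENT": {"NUMBER": "SG"},
                        "SUBCAT": [{"CAT": "NP"}]}},
    "want":   {"HEAD": {"SUBCAT": [{"CAT": "VP", "FORM": "INFINITIVE"}]}},
}

def subcat_satisfied(verb, complement_cats):
    """Check that the observed complement categories match the verb's SUBCAT list."""
    expected = lexicon[verb]["HEAD"]["SUBCAT"]
    return len(expected) == len(complement_cats) and all(
        e["CAT"] == c for e, c in zip(expected, complement_cats))

print(subcat_satisfied("serves", ["NP"]))        # True: 'serves' takes a single NP
print(subcat_satisfied("serves", ["NP", "NP"]))  # False: too many complements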

Implementing Unification
The FSs of the input can be represented as directed acyclic graphs (DAGs), where features are labels on directed arcs and feature values are atomic symbols or DAGs. The implementation of unification is then a recursive graph-matching algorithm that loops through the features of one input and tries to find a corresponding feature in the other. If any single feature causes a mismatch, the algorithm fails. The algorithm proceeds recursively so as to handle features whose values are themselves FSs.
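
A minimal sketch of this recursive algorithm over nested dicts (real implementations work on DAGs and preserve reentrant, shared substructure; this simplified version copies instead and ignores reentrancy):

import copy

FAIL = object()   # sentinel distinct from any feature structure

def unify(f, g):
    # An empty structure {} stands for an unspecified value [ ].
    if isinstance(f, dict) and not f:
        return copy.deepcopy(g)
    if isinstance(g, dict) and not g:
        return copy.deepcopy(f)
    if not isinstance(f, dict) or not isinstance(g, dict):
        return f if f == g else FAIL          # atoms unify only if identical
    result = copy.deepcopy(f)
    for feat, g_val in g.items():
        if feat in result:
            sub = unify(result[feat], g_val)  # recurse on shared features
            if sub is FAIL:
                return FAIL
            result[feat] = sub
        else:
            result[feat] = copy.deepcopy(g_val)  # features only in g are copied over
    return result

print(unify({"NUMBER": "SG"}, {"PERSON": 3}))              # {'NUMBER': 'SG', 'PERSON': 3}
print(unify({"NUMBER": "SG"}, {"NUMBER": "PL"}) is FAIL)   # True: incompatible atoms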

Parsing with Unification
Since unification is order-independent, it does not depend on the search strategy used by the parser, so it can be added to any of the parsers we have studied (top-down, bottom-up, Earley). A simple approach is to parse with the bare CFG and, at the end, filter out the parses that contain unification failures. A better approach is to incorporate the unification constraints into the parsing process itself, so that structures violating them are eliminated as soon as they are found.

Unification Parsing
A different approach to parsing with unification is to treat the grammatical category itself as a feature and to implement each context-free rule as unification between CAT features, e.g.:
  X0 -> X1 X2
  <X0 CAT> = S, <X1 CAT> = NP, <X2 CAT> = VP
  <X1 HEAD AGREEMENT> = <X2 HEAD AGREEMENT>
  <X2 HEAD> = <X0 HEAD>
This approach elegantly models rules that generalize across many different grammatical categories, such as coordination:
  X0 -> X1 and X2
  <X1 CAT> = <X2 CAT>
  <X0 CAT> = <X1 CAT>

Probabilistic Grammars
Probabilistic context-free grammars (PCFGs), also called stochastic context-free grammars, are context-free grammars in which each rule is augmented with a conditional probability:
  A -> B   [p]
PCFGs can be used to estimate a number of useful probabilities concerning the parse trees of a sentence. Such probabilities are useful for disambiguating between different parses of the same sentence.

Probabilistic Grammars

Probabilistic Grammars
The probability of a parse of a sentence is the product of the probabilities of all the rules used to expand each node in the parse tree. For the two example parses:
  P(Ta) = .15 × .40 × .05 × .35 × .75 × .40 × .40 × .40 × .30 × .40 × .50 = 1.5 × 10⁻⁶
  P(Tb) = .15 × .40 × .40 × .05 × .05 × .75 × .40 × .40 × .40 × .30 × .40 × .50 = 1.7 × 10⁻⁷
In the same way it is also possible to assign a probability to a substring of a sentence (the probability of the corresponding subtree of the parse tree).
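
A minimal sketch of this computation over a hypothetical mini-grammar (the rules, probabilities, and sentence below are illustrative, not the ones from the example above): a parse tree is a nested tuple (label, children...), and its probability is the product of the probabilities of the rules used to expand it:

from math import prod

rule_prob = {
    ("S", ("NP", "VP")): 0.8,
    ("NP", ("this flight",)): 0.2,
    ("NP", ("breakfast",)): 0.1,
    ("VP", ("V", "NP")): 0.3,
    ("V", ("serves",)): 0.5,
}

def tree_prob(tree):
    """Product of the probabilities of every rule used to expand the tree."""
    if isinstance(tree, str):            # a leaf word contributes no rule
        return 1.0
    label, *children = tree
    expansion = tuple(c if isinstance(c, str) else c[0] for c in children)
    return rule_prob[(label, expansion)] * prod(tree_prob(c) for c in children)

t = ("S", ("NP", "this flight"), ("VP", ("V", "serves"), ("NP", "breakfast")))
print(tree_prob(t))   # 0.8 * 0.2 * 0.3 * 0.5 * 0.1 = 0.0024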

Learning PCFG Probabilities
PCFG probabilities can be learned from a corpus of already-parsed sentences; such a corpus is called a treebank. An example is the Penn Treebank, which contains parsed sentences for 1 million words of the Brown corpus. The probability of a rule is then computed by counting how many times the rule is used, relative to how many times its left-hand side is expanded:
  P(α -> β | α) = Count(α -> β) / Count(α)
There are also algorithms, such as the Inside-Outside algorithm, that estimate these probabilities without a treebank.
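
A minimal sketch of this maximum-likelihood estimate over a tiny, made-up list of rule occurrences (standing in for the rules extracted from a real treebank):

from collections import Counter

treebank_rules = [
    ("S", ("NP", "VP")), ("S", ("NP", "VP")), ("S", ("Aux", "NP", "VP")),
    ("NP", ("Det", "N")), ("NP", ("Det", "N")), ("NP", ("Pronoun",)),
]

rule_counts = Counter(treebank_rules)
lhs_counts = Counter(lhs for lhs, _ in treebank_rules)

# P(alpha -> beta | alpha) = Count(alpha -> beta) / Count(alpha)
probs = {rule: count / lhs_counts[rule[0]] for rule, count in rule_counts.items()}
print(probs[("S", ("NP", "VP"))])   # 2/3: "S -> NP VP" used twice out of three S expansions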

Dependency Grammars
Dependency grammars are a different grammatical formalism, based not on the notion of constituents but on lexical dependencies between words. The syntactic structure of a sentence is described purely in terms of words and binary semantic or syntactic relations between those words. Dependency grammars are very useful for languages with free word order, where word order is far more flexible than in English (e.g. Greek, Czech); for such languages a CFG would require a different set of rules to deal with each different word order.

Some common dependency relations:
  Dependency  Description
  subj        syntactic subject
  obj         direct object
  dat         indirect object
  pcomp       complement of a preposition
  comp        predicate nominals
  tmp         temporal adverbial
  loc         location adverbial
  attr        premodifying (attributive) nominals
  mod         nominal postmodifiers (prepositional phrases)
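
A minimal sketch (with an illustrative sentence and attachments, not taken from the slides) of a dependency analysis as (head, relation, dependent) triples, using the relation labels above:

deps = [
    ("serves", "subj", "flight"),     # 'flight' is the syntactic subject of 'serves'
    ("serves", "obj", "breakfast"),   # 'breakfast' is its direct object
]

# All dependents of a given head word:
print([(rel, dep) for head, rel, dep in deps if head == "serves"])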