Unsupervised Semantic Parsing

Slides:



Advertisements
Similar presentations
Numbers Treasure Hunt Following each question, click on the answer. If correct, the next page will load with a graphic first – these can be used to check.
Advertisements

1 A B C
Scenario: EOT/EOT-R/COT Resident admitted March 10th Admitted for PT and OT following knee replacement for patient with CHF, COPD, shortness of breath.
Simplifications of Context-Free Grammars
Variations of the Turing Machine
1 DSLs with Groovy Saager Mhatre. 2 github.com/dexterous code.google.com/u/saager.mhatre
Online Max-Margin Weight Learning with Markov Logic Networks Tuyen N. Huynh and Raymond J. Mooney Machine Learning Group Department of Computer Science.
AP STUDY SESSION 2.
1
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
1 Unsupervised Ontology Induction From Text Hoifung Poon Dept. Computer Science & Eng. University of Washington (Joint work with Pedro Domingos)
Copyright © 2013 Elsevier Inc. All rights reserved.
STATISTICS POINT ESTIMATION Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University.
David Burdett May 11, 2004 Package Binding for WS CDL.
Introduction to Algorithms 6.046J/18.401J
1 RA I Sub-Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Casablanca, Morocco, 20 – 22 December 2005 Status of observing programmes in RA I.
Local Customization Chapter 2. Local Customization 2-2 Objectives Customization Considerations Types of Data Elements Location for Locally Defined Data.
CALENDAR.
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt BlendsDigraphsShort.
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt Wants.
Programming Language Concepts
Chapter 7 Sampling and Sampling Distributions
1 Click here to End Presentation Software: Installation and Updates Internet Download CD release NACIS Updates.
Knowledge Extraction from Technical Documents Knowledge Extraction from Technical Documents *With first class-support for Feature Modeling Rehan Rauf,
Biostatistics Unit 5 Samples Needs to be completed. 12/24/13.
Break Time Remaining 10:00.
Turing Machines.
Table 12.1: Cash Flows to a Cash and Carry Trading Strategy.
Database Performance Tuning and Query Optimization
PP Test Review Sections 6-1 to 6-6
1 Atomic Routing Games on Maximum Congestion Costas Busch Department of Computer Science Louisiana State University Collaborators: Rajgopal Kannan, LSU.
EIS Bridge Tool and Staging Tables September 1, 2009 Instructor: Way Poteat Slide: 1.
Outline Minimum Spanning Tree Maximal Flow Algorithm LP formulation 1.
CS 6143 COMPUTER ARCHITECTURE II SPRING 2014 ACM Principles and Practice of Parallel Programming, PPoPP, 2006 Panel Presentations Parallel Processing is.
Operating Systems Operating Systems - Winter 2010 Chapter 3 – Input/Output Vrije Universiteit Amsterdam.
Exarte Bezoek aan de Mediacampus Bachelor in de grafische en digitale media April 2014.
TESOL International Convention Presentation- ESL Instruction: Developing Your Skills to Become a Master Conductor by Beth Clifton Crumpler by.
Computer vision: models, learning and inference
Copyright © 2013, 2009, 2006 Pearson Education, Inc. 1 Section 5.5 Dividing Polynomials Copyright © 2013, 2009, 2006 Pearson Education, Inc. 1.
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
1 RA III - Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Buenos Aires, Argentina, 25 – 27 October 2006 Status of observing programmes in RA.
Adding Up In Chunks.
MaK_Full ahead loaded 1 Alarm Page Directory (F11)
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt Synthetic.
Artificial Intelligence
1 Using Bayesian Network for combining classifiers Leonardo Nogueira Matos Departamento de Computação Universidade Federal de Sergipe.
: 3 00.
5 minutes.
1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)
1 hi at no doifpi me be go we of at be do go hi if me no of pi we Inorder Traversal Inorder traversal. n Visit the left subtree. n Visit the node. n Visit.
1 Let’s Recapitulate. 2 Regular Languages DFAs NFAs Regular Expressions Regular Grammars.
Types of selection structures
Speak Up for Safety Dr. Susan Strauss Harassment & Bullying Consultant November 9, 2012.
1 Titre de la diapositive SDMO Industries – Training Département MICS KERYS 09- MICS KERYS – WEBSITE.
Essential Cell Biology
Converting a Fraction to %
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Clock will move after 1 minute
PSSA Preparation.
Immunobiology: The Immune System in Health & Disease Sixth Edition
Physics for Scientists & Engineers, 3rd Edition
Energy Generation in Mitochondria and Chlorplasts
Select a time to count down from the clock above
1.step PMIT start + initial project data input Concept Concept.
9. Two Functions of Two Random Variables
1 Decidability continued…. 2 Theorem: For a recursively enumerable language it is undecidable to determine whether is finite Proof: We will reduce the.
1 Unsupervised Semantic Parsing Hoifung Poon and Pedro Domingos EMNLP 2009 Best Paper Award Speaker: Hao Xiong.
Markov Logic and Deep Networks Pedro Domingos Dept. of Computer Science & Eng. University of Washington.
Presentation transcript:

Unsupervised Semantic Parsing Hoifung Poon Dept. Computer Science & Eng. University of Washington (Joint work with Pedro Domingos)

Outline Motivation Unsupervised semantic parsing Learning and inference Experimental results Conclusion

Semantic Parsing Natural language text  Formal and detailed meaning representation (MR) Also called logical form Standard MR language: First-order logic E.g., Microsoft buys Powerset.

Semantic Parsing Natural language text  Formal and detailed meaning representation (MR) Also called logical form Standard MR language: First-order logic E.g., Microsoft buys Powerset. BUYS(MICROSOFT,POWERSET)

Shallow Semantic Processing Semantic role labeling Given a relation, identify arguments E.g., agent, theme, instrument Information extraction Identify fillers for a fixed relational template E.g., seminar (speaker, location, time) In contrast, semantic parsing is Formal: Supports reasoning and decision making Detailed: Obtains far more information

Applications Natural language interfaces Knowledge extraction from Wikipedia: 2 million articles PubMed: 18 million biomedical abstracts Web: Unlimited amount of information Machine reading: Learning by reading Question answering Help solve AI

Traditional Approaches Manually construct a grammar Challenge: Same meaning can be expressed in many different ways Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft’s purchase of Powerset, … …… Manual encoding of variations?

Supervised Learning User provides: Target predicates and objects Example sentences with meaning annotation System learns grammar and produces parser Examples: Zelle & Mooney [1993] Zettlemoyer & Collins [2005, 2007, 2009] Wong & Mooney [2007] Lu et al. [2008] Ge & Mooney [2009]

Limitations of Supervised Approaches Applicable to restricted domains only For general text Not clear what predicates and objects to use Hard to produce consistent meaning annotation Crucial to develop unsupervised methods Also, often learn both syntax and semantics Fail to leverage advanced syntactic parsers Make semantic parsing harder

Unsupervised Approaches For shallow semantic tasks, e.g.: Open IE: TextRunner [Banko et al. 2007] Paraphrases: DIRT [Lin & Pantel 2001] Semantic networks: SNE [Kok & Domingos 2008] Show promise of unsupervised methods But … none for semantic parsing

correct answers as second best This Talk: USP First unsupervised approach for semantic parsing Based on Markov Logic [Richardson & Domingos, 2006] Sole input is dependency trees Can be used in general domains Applied it to extract knowledge from biomedical abstracts and answer questions Substantially outperforms TextRunner, DIRT Three times as many correct answers as second best Triple the number of correct answers compared to second best

Outline Motivation Unsupervised semantic parsing Learning and inference Experimental results Conclusion

USP: Key Idea # 1 Target predicates and objects can be learned Viewed as clusters of syntactic or lexical variations of the same meaning BUYS(-,-)  buys, acquires, ’s purchase of, …  Cluster of various expressions for acquisition MICROSOFT  Microsoft, the Redmond software giant, …  Cluster of various mentions of Microsoft

USP: Key Idea # 2 Relational clustering  Cluster relations with same objects USP  Recursively cluster arbitrary expressions with similar subexpressions Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft’s purchase of Powerset, …

USP: Key Idea # 2 Cluster same forms at the atom level Relational clustering  Cluster relations with same objects USP  Recursively cluster expressions with similar subexpressions Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft’s purchase of Powerset, … Cluster same forms at the atom level

Cluster forms in composition with same forms USP: Key Idea # 2 Relational clustering  Cluster relations with same objects USP  Recursively cluster expressions with similar subexpressions Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft’s purchase of Powerset, … Cluster forms in composition with same forms

Cluster forms in composition with same forms USP: Key Idea # 2 Relational clustering  Cluster relations with same objects USP  Recursively cluster expressions with similar subexpressions Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft’s purchase of Powerset, … Cluster forms in composition with same forms

Cluster forms in composition with same forms USP: Key Idea # 2 Relational clustering  Cluster relations with same objects USP  Recursively cluster expressions with similar subexpressions Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft’s purchase of Powerset, … Cluster forms in composition with same forms

USP: Key Idea # 3 Start directly from syntactic analyses Focus on translating them to semantics Leverage rapid progress in syntactic parsing Much easier than learning both

USP: System Overview Input: Dependency trees for sentences Converts dependency trees into quasi-logical forms (QLFs) QLF subformulas have natural lambda forms Starts with lambda-form clusters at atom level Recursively builds up clusters of larger forms Output: Probability distribution over lambda-form clusters and their composition MAP semantic parses of sentences TODO: add pointers: guided by a joint model; cluster ~ target relation/obj

Probabilistic Model for USP Joint probability distribution over a set of QLFs and their semantic parses Use Markov logic A Markov Logic Network (MLN) is a set of pairs (Fi, wi) where Fi is a formula in first-order logic wi is a real number Number of true groundings of Fi ML is one of the most powerful langauge for combining rel inf w. uncertainty reasoning. A MLN consists of FO formulas w. weight. From the logical AI point of view, ML allows us to soften constraints so that a world that violates them becomes less likely, but not impossible. From the point of view of prob inf, ML together w. constants form a MN, which defines a prb dist as this log-linear mdl. For each gnd clause, we have a feature fi w. the corresponding wt wi.

Generating Quasi-Logical Forms buys nsubj dobj Microsoft Powerset Convert each node into an unary atom

Generating Quasi-Logical Forms buys(n1) nsubj dobj Microsoft(n2) Powerset(n3) n1, n2, n3 are Skolem constants

Generating Quasi-Logical Forms buys(n1) nsubj dobj Microsoft(n2) Powerset(n3) Convert each edge into a binary atom

Generating Quasi-Logical Forms buys(n1) nsubj(n1,n2) dobj(n1,n3) Microsoft(n2) Powerset(n3) Convert each edge into a binary atom

Partition QLF into subformulas A Semantic Parse buys(n1) nsubj(n1,n2) dobj(n1,n3) Microsoft(n2) Powerset(n3) Subformulas can be anything Partition QLF into subformulas

Subformula  Lambda form: A Semantic Parse buys(n1) nsubj(n1,n2) dobj(n1,n3) Microsoft(n2) Powerset(n3) Subformula  Lambda form: Replace Skolem constant not in unary atom with a unique lambda variable

Subformula  Lambda form: A Semantic Parse buys(n1) λx2.nsubj(n1,x2) λx3.dobj(n1,x3) Microsoft(n2) Powerset(n3) Lambda form is a notational extension of FOL and the standard way of representing meaning Subformula  Lambda form: Replace Skolem constant not in unary atom with a unique lambda variable

Follow Davidsonian Semantics A Semantic Parse Core form buys(n1) Argument form Argument form λx2.nsubj(n1,x2) λx3.dobj(n1,x3) Microsoft(n2) Powerset(n3) Follow Davidsonian Semantics Core form: No lambda variable Argument form: One lambda variable

Assign subformula to lambda-form cluster A Semantic Parse buys(n1)  CBUYS λx2.nsubj(n1,x2) λx3.dobj(n1,x3)  CMICROSOFT Microsoft(n2)  CPOWERSET Powerset(n3) Assign subformula to lambda-form cluster

Learn weights for each pair of Distribution over core forms Lambda-Form Cluster buys(n1) 0.1 One formula in MLN Learn weights for each pair of cluster and core form CBUYS acquires(n1) 0.2 …… Distribution over core forms

May contain variable number of Lambda-Form Cluster ABUYER buys(n1) 0.1 CBUYS acquires(n1) 0.2 ABOUGHT …… APRICE …… May contain variable number of argument types

argument forms, clusters, and number Argument Type: ABUYER 0.5 CMICROSOFT 0.2 None 0.1 λx2.nsubj(n1,x2) Three MLN formulas 0.4 CGOOGLE 0.1 One 0.8 λx2.agent(n1,x2) …… …… …… Distributions over argument forms, clusters, and number

USP MLN Four simple formulas Exponential prior on number of parameters

Final logical form is obtained Abstract Lambda Form buys(n1) λx2.nsubj(n1,x2) λx3.dobj(n1,x3) Final logical form is obtained via lambda reduction CBUYS(n1) λx2.ABUYER(n1,x2) λx3.ABOUGHT(n1,x3)

Outline Motivation Unsupervised semantic parsing Learning and inference Experimental results Conclusion

Learning Observed: Q (QLFs) Hidden: S (semantic parses) Maximizes log-likelihood of observing the QLFs

Use Greedy Search Search for Θ, S to maximize PΘ(Q, S) Same objective as hard EM Directly optimize it rather than lower bound For fixed S, derive optimal Θ in closed form Guaranteed to find a local optimum

Search Operators MERGE(C1, C2): Merge clusters C1, C2 E.g.: buys, acquires  buys, acquires COMPOSE(C1, C2): Create a new cluster resulting from composing lambda forms in C1, C2 E.g.: Microsoft, Corporation  Microsoft Corporation

USP-Learn Initialization: Partition  Atoms Greedy step: Evaluate search operations and execute the one with highest gain in log-likelihood Efficient implementation: Inverted index, etc.

MAP Semantic Parse Goal: Given QLF Q and learned Θ, find semantic parse S to maximize PΘ(Q, S) Again, use greedy search

Outline Motivation Unsupervised semantic parsing Learning and inference Experimental results Conclusion

Task No predefined gold logical forms Evaluate on an end task: Question answering Applied USP to extract knowledge from text and answer questions Evaluation: Number of answers and accuracy

Dataset GENIA dataset: 1999 Pubmed abstracts Questions Use simple questions in this paper, e.g.: What does anti-STAT1 inhibit? What regulates MIP-1 alpha? Sample 2000 questions according to frequency

Systems Closest match in aim and capability: TextRunner [Banko et al. 2007] Also compared with: Baseline by keyword matching and syntax RESOLVER [Yates and Etzioni 2009] DIRT [Lin and Pantel 2001]

Total Number of Answers KW-SYN TextRunner RESOLVER DIRT USP

Number of Correct Answers KW-SYN TextRunner RESOLVER DIRT USP

Number of Correct Answers Three times as many correct answers as second best KW-SYN TextRunner RESOLVER DIRT USP

Number of Correct Answers Highest accuracy: 88% KW-SYN TextRunner RESOLVER DIRT USP

Qualitative Analysis USP resolves many nontrivial variations Argument forms that mean the same, e.g., expression of X  X expression X stimulates Y  Y is stimulated with X Active vs. passive voices Synonymous expressions Etc.

Clusters And Compositions Clusters in core forms  investigate, examine, evaluate, analyze, study, assay   diminish, reduce, decrease, attenuate   synthesis, production, secretion, release   dramatically, substantially, significantly  …… Compositions amino acid, t cell, immune response, transcription factor, initiation site, binding site …

Question-Answer: Example Q: What does IL-13 enhance? A: The 12-lipoxygenase activity of murine macrophages Sentence: The data presented here indicate that (1) the 12-lipoxygenase activity of murine macrophages is upregulated in vitro and in vivo by IL-4 and/or IL-13, (2) this upregulation requires expression of the transcription factor STAT6, and (3) the constitutive expression of the enzyme appears to be STAT6 independent.

Future Work Learn subsumption hierarchy over meanings Incorporate more NLP into USP Scale up learning and inference Apply to larger corpora (e.g., entire PubMed) Currently, IsPart hierarchy of clusters, also need an Isa hierarchy

Conclusion USP: The first approach for unsupervised semantic parsing Based on Markov Logic Learn target logical forms by recursively clustering variations of same meaning Novel form of relational clustering Applicable to general domains Substantially outperforms shallow methods