807 - TEXT ANALYTICS Massimo Poesio Lecture 8: Relation extraction.


OTHER ASPECTS OF SEMANTIC INTERPRETATION
Identification of RELATIONS between the entities mentioned
  – Focus of interest in modern CL since 1993 or so
Identification of TEMPORAL RELATIONS
  – From about 2003 on
QUALIFICATION of such relations (modality, epistemicity)
  – From about 2010 on

TYPES OF RELATIONS
Predicate-argument structure (verbs and nouns)
  – John kicked the ball
Nominal relations
  – The red ball
Relations between events / temporal relations
  – John kicked the ball and scored a goal
Domain-dependent relations (MUC/ACE)
  – John works for IBM

PREDICATE/ARGUMENT STRUCTURE
Powell met Zhu Rongji
Powell met with Zhu Rongji
Powell and Zhu Rongji met
Powell and Zhu Rongji had a meeting
...
Proposition: meet(Powell, Zhu Rongji)
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
  meet(Powell, Zhu)
  discuss([Powell, Zhu], return(X, plane))
Verbs sharing the pattern meet(Somebody1, Somebody2): debate, consult, join, wrestle, battle

PREDICATE-ARGUMENT STRUCTURE
Linguistic theories and the resources derived from them
  – Case Frames (Fillmore) → FrameNet
  – Lexical Conceptual Structure (Jackendoff) → LCS
  – Proto-Roles (Dowty) → PropBank
  – English verb classes, diathesis alternations (Levin) → VerbNet
  – Talmy, Levin and Rappaport

Fillmore’s Case Theory
Sentences have a DEEP STRUCTURE with CASE RELATIONS
A sentence is a verb + one or more NPs
  – Each NP has a deep-structure case: A(gentive), I(nstrumental), D(ative), F(actitive), L(ocative), O(bjective)
  – Subject is no more important than Object: Subject/Object are surface structure

THEMATIC ROLES
Following on from Fillmore’s original work, many theories of predicate-argument structure / thematic roles were proposed, of which the best known are perhaps
  – Jackendoff’s LEXICAL CONCEPTUAL STRUCTURE
  – Dowty’s PROTO-ROLES theory

Dowty’s PROTO-ROLES
Event-dependent
Prototypes based on shared entailments
Grammatical relations such as subject related to observed (empirical) classification of participants
Typology of grammatical relations
Proto-Agent
Proto-Patient

Proto-Agent Properties
  – Volitional involvement in event or state
  – Sentience (and/or perception)
  – Causing an event or change of state in another participant
  – Movement (relative to position of another participant)
  – (exists independently of event named) *
* may be discourse pragmatic

Proto-Patient Properties
  – Undergoes change of state
  – Incremental theme
  – Causally affected by another participant
  – Stationary relative to movement of another participant
  – (does not exist independently of the event, or at all) *
* may be discourse pragmatic

Semantic role labels: Jan broke the LCD projector.
break(agent(Jan), patient(LCD-projector))
cause(agent(Jan), change-of-state(LCD-projector))
(broken(LCD-projector))
agent(A) -> intentional(A), sentient(A), causer(A), affector(A)
patient(P) -> affected(P), change(P), …
Fillmore, 68; Jackendoff, 72; Dowty, 91

VERBNET AND PROPBANK Dowty’s theory of proto-roles was the basis for the development of PROPBANK, the first corpus annotated with information about predicate-argument structure

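PROPBANK REPRESENTATION
a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.
With the trace of the relative clause made explicit: a GM-Jaguar pact that would give *T*-1 the US car maker an eventual 30% stake in the British company
  Arg0: a GM-Jaguar pact (through the trace *T*-1)
  Arg2: the US car maker
  Arg1: an eventual 30% stake in the British company
give(GM-J pact, US car maker, 30% stake)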

ARGUMENTS IN PROPBANK
Arg0 = agent
Arg1 = direct object / theme / patient
Arg2 = indirect object / benefactive / instrument / attribute / end state
Arg3 = start point / benefactive / instrument / attribute
Arg4 = end point
Per word vs frame level – more general?
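As a rough illustration of the numbered-argument scheme, one annotation can be stored as a predicate plus a mapping from argument labels to text spans. The class and method names below are hypothetical and do not come from any PropBank tooling; the example reuses the GM-Jaguar sentence.

```python
from dataclasses import dataclass, field


@dataclass
class Proposition:
    """A predicate with PropBank-style numbered arguments (hypothetical container)."""
    predicate: str
    args: dict = field(default_factory=dict)

    def as_term(self) -> str:
        # Render arguments in label order (Arg0, Arg1, Arg2, ...),
        # independently of the order in which they appear in the sentence.
        ordered = [self.args[label] for label in sorted(self.args)]
        return f"{self.predicate}({', '.join(ordered)})"


give = Proposition(
    predicate="give",
    args={
        "Arg0": "a GM-Jaguar pact",       # the giver
        "Arg2": "the US car maker",       # the recipient
        "Arg1": "an eventual 30% stake",  # the thing given
    },
)
print(give.as_term())
```

Note that this sketch prints the arguments in label order, whereas the slide's give(...) term lists them in surface order.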

FROM PREDICATES TO FRAMES
In one of its senses, the verb observe evokes a frame called Compliance: this frame concerns people’s responses to norms, rules or practices. The following sentences illustrate the use of the verb in the intended sense:
  – Our family observes the Jewish dietary laws.
  – You have to observe the rules or you’ll be penalized.
  – How do you observe Easter?
  – Please observe the illuminated signs.

FrameNet
FrameNet records information about English words in the general vocabulary in terms of
1. the frames (e.g. Compliance) that they evoke,
2. the frame elements (semantic roles) that make up the components of the frames (in Compliance, Norm is one such frame element), and
3. each word’s valence possibilities, the ways in which information about the frames is provided in the linguistic structures connected to them (with observe, Norm is typically the direct object).

NOMINAL RELATIONS

HISTORY

CLASSIFICATION SCHEMES FOR NOMINAL RELATIONS

ONE EXAMPLE (Barker et al. 1998, Nastase & Szpakowicz 2003)

THE TWO-LEVEL TAXONOMY OF RELATIONS, 2

THE SEMEVAL-2007 CLASSIFICATION OF RELATIONS
Cause-Effect: laugh wrinkles
Instrument-Agency: laser printer
Product-Producer: honey bee
Origin-Entity: message from outer-space
Theme-Tool: news conference
Part-Whole: car door
Content-Container: the air in the jar

CAUSAL RELATIONS

TEMPORAL RELATIONS

THE MUC AND ACE TASKS
Modern research in relation extraction, too, was kicked off by the Message Understanding Conference (MUC) campaigns and continued through the Automatic Content Extraction (ACE) and Machine Reading follow-ups
MUC: NE, coreference, TEMPLATE FILLING
ACE: NE, coreference, relations

TEMPLATE-FILLING

EXAMPLE MUC: JOB POSTING

THE ASSOCIATED TEMPLATE

AUTOMATIC CONTENT EXTRACTION (ACE)

ACE: THE DATA

ACE: THE TASKS

RELATION DETECTION AND RECOGNITION

ACE: RELATION TYPES

OTHER PRACTICAL VERSIONS OF RELATION EXTRACTION Biomedical domain (BIONLP, BioCreative) Chemistry Cultural Heritage

THE TASK OF SEMANTIC RELATION EXTRACTION

SEMANTIC RELATION EXTRACTION: THE CHALLENGES

HISTORY OF RELATION EXTRACTION Before 1993: Symbolic methods (using knowledge bases) Since then: statistical / heuristic based methods – From 1995 to around 2005: mostly SUPERVISED – More recently: also quite a lot of UNSUPERVISED / SEMI SUPERVISED techniques

SUPERVISED RE: RE AS A CLASSIFICATION TASK
Binary relations
Entities already manually/automatically recognized
Examples are generated for all sentences with at least 2 entities
Number of examples generated per sentence is C(N, 2) = N(N-1)/2
  – Combinations of N distinct entities selected 2 at a time
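The candidate-generation step can be sketched as follows. This is a minimal illustration, assuming the entity mentions of a sentence are already available as a list; the function name and the example entities are made up.

```python
from itertools import combinations


def candidate_pairs(entities):
    """Return every unordered pair of distinct entity mentions in one sentence."""
    return list(combinations(entities, 2))


# Hypothetical sentence in which three entity mentions were recognized.
entities = ["John", "IBM", "New York"]
pairs = candidate_pairs(entities)

print(len(pairs))  # 3, i.e. C(3, 2)
for first, second in pairs:
    print(first, "<->", second)
```

Each pair then becomes one example for the classifier, so a sentence with N entities contributes N(N-1)/2 examples, most of which will usually be negative.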

GENERATING CANDIDATES TO CLASSIFY

RE AS A BINARY CLASSIFICATION TASK

NUMBER OF CANDIDATES TO CLASSIFY – SIMPLE MINDED VERSION

THE SUPERVISED APPROACH TO RE
Most current approaches to RE are kernel-based
Different information is used
  – Sequences of words, e.g., through the GLOBAL CONTEXT / LOCAL CONTEXT kernels of Bunescu and Mooney / Giuliano, Lavelli & Romano
  – Syntactic information through the TREE KERNELS of Zelenko et al / Moschitti et al
  – Semantic information in recent work

KERNEL METHODS: A REMINDER
Embedding the input data in a feature space
Using a linear algorithm for discovering non-linear patterns
Coordinates of images are not needed, only pairwise inner products
Pairwise inner products can be efficiently computed directly from X using a kernel function K: X × X → R
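A small numeric check makes the last two points concrete. The sketch below uses the homogeneous degree-2 polynomial kernel on 2-dimensional inputs, chosen here purely as a textbook example (it is not one of the kernels used for RE in this lecture), and compares the kernel value with the inner product of the explicit feature-space images.

```python
def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: K(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def poly2_features(x):
    """Explicit feature map for 2-d input whose inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, 2 ** 0.5 * x1 * x2, x2 * x2)


x, z = (1.0, 2.0), (3.0, 4.0)
k_implicit = poly2_kernel(x, z)                         # computed in input space
k_explicit = dot(poly2_features(x), poly2_features(z))  # computed in feature space
print(k_implicit, k_explicit)  # both approximately 121
```

The kernel never builds the 3-dimensional images, which is what makes the approach practical when the feature space is very large.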

MODULARITY OF KERNEL METHODS

THE WORD-SEQUENCE APPROACH
Shallow linguistic information:
  – Tokenization
  – Lemmatization
  – Sentence splitting
  – PoS tagging
Claudio Giuliano, Alberto Lavelli, and Lorenza Romano (2007), FBK-IRST: Kernel methods for relation extraction, Proc. of SemEval-2007

LINGUISTIC REALIZATION OF RELATIONS Bunescu & Mooney, NIPS 2005

WORD-SEQUENCE KERNELS
Two families of “basic” kernels
  – Global Context
  – Local Context
Linear combination of kernels
Explicit computation
  – Extremely sparse input representation
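The ideas of explicit computation over sparse vectors and of combining kernels by summation can be sketched as below. This is only an illustration under simplifying assumptions: it uses plain bag-of-words vectors and two made-up views of a candidate ("between" and "window"), and it is not the exact definition of the Global Context and Local Context kernels.

```python
from collections import Counter
from math import sqrt


def bow(tokens):
    """Sparse bag-of-words vector: only tokens that occur are stored."""
    return Counter(tokens)


def linear_kernel(u, v):
    """Explicit sparse dot product, touching only the entries stored in u."""
    return sum(count * v[tok] for tok, count in u.items() if tok in v)


def normalized(kernel, u, v):
    """Cosine-style normalization so that each basic kernel lies in [0, 1]."""
    denom = sqrt(kernel(u, u) * kernel(v, v))
    return kernel(u, v) / denom if denom else 0.0


def combined_kernel(ex1, ex2, views=("between", "window")):
    """Sum of normalized basic kernels, one per view (hypothetical layout)."""
    return sum(normalized(linear_kernel, bow(ex1[v]), bow(ex2[v])) for v in views)


# Two invented candidates: tokens between the entities, and a wider window.
a = {"between": ["works", "for"], "window": ["John", "works", "for", "IBM"]}
b = {"between": ["works", "at"], "window": ["Mary", "works", "at", "Google"]}

print(combined_kernel(a, a))  # identical examples reach the maximum, 2.0
print(combined_kernel(a, b))
```

A sum of valid kernels is itself a valid kernel, so the combined function can be handed to any kernel-based learner such as an SVM.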

THE GLOBAL CONTEXT KERNEL

THE LOCAL CONTEXT KERNEL

LOCAL CONTEXT KERNEL (2)

KERNEL COMBINATION

EXPERIMENTAL RESULTS Biomedical data sets – AIMed – LLL Newspaper articles – Roth and Yih SEMEVAL 2007

EVALUATION METHODOLOGIES

EVALUATION (2)

EVALUATION (3)

EVALUATION (4)

RESULTS ON AIMED

NON-SUPERVISED METHODS FOR RELATION EXTRACTION Unsupervised relation extraction: – Hearst – Other work on extracting hyponymy relations – Extracting other relations: Almuhareb and Poesio, Cimiano and Wenderoth Semi-supervised methods – KNOW-IT-ALL

HEARST 1992, 1998: USING PATTERNS TO EXTRACT ISA LINKS
Intuition: certain constructions are typically used to express certain types of semantic relations
E.g., for ISA:
  – The seabass IS A fish
  – Swimming, running AND OTHER activities
  – Vehicles SUCH AS cars, trucks and bikes

TEXT PATTERNS FOR HYPONYMY EXTRACTION
HEARST 1998: NP {, NP}* {,} (and|or) other NP
  bruises …… broken bones, and other INJURIES
  HYPONYM(bruise, injury)
EVALUATION: 55.46% precision wrt WordNet
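A pattern of this kind can be approximated with a regular expression. The sketch below makes strong simplifying assumptions that are not part of Hearst's method: a hyponym NP is taken to be one or two lowercase words, the hypernym a single word, and no lemmatization is done (so it yields "bruises"/"injuries" rather than bruise/injury). A realistic system would run the pattern over NP-chunked text. The example sentence is invented.

```python
import re

# Crude stand-in for NP chunking (an assumption): one or two words per hyponym NP.
NP = r"[a-z]+(?: [a-z]+)?"
AND_OTHER = re.compile(rf"((?:{NP}, )*{NP}),? (?:and|or) other ([a-z]+)")


def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs matched by 'NP {, NP}* {,} and/or other NP'."""
    pairs = []
    for match in AND_OTHER.finditer(text.lower()):
        hypernym = match.group(2)
        # The first group holds the comma-separated list of hyponym NPs.
        for hyponym in match.group(1).split(", "):
            pairs.append((hyponym, hypernym))
    return pairs


print(extract_hyponyms("Bruises, broken bones, and other injuries."))
```

Because the NP approximation is so loose, text preceding the list can be swallowed into the first hyponym, which is one reason real implementations rely on a chunker or parser.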

THE PRECISION / RECALL TRADEOFF
X and other Y: high precision, low recall
X isa Y: low precision, high recall
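The tradeoff can be made concrete by scoring two sets of extracted pairs against a gold standard. All data below is invented for illustration; "strict" stands for a selective pattern and "loose" for a permissive one.

```python
def precision_recall(extracted, gold):
    """Precision and recall of a set of extracted pairs against a gold set."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy gold standard and toy system outputs (not real results).
gold = {("bruise", "injury"), ("car", "vehicle"),
        ("truck", "vehicle"), ("bike", "vehicle")}
strict = {("bruise", "injury")}  # few extractions, all correct
loose = {("bruise", "injury"), ("car", "vehicle"), ("truck", "vehicle"),
         ("john", "teacher"), ("it", "problem"), ("this", "example")}

print(precision_recall(strict, gold))  # (1.0, 0.25)
print(precision_recall(loose, gold))   # (0.5, 0.75)
```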

HEARST’S REQUIREMENTS ON PATTERNS

OTHER WORK ON EXTRACTING HYPONYMY Caraballo ACL 1999 Widdows & Dorow 2002 Pantel & Ravichandran ACL 2004

OTHER APPROACHES TO RE Using syntactic information Using lexical features

Syntactic information for RE Pros: – more structured information useful when dealing with long-distance relations Cons: – not always robust – (and not available for all languages)

Semi-supervised methods Hearst 1992: find new patterns by using initial examples as SEEDS This approach has been pursued in a number of ways – Espresso (Pantel and Pennacchiotti 2006) – OPEN INFORMATION EXTRACTION (Etzioni and colleagues)

THE GENERIC SEMI-SUPERVISED ALGORITHM
1. Start with SEED INSTANCES
   Depending on algorithm, seeds may be hand-generated or automatically obtained
2. For each seed instance, extract patterns from corpus
   Choice of patterns depends on algorithm
3. Output the best patterns according to some metric
4. (Possibly) iterate steps 2-3
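The loop can be sketched in a few lines. This is a toy version under stated assumptions: a pattern is simply the text found between the two terms of an instance, the ranking metric is raw frequency (a placeholder for whatever metric a real algorithm uses), and the corpus and seeds are invented.

```python
import re
from collections import Counter


def bootstrap(corpus, seeds, iterations=2, top_k=1):
    """Alternate between inducing patterns from instances and instances from patterns."""
    instances = set(seeds)  # step 1: seed instances
    patterns = []
    for _ in range(iterations):
        # Step 2: collect the text linking the two terms of each known instance.
        counts = Counter()
        for x, y in instances:
            for sentence in corpus:
                m = re.search(rf"{re.escape(x)}(.+?){re.escape(y)}", sentence)
                if m:
                    counts[m.group(1)] += 1
        # Step 3: keep the best patterns (frequency is only a placeholder metric).
        patterns = [p for p, _ in counts.most_common(top_k)]
        # Before iterating (step 4), apply the patterns to harvest new instances.
        for p in patterns:
            for sentence in corpus:
                for m in re.finditer(rf"(\w+){re.escape(p)}(\w+)", sentence):
                    instances.add((m.group(1), m.group(2)))
    return patterns, instances


corpus = ["Paris is the capital of France", "Rome is the capital of Italy",
          "Berlin is the capital of Germany", "Paris hosted visitors from France"]
seeds = {("Paris", "France"), ("Rome", "Italy")}
patterns, instances = bootstrap(corpus, seeds)
print(patterns)
print(sorted(instances))
```

The noisy pattern " hosted visitors from " is seen only once and is filtered out by the ranking step, which is exactly the role of the metric in step 3.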

THE ESPRESSO SEMI-SUPERVISED ALGORITHM
1. Start with SEED INSTANCES
   Hand-chosen
2. For each seed instance, extract patterns from corpus
   Generalization of whole sentence
3. Output the best patterns according to some metric
   A metric based on PMI
4. Iterate steps 2-3

KNOW-IT-ALL A system for ontology population developed by Oren Etzioni and collaborators at the University of Washington

KNOW-IT-ALL: ARCHITECTURE

BOOTSTRAPPING This first step takes the input domain predicates and the generic extraction patterns and produces domain-specific extraction patterns

EXTRACTION PATTERNS

EXTRACTOR Uses domain-specific extraction patterns + syntactic constraints – In “Garth Brooks is a country singer”, country NOT extracted as an instance of the pattern “X is a NP” Produces EXTRACTIONS (= instances of the patterns that satisfy the syntactic constraints)

ASSESSOR
Estimates the likelihood of an extraction using POINTWISE MUTUAL INFORMATION between the extracted INSTANCE and DISCRIMINATOR phrases
E.g., INSTANCE: Liege
DISCRIMINATOR PHRASES: “is a city”
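A web-count version of this score can be sketched as a ratio of hit counts. The sketch assumes the ratio form in which the PMI score is commonly described for KNOW-IT-ALL, hits(instance + discriminator) / hits(instance); the function name is hypothetical and the hit counts are invented, not real search-engine figures.

```python
def pmi_score(hits_instance_with_discriminator, hits_instance):
    """Ratio-style PMI: co-occurrence hits normalized by the instance's own hits."""
    if hits_instance == 0:
        return 0.0
    return hits_instance_with_discriminator / hits_instance


# Invented hit counts, for illustration only.
hits = {
    "Liege": 1_000_000,
    "Liege is a city": 20_000,
    "Garth": 500_000,
    "Garth is a city": 5,
}

liege_score = pmi_score(hits["Liege is a city"], hits["Liege"])
garth_score = pmi_score(hits["Garth is a city"], hits["Garth"])
print(liege_score, garth_score)
```

A plausible instance of CITY scores several orders of magnitude higher with the discriminator than an implausible one, and that difference is what the Assessor exploits.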

ESTIMATING THE LIKELIHOOD OF A FACT
P(f | φ) and P(f | ¬φ), where φ is the event that the extracted fact is correct and f is a feature of the extraction, are estimated using a set of positive and negative instances
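One way to turn such conditional probabilities into a likelihood for a fact is a Naive Bayes combination. The sketch below assumes boolean features (for example, whether the PMI with a given discriminator exceeds a threshold), add-one smoothing, and a uniform prior; all of these choices, the function names, and the training data are illustrative assumptions.

```python
def estimate(instances, name):
    """Fraction of labelled instances on which a feature fired (add-one smoothed)."""
    fired = sum(1 for feats in instances if feats[name])
    return (fired + 1) / (len(instances) + 2)


def fact_probability(features, p_given_true, p_given_false, prior=0.5):
    """Naive Bayes posterior that an extraction is correct, given boolean features."""
    like_true, like_false = prior, 1.0 - prior
    for name, fired in features.items():
        pt, pf = p_given_true[name], p_given_false[name]
        like_true *= pt if fired else 1.0 - pt
        like_false *= pf if fired else 1.0 - pf
    return like_true / (like_true + like_false)


# Invented training data: did the discriminator's PMI exceed its threshold?
positives = [{"is a city": True}, {"is a city": True}, {"is a city": False}]
negatives = [{"is a city": False}, {"is a city": False}, {"is a city": True}]
p_true = {"is a city": estimate(positives, "is a city")}
p_false = {"is a city": estimate(negatives, "is a city")}

print(fact_probability({"is a city": True}, p_true, p_false))
print(fact_probability({"is a city": False}, p_true, p_false))
```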

TERMINATION CONDITION KNOW-IT-ALL could continue searching for instances – But for instance, COUNTRY has only around 300 instances Stop: Signal-to-Noise ratio – Number of high probability facts / Number of low probability ones

OVERALL ALGORITHM

EVALUATION 5 classes: CITY, US STATE, COUNTRY, ACTOR, FILM

IE IN PRACTICE: THE GOOGLE KNOWLEDGE GRAPH
“A huge knowledge graph of interconnected entities and their attributes”. Amit Singhal, Senior Vice President at Google
“A knowledge base used by Google to enhance its search engine’s results with semantic-search information gathered from a wide variety of sources” http://en.wikipedia.org/wiki/Knowledge_Graph

INFORMATION IN THE GKG
Based on information derived from many sources including Freebase, CIA World Factbook, Wikipedia
Contains 570 million objects and more than 18 billion facts about and relationships between these different objects

INFORMATION IN THE GKG
Search for a person, place, or thing
Facts about entities are displayed in a knowledge box on the right side

What it looks like Web Results have not changed

What it looks like
This is what’s new:
  – Map
  – General info
  – Upcoming events
  – Points of interest
* The type of information that appears in this panel depends on what you are searching for

Handling vague searches/homonyms
Prompts the user to indicate more precisely what they are looking for
Displays results relating only to that meaning
Eliminates other results in both the panel and the web results

Example of this: Very General Results

Example of this: Shows Possible Results
Now the user picks what they were looking for
Let’s assume the user meant the TV show Kings

MORE COMPLEX SEMANTICS Modalities Temporal interpretation

ACKNOWLEDGMENTS Many slides borrowed from – Roxana Girju – Alberto Lavelli
