807 - TEXT ANALYTICS Massimo Poesio Lecture 8: Relation extraction.

1 807 - TEXT ANALYTICS Massimo Poesio Lecture 8: Relation extraction

2 OTHER ASPECTS OF SEMANTIC INTERPRETATION Identification of RELATIONS between entities mentioned – Focus of interest in modern CL since 1993 or so Identification of TEMPORAL RELATIONS – From about 2003 on QUALIFICATION of such relations (modality, epistemicity) – From about 2010 on

3 TYPES OF RELATIONS Predicate-argument structure (verbs and nouns) – John kicked the ball Nominal relations – The red ball Relations between events / temporal relations – John kicked the ball and scored a goal Domain-dependent relations (MUC/ACE) – John works for IBM

5 PREDICATE/ARGUMENT STRUCTURE Powell met Zhu Rongji Proposition: meet(Powell, Zhu Rongji ) Powell met with Zhu Rongji Powell and Zhu Rongji met Powell and Zhu Rongji had a meeting... When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane. meet(Powell, Zhu) discuss([Powell, Zhu], return(X, plane)) debate consult join wrestle battle meet(Somebody1, Somebody2)

6 PREDICATE-ARGUMENT STRUCTURE Linguistic Theories – Case Frames – Fillmore → FrameNet – Lexical Conceptual Structure – Jackendoff → LCS – Proto-Roles – Dowty → PropBank – English verb classes (diathesis alternations) – Levin → VerbNet – Talmy, Levin and Rappaport

7 Fillmore’s Case Theory Sentences have a DEEP STRUCTURE with CASE RELATIONS A sentence is a verb + one or more NPs – Each NP has a deep-structure case A(gentive) I(nstrumental) D(ative) F(actitive) L(ocative) O(bjective) – Subject is no more important than Object Subject/Object are surface structure

8 THEMATIC ROLES Following on Fillmore’s original work, many theories of predicate-argument structure / thematic roles were proposed, among which the best known are perhaps – Jackendoff’s LEXICAL CONCEPTUAL SEMANTICS – Dowty’s PROTO-ROLES theory

9 Dowty’s PROTO-ROLES Event-dependent Prototypes based on shared entailments Grammatical relations such as subject related to observed (empirical) classification of participants Typology of grammatical relations Proto-Agent Proto-Patient

10 Proto-Agent Properties – Volitional involvement in event or state – Sentience (and/or perception) – Causing an event or change of state in another participant – Movement (relative to position of another participant) – (exists independently of event named) *may be discourse pragmatic

11 Proto-Patient Properties: – Undergoes change of state – Incremental theme – Causally affected by another participant – Stationary relative to movement of another participant – (does not exist independently of the event, or at all) *may be discourse pragmatic

12 Semantic role labels: Jan broke the LCD projector. break (agent(Jan), patient(LCD-projector)) cause(agent(Jan), change-of-state(LCD-projector)) (broken(LCD-projector)) agent(A) -> intentional(A), sentient(A), causer(A), affector(A) patient(P) -> affected(P), change(P),… Fillmore, 68 Jackendoff, 72 Dowty, 91

13 VERBNET AND PROPBANK Dowty’s theory of proto-roles was the basis for the development of PROPBANK, the first corpus annotated with information about predicate-argument structure

14 PROPBANK REPRESENTATION a GM-Jaguar pact that would give *T*-1 the US car maker an eventual 30% stake in the British company Arg0 Arg2 Arg1 give(GM-J pact, US car maker, 30% stake) a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

15 ARGUMENTS IN PROPBANK Arg0 = agent Arg1 = direct object / theme / patient Arg2 = indirect object / benefactive / instrument / attribute / end state Arg3 = start point / benefactive / instrument / attribute Arg4 = end point Per word vs frame level – more general?

16 FROM PREDICATES TO FRAMES In one of its senses, the verb observe evokes a frame called Compliance: this frame concerns people’s responses to norms, rules or practices. The following sentences illustrate the use of the verb in the intended sense: – Our family observes the Jewish dietary laws. – You have to observe the rules or you’ll be penalized. – How do you observe Easter? – Please observe the illuminated signs.

17 FrameNet FrameNet records information about English words in the general vocabulary in terms of 1. the frames (e.g. Compliance) that they evoke, 2. the frame elements (semantic roles) that make up the components of the frames (in Compliance, Norm is one such frame element), and 3. each word’s valence possibilities, the ways in which information about the frames is provided in the linguistic structures connected to them (with observe, Norm is typically the direct object). theta

18 NOMINAL RELATIONS

19 HISTORY

20 CLASSIFICATION SCHEMES FOR NOMINAL RELATIONS

21 ONE EXAMPLE (Barker et al. 1998, Nastase & Szpakowicz 2003)

22 THE TWO-LEVEL TAXONOMY OF RELATIONS, 2

23 THE SEMEVAL-2007 CLASSIFICATION OF RELATIONS Cause-Effect: laugh wrinkles Instrument-Agency: laser printer Product-Producer: honey bee Origin-Entity: message from outer-space Theme-Tool: news conference Part-Whole: car door Content-Container: the air in the jar

24 CAUSAL RELATIONS

25 TEMPORAL RELATIONS

26 THE MUC AND ACE TASKS Modern research in relation extraction, too, was kicked off by the Message Understanding Conference (MUC) campaigns and continued through the Automatic Content Extraction (ACE) and Machine Reading follow-ups MUC: NE, coreference, TEMPLATE FILLING ACE: NE, coreference, relations

27 TEMPLATE-FILLING

28 EXAMPLE MUC: JOB POSTING

29 THE ASSOCIATED TEMPLATE

30 AUTOMATIC CONTENT EXTRACTION (ACE)

31 ACE: THE DATA

32 ACE: THE TASKS

33 RELATION DETECTION AND RECOGNITION

34 ACE: RELATION TYPES

35 OTHER PRACTICAL VERSIONS OF RELATION EXTRACTION Biomedical domain (BIONLP, BioCreative) Chemistry Cultural Heritage

36 THE TASK OF SEMANTIC RELATION EXTRACTION

37 SEMANTIC RELATION EXTRACTION: THE CHALLENGES

38 HISTORY OF RELATION EXTRACTION Before 1993: Symbolic methods (using knowledge bases) Since then: statistical / heuristic based methods – From 1995 to around 2005: mostly SUPERVISED – More recently: also quite a lot of UNSUPERVISED / SEMI SUPERVISED techniques

39 SUPERVISED RE: RE AS A CLASSIFICATION TASK Binary relations Entities already manually/automatically recognized Examples are generated for all sentences with at least 2 entities Number of examples generated per sentence is C(N, 2) = N(N-1)/2 – Combinations of N distinct entities selected 2 at a time
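
The candidate-generation step on this slide can be sketched in a few lines. This is only an illustration: the function name `candidate_pairs` and the example entity mentions are hypothetical, and the entities are assumed to have been recognized already.

```python
from itertools import combinations
from math import comb

def candidate_pairs(entities):
    """Return every unordered pair of distinct entity mentions in a sentence."""
    return list(combinations(entities, 2))

# Hypothetical entity mentions, assumed to be already recognized in one sentence.
entities = ["John", "IBM", "New York"]
pairs = candidate_pairs(entities)
for pair in pairs:
    print(pair)

# The number of candidates is N choose 2 = N * (N - 1) / 2.
print(len(pairs), comb(len(entities), 2))
```

Each pair would then be handed to a classifier that decides whether a relation holds between the two entities.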

40 GENERATING CANDIDATES TO CLASSIFY

41 RE AS A BINARY CLASSIFICATION TASK

42 NUMBER OF CANDIDATES TO CLASSIFY – SIMPLE MINDED VERSION

43 THE SUPERVISED APPROACH TO RE Most current approaches to RE are kernel-based Different information is used – Sequences of words, e.g., through the GLOBAL CONTEXT / LOCAL CONTEXT kernels of Bunescu and Mooney / Giuliano, Lavelli & Romano – Syntactic information through the TREE KERNELS of Zelenko et al / Moschitti et al – Semantic information in recent work

44 KERNEL METHODS: A REMINDER Embedding the input data in a feature space Using a linear algorithm for discovering non-linear patterns Coordinates of images are not needed, only pairwise inner products Pairwise inner products can be efficiently computed directly from X using a kernel function K:X×X→R
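
A small numerical check may make the point about inner products concrete. The sketch below uses the homogeneous degree-2 polynomial kernel in two dimensions, chosen here purely as a textbook illustration (it is not a kernel from the lecture). It shows that K(x, z) = (x · z)² gives the same value as the inner product of the explicit feature-space images, without ever constructing them.

```python
import math

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map for the homogeneous degree-2 polynomial kernel in 2D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """K(x, z) = (x . z)^2, computed without building phi(x) or phi(z)."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # inner product in the feature space
implicit = poly2_kernel(x, z)    # same quantity via the kernel function
print(explicit, implicit)
```

The two printed values agree up to floating-point rounding, which is why a linear learner can work in the feature space using only the kernel function.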

45 MODULARITY OF KERNEL METHODS

46 THE WORD-SEQUENCE APPROACH Shallow linguistic information: – tokenization – lemmatization – sentence splitting – PoS tagging Claudio Giuliano, Alberto Lavelli, and Lorenza Romano (2007), FBK-IRST: Kernel methods for relation extraction, Proc. of SemEval-2007

47 LINGUISTIC REALIZATION OF RELATIONS Bunescu & Mooney, NIPS 2005

48 WORD-SEQUENCE KERNELS Two families of “basic” kernels – Global Context – Local Context Linear combination of kernels Explicit computation – Extremely sparse input representation
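
The idea of explicitly computed, sparse, linearly combined kernels can be sketched as follows. This is a deliberately simplified illustration and not the authors' formulation: the "global" context is assumed to be the bag of words between the two entities, the "local" context a small window just outside them, and the names `contexts`, `combined_kernel`, `window` and `weights` are all hypothetical.

```python
from collections import Counter
from math import sqrt

def sparse_dot(a, b):
    """Inner product of two sparse count vectors stored as Counters."""
    if len(a) > len(b):
        a, b = b, a
    return sum(count * b[token] for token, count in a.items())

def normalized_kernel(a, b):
    """Cosine-normalized bag-of-words kernel; 0.0 if either vector is empty."""
    norm = sqrt(sparse_dot(a, a) * sparse_dot(b, b))
    return sparse_dot(a, b) / norm if norm else 0.0

def contexts(tokens, i, j, window=2):
    """Split a sentence around entity positions i < j into sparse vectors."""
    return {
        # Assumed "global" context: words between the two entities.
        "global": Counter(tokens[i + 1:j]),
        # Assumed "local" context: a few words just outside the entities.
        "local": Counter(tokens[max(0, i - window):i] + tokens[j + 1:j + 1 + window]),
    }

def combined_kernel(c1, c2, weights=(1.0, 1.0)):
    """Linear combination of the two basic kernels."""
    return (weights[0] * normalized_kernel(c1["global"], c2["global"])
            + weights[1] * normalized_kernel(c1["local"], c2["local"]))

c1 = contexts("Powell met with Zhu in Beijing".split(), 0, 3)
c2 = contexts("Yesterday Jan met Kim in Paris".split(), 1, 3)
print(combined_kernel(c1, c2))
```

Only the tokens that actually occur are stored, so the cost of each kernel evaluation depends on sentence length rather than vocabulary size.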

49 THE GLOBAL CONTEXT KERNEL

50

51 THE LOCAL CONTEXT KERNEL

52 LOCAL CONTEXT KERNEL (2)

53 KERNEL COMBINATION

54 EXPERIMENTAL RESULTS Biomedical data sets – AIMed – LLL Newspaper articles – Roth and Yih SEMEVAL 2007

55 EVALUATION METHODOLOGIES

56 EVALUATION (2)

57 EVALUATION (3)

58 EVALUATION (4)

59 RESULTS ON AIMED

60 NON-SUPERVISED METHODS FOR RELATION EXTRACTION Unsupervised relation extraction: – Hearst – Other work on extracting hyponymy relations – Extracting other relations: Almuhareb and Poesio, Cimiano and Wenderoth Semi-supervised methods – KNOW-IT-ALL

61 HEARST 1992, 1998: USING PATTERNS TO EXTRACT ISA LINKS Intuition: certain constructions typically used to express certain types of semantic relations E.g., for ISA: – The seabass IS A fish – Swimming, running AND OTHER activities – Vehicles such as cars, trucks and bikes

62 TEXT PATTERNS FOR HYPONYMY EXTRACTION HEARST 1998: NP {, NP}* {,} or other NP bruises …… broken bones, and other INJURIES HYPONYM (bruise, injury) EVALUATION: 55.46% precision wrt WordNet
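
A pattern of this shape can be approximated with a regular expression. The sketch below is a rough stand-in only: a real implementation would use an NP chunker and lemmatization, whereas here a noun phrase is assumed to be one or two lowercase words, the hypernym a single word, and both "and other" and "or other" are accepted. All names are hypothetical.

```python
import re

# Crude stand-in for an NP chunker: one or two lowercase words (an assumption).
NP = r"[a-z]+(?: [a-z]+)?"
PATTERN = re.compile(
    rf"(?P<hyponyms>{NP}(?:, {NP})*),? (?:and|or) other (?P<hypernym>[a-z]+)"
)

def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs matched by the 'and/or other' pattern."""
    pairs = []
    for match in PATTERN.finditer(text.lower()):
        hypernym = match.group("hypernym")
        for hyponym in match.group("hyponyms").split(", "):
            pairs.append((hyponym, hypernym))
    return pairs

print(extract_hyponyms("Bruises, wounds, broken bones, and other injuries"))
# [('bruises', 'injuries'), ('wounds', 'injuries'), ('broken bones', 'injuries')]
```

The output keeps surface plural forms; mapping them to lemmas such as (bruise, injury) is left out of this sketch.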

63 THE PRECISION / RECALL TRADEOFF X and other Y: high precision, low recall X isa Y: low precision, high recall
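
The tradeoff can be made concrete by scoring two sets of extractions against a gold standard. Everything in the sketch below is invented toy data for illustration, not a result from the lecture or from any paper; `precision_recall` is a hypothetical helper.

```python
def precision_recall(extracted, gold):
    """Precision and recall of a set of extracted pairs against gold pairs."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Toy gold standard and toy extractor outputs (all made up).
gold = {("seabass", "fish"), ("car", "vehicle"), ("truck", "vehicle"), ("bike", "vehicle")}
strict_pattern = {("car", "vehicle")}
loose_pattern = {("seabass", "fish"), ("car", "vehicle"), ("truck", "vehicle"),
                 ("john", "teacher"), ("result", "surprise"), ("it", "pity")}

print(precision_recall(strict_pattern, gold))  # few extractions, all correct
print(precision_recall(loose_pattern, gold))   # more coverage, more noise
```

A restrictive pattern behaves like `strict_pattern` here (high precision, low recall), while a permissive one behaves like `loose_pattern`.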

64 HEARST’S REQUIREMENTS ON PATTERNS

65 OTHER WORK ON EXTRACTING HYPONYMY Caraballo ACL 1999 Widdows & Dorow 2002 Pantel & Ravichandran ACL 2004

66 OTHER APPROACHES TO RE Using syntactic information Using lexical features

67 Syntactic information for RE Pros: – more structured information useful when dealing with long-distance relations Cons: – not always robust – (and not available for all languages)

68 Semi-supervised methods Hearst 1992: find new patterns by using initial examples as SEEDS This approach has been pursued in a number of ways – Espresso (Pantel and Pennacchiotti 2006) – OPEN INFORMATION EXTRACTION (Etzioni and colleagues)

69 THE GENERIC SEMI-SUPERVISED ALGORITHM 1. Start with SEED INSTANCES Depending on algorithm, seed may be hand-generated or automatically obtained 2. For each seed instance, extract patterns from corpus Choice of patterns depends on algorithm 3. Output the best patterns according to some metric 4. (Possibly) iterate steps 2-3
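
The four steps can be sketched as a toy bootstrapping loop. The choices made here are assumptions for illustration only: a pattern is the literal text between the two terms of an instance, the metric is raw frequency, terms are single words, and the corpus is invented. None of this is the method of a specific system.

```python
import re
from collections import Counter

def find_patterns(corpus, instances):
    """Step 2: collect the text found between the two terms of each instance."""
    counts = Counter()
    for x, y in instances:
        regex = re.compile(rf"\b{re.escape(x)}\b(.+?)\b{re.escape(y)}\b")
        for sentence in corpus:
            match = regex.search(sentence)
            if match and match.group(1).strip():
                counts[match.group(1).strip()] += 1
    return counts

def apply_patterns(corpus, patterns):
    """Harvest new (x, y) instances; terms are single words in this toy version."""
    found = set()
    for pattern in patterns:
        regex = re.compile(rf"(\w+) {re.escape(pattern)} (\w+)")
        for sentence in corpus:
            found.update(regex.findall(sentence))
    return found

def bootstrap(corpus, seeds, rounds=2, top_k=1):
    instances = set(seeds)              # step 1: seed instances
    patterns = []
    for _ in range(rounds):             # step 4: iterate
        counts = find_patterns(corpus, instances)
        # Step 3: keep the best patterns; the metric here is raw frequency.
        patterns = [p for p, _ in counts.most_common(top_k)]
        instances |= apply_patterns(corpus, patterns)
    return patterns, instances

corpus = [
    "Paris is the capital of France",
    "Rome is the capital of Italy",
    "Berlin is the capital of Germany",
    "Rome is older than Italy",
]
patterns, instances = bootstrap(corpus, {("Paris", "France")})
print(patterns)
print(sorted(instances))
```

In the second round the noisy pattern "is older than" is also found, but it is outscored by the pattern supported by all three instances, which is the role the metric in step 3 plays.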

70 THE ESPRESSO SEMI-SUPERVISED ALGORITHM 1. Start with SEED INSTANCES Hand-chosen 2. For each seed instance, extract patterns from corpus Generalization of whole sentence 3. Output the best patterns according to some metric A metric based on PMI 4. Iterate steps 2-3

71 KNOW-IT-ALL A system for ontology population developed by Oren Etzioni and collaborators at the University of Washington

72 KNOW-IT-ALL: ARCHITECTURE

73 INPUT

74 BOOTSTRAPPING This first step takes the input domain predicates and the generic extraction patterns and produces domain-specific extraction patterns

75 EXTRACTION PATTERNS

76 EXTRACTOR Uses domain-specific extraction patterns + syntactic constraints – In “Garth Brooks is a country singer”, country NOT extracted as an instance of the pattern “X is a NP” Produces EXTRACTIONS (= instances of the patterns that satisfy the syntactic constraints)

77 ASSESSOR Estimates the likelihood of an extraction using POINTWISE MUTUAL INFORMATION between the extracted INSTANCE and DISCRIMINATOR phrases E.g., INSTANCE: Liege DISCRIMINATOR PHRASES: “is a city”
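
One way to picture the assessor's score is as a ratio of hit counts: how often the instance occurs together with the discriminator phrase, relative to how often the instance occurs at all. The sketch below assumes that ratio form, and the hit counts are made up in place of real search-engine queries; `pmi_score` and the `hits` table are hypothetical.

```python
def pmi_score(hits_instance_with_discriminator, hits_instance):
    """Ratio of co-occurrence hits to instance hits; 0.0 if the instance is unseen."""
    if hits_instance == 0:
        return 0.0
    return hits_instance_with_discriminator / hits_instance

# Made-up hit counts standing in for search-engine queries.
hits = {
    "Liege": 1_000_000,
    "Liege is a city": 20_000,
    "Garth Brooks": 2_000_000,
    "Garth Brooks is a city": 10,
}

liege = pmi_score(hits["Liege is a city"], hits["Liege"])
brooks = pmi_score(hits["Garth Brooks is a city"], hits["Garth Brooks"])
print(liege, brooks)
```

With these invented counts the score for "Liege" is far higher than for "Garth Brooks", which is the kind of contrast a discriminator phrase is meant to expose.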

78 ESTIMATING THE LIKELIHOOD OF A FACT P(f | φ) and P(f | ¬φ) estimated using a set of positive and negative instances
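
A sketch of how such conditional probabilities could be combined, assuming a naive Bayes model in which each feature f is conditionally independent given whether the fact is true (written phi below). The function name, the prior and the likelihood values are hypothetical; in practice the likelihoods would be estimated from the positive and negative instances.

```python
from math import prod

def fact_probability(prior, likelihoods_true, likelihoods_false):
    """P(phi | f1..fn) under a naive Bayes independence assumption.

    likelihoods_true[i]  stands for P(f_i | phi)
    likelihoods_false[i] stands for P(f_i | not phi)
    """
    numerator = prior * prod(likelihoods_true)
    denominator = numerator + (1 - prior) * prod(likelihoods_false)
    return numerator / denominator if denominator else 0.0

# Hypothetical values for two observed features.
p = fact_probability(prior=0.5, likelihoods_true=[0.9, 0.8], likelihoods_false=[0.1, 0.3])
print(p)
```

Features that are much more likely under true facts than under false ones push the estimate towards 1.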

79 TERMINATION CONDITION KNOW-IT-ALL could continue searching for instances – But for instance, COUNTRY has only around 300 instances Stop: Signal-to-Noise ratio – Number of high probability facts / Number of low probability ones
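
The stopping rule can be sketched as a ratio test over the probabilities assigned to recent extractions. The probability threshold and the minimum ratio below are hypothetical parameters chosen for illustration, not values from the system.

```python
def signal_to_noise(probabilities, threshold=0.5):
    """Number of high-probability facts divided by number of low-probability ones."""
    high = sum(1 for p in probabilities if p >= threshold)
    low = len(probabilities) - high
    return float("inf") if low == 0 else high / low

def should_stop(probabilities, min_ratio=1.0, threshold=0.5):
    """Stop searching once low-probability extractions outnumber the useful ones."""
    return signal_to_noise(probabilities, threshold) < min_ratio

early_batch = [0.95, 0.9, 0.8, 0.2]   # mostly good extractions
late_batch = [0.9, 0.1, 0.2, 0.05]    # mostly noise
print(should_stop(early_batch), should_stop(late_batch))
```

For a closed class such as COUNTRY, later batches would be expected to look like `late_batch`, triggering termination.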

80 OVERALL ALGORITHM

81 EVALUATION 5 classes: CITY, US STATE, COUNTRY, ACTOR, FILM

82 IE IN PRACTICE: THE GOOGLE KNOWLEDGE GRAPH “A huge knowledge graph of interconnected entities and their attributes”. Amit Singhal, Senior Vice President at Google “A knowledge base used by Google to enhance its search engine’s results with semantic-search information gathered from a wide variety of sources” http://en.wikipedia.org/wiki/Knowledge_Graph

83 INFORMATION IN THE GKG Based on information derived from many sources including Freebase, CIA World Factbook, Wikipedia Contains 570 million objects and more than 18 billion facts about and relationships between these different objects

84 INFORMATION IN THE GKG Search for a person, place, or thing Facts about entities are displayed in a knowledge box on the right side

85 What it looks like Web Results have not changed

86 What it looks like This is what’s new Map General info Upcoming Events Points of interest *The type of information that appears in this panel depends on what you are searching for

87 Handling vague searches/homonyms Prompts the user to indicate more precisely what they are looking for Displays results only relating to that meaning Eliminates other results in both the panel and web results

88 Example of this: Very General Results

89 Example of this: Shows Possible Results: Now the user picks what they were looking for Let’s assume the user meant the TV show Kings

90 MORE COMPLEX SEMANTICS Modalities Temporal interpretation

91 ACKNOWLEDGMENTS Many slides borrowed from – Roxana Girju – Alberto Lavelli

