TextNet – A Text-Based Intelligent System Sanda Harabagiu Dan Moldovan as (mis-)interpreted by Peter Clark.


1 TextNet – A Text-Based Intelligent System
Sanda Harabagiu, Dan Moldovan, as (mis-)interpreted by Peter Clark

2 Introduction
Overall goal:
– Given a sentence/paragraph, create a representation of the unstated, extra knowledge (“context”) which it suggests.
– Input: sentence graph; Output: bigger, richer graph
Purpose: question-answering etc. (?)
Sources of this extra knowledge:
– (Extended) WordNet
– the Internet

3 WordNet
Organized around concepts (“synsets”), not words
Contains:
– ~100k concepts (“synsets”)
– ~350k connections (14 types)
– English definitions (“glosses”) for most synsets
[Figure: two synsets shown by numeric ID (132132, 433243). {“tennis”, “lawn tennis”} –isa→ {“athletic game”}. Gloss of {“athletic game”}: “Game involving athletic activity.” Gloss of {“tennis”, “lawn tennis”}: “A game played with rackets by two or four players who hit the ball over a net that divides the court.”]

4 WordNet
(Same bullets as slide 3; the figure now labels the two synsets by name instead of by numeric ID.)
[Figure: tennis = {“tennis”, “lawn tennis”} –isa→ athletic game = {“athletic game”}. Gloss of athletic game: “Game involving athletic activity.” Gloss of tennis: “A game played with rackets by two or four players who hit the ball over a net that divides the court.”]
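A minimal sketch, in Python, of the synset-centred organization these two slides describe. The class name `Synset`, its field names, and the two-entry toy network are illustrative assumptions; this is not WordNet's storage format or any WordNet API.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A concept: a set of synonymous words plus an English gloss."""
    words: frozenset
    gloss: str
    # Typed links to other synsets, e.g. {"isa": [athletic_game]}.
    relations: dict = field(default_factory=dict)

    def link(self, relation, target):
        """Add one typed connection from this synset to another."""
        self.relations.setdefault(relation, []).append(target)


# The two synsets and the single isa link shown in the slide's figure.
athletic_game = Synset(frozenset({"athletic game"}),
                       "Game involving athletic activity.")
tennis = Synset(
    frozenset({"tennis", "lawn tennis"}),
    "A game played with rackets by two or four players who hit the ball "
    "over a net that divides the court.",
)
tennis.link("isa", athletic_game)


def hypernyms(synset):
    """Follow isa links upward, collecting every ancestor concept."""
    found = []
    frontier = list(synset.relations.get("isa", []))
    while frontier:
        node = frontier.pop()
        found.append(node)
        frontier.extend(node.relations.get("isa", []))
    return found
```

Lookup goes through the concept rather than the word: both “tennis” and “lawn tennis” reach the same gloss and the same isa parent.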

5 Extended WordNet
Disambiguate and transform glosses into network representations.
[Figure: network for the gloss “Tennis court: A court in which tennis is played.” Node labels: tennis court, play, court, tennis (the synset {“tennis”, “lawn tennis”}). Edge labels: object, location-of, def.]

6 Extended WordNet
Disambiguate and transform glosses into network representations.
[Figure: network for the gloss “Serve: A stroke in tennis that puts the ball in play.” Node labels: serve, stroke, tennis, put, ball, play. Edge labels: def, object, agent, context, manner.]

7 Extended WordNet
Resulting structure is no longer just a big graph
[Figure: two regions joined by def links. “Original WordNet”: “raw” concepts (isa hierarchy, other relations). “Processed Glossary Definitions”: concepts in context (particular subtypes/situations for concepts). The node “ball” appears in both regions.]

8 Part I: Adding Relevant, Contextual Knowledge from WordNet
“The kid hit the ball very hard.”
[Figure: sentence graph centred on hit, with edges agent (kid), object (ball), and manner (hard).]
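One way to hold such a sentence graph in code is as a set of labelled edges. The sketch below is an assumption about representation only: the triple layout `(head, relation, tail)`, the edge direction (from hit outward), and the helper names `roles_of` and `merge` are illustrative and do not come from the TextNet paper.

```python
# Labelled edges as (head, relation, tail) triples.
sentence_graph = {
    ("hit", "agent", "kid"),
    ("hit", "object", "ball"),
    ("hit", "manner", "hard"),
}


def roles_of(graph, head):
    """Map each relation leaving `head` to the node it points at."""
    return {relation: tail for (h, relation, tail) in graph if h == head}


def merge(graph, extra_edges):
    """Return a bigger graph: the original edges plus the added ones."""
    return graph | set(extra_edges)
```

With this layout, “Input: sentence graph; Output: bigger, richer graph” amounts to a set union: contextual edges found elsewhere are merged into the three edges above.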

9 “Inference Extraction”
Goals:
– provide supplementary information about a sentence
– explain relation between sentences
Approach:
– Deductive inference (e.g., “snore –entails→ sleep”)
– Find and add information into the sentence representation
Challenge:
– Many possible connections
[Figure: sentence graph for “The kid hit the ball very hard.”: hit, with edges agent (kid), object (ball), and manner (hard).]

10 Path-finding
To find path(s) between A and B, use spreading activation/marker passing:
– place markers at A and B
– propagate markers to neighboring nodes
– at quiescence, look for marker collisions
“Propagation rules” determine when to propagate:
– “asymmetric and transitive relations are more useful”
– “going up the isa hierarchy allows hierarchical deductions”
– “the same is true for relations such as entail and causation. For example, if a man is snoring, then he is sleeping, and further he is temporarily unconscious.”
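The marker-passing steps above can be sketched in Python as two breadth-first spreads followed by a collision check. This is a minimal sketch under stated assumptions, not the TextNet implementation: the function name `marker_passing`, the `may_propagate` predicate standing in for the “propagation rules”, the `max_steps` cutoff, and the toy edges are all hypothetical.

```python
from collections import deque


def marker_passing(edges, source_a, source_b, may_propagate, max_steps=4):
    """Spread markers from two nodes; return the nodes where they collide.

    edges: iterable of directed (head, relation, tail) triples.
    may_propagate: propagation rule, relation -> bool.
    Returns {collision_node: (path_from_a, path_from_b)}.
    """
    # Keep only the edges that the propagation rule lets markers cross.
    outgoing = {}
    for head, relation, tail in edges:
        if may_propagate(relation):
            outgoing.setdefault(head, []).append(tail)

    def spread(start):
        # Breadth-first propagation. It ends at quiescence (no unmarked
        # neighbor left) or once paths reach max_steps edges.
        paths = {start: [start]}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if len(paths[node]) > max_steps:
                continue
            for neighbor in outgoing.get(node, []):
                if neighbor not in paths:
                    paths[neighbor] = paths[node] + [neighbor]
                    queue.append(neighbor)
        return paths

    marks_a, marks_b = spread(source_a), spread(source_b)
    # A collision is any node carrying markers from both origins.
    return {n: (marks_a[n], marks_b[n]) for n in marks_a if n in marks_b}


# Toy network (an assumption for illustration).
toy_edges = [
    ("kid", "isa", "person"),
    ("player", "isa", "person"),
    ("snore", "entails", "sleep"),
    ("sleep", "entails", "unconscious"),
]
collisions = marker_passing(
    toy_edges, "kid", "player", lambda relation: relation in {"isa", "entails"}
)
```

Here the rule only lets markers travel up isa and along entails, so the markers from kid and player meet at person. Loosening `may_propagate` yields more collisions, which is where the “many possible connections” challenge shows up.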

11 Find connections which “explain” these relations
“The kid hit the ball very hard.”
[Figure: sentence graph (hit, with agent kid, object ball, manner hard) overlaid with WordNet paths between its nodes. Node labels: hit, game, player, person, play, kid, ball. Edge labels: context, object-of, agent, agent-of, isa, object. Paths are annotated “within context of tennis” and “within context of ball”.]

12 Find connections which “explain” these relations
“The kid hit the ball very hard.”
[Figure: sentence graph (hit, with agent kid, object ball, manner hard) overlaid with further WordNet paths. Node labels: hard, return, stroke, tennis, game, player, play, hit. Edge labels: manner-of, gloss (“isa”), context, object-of, agent, agent-of. Paths are annotated “within context of return”, “within context of tennis”, and “within context of drive”.]

13 Inter-sentential Global Context
Find connections between “local contexts”
S1: The kid hit the ball very hard.
S2: It landed almost always near the baseline.
[Figure: paths connecting hit (S1) and land (S2). Node labels: hit, move, change, location, destination, arrive, land, reach, place. Edge labels: isa, gloss (“isa”), object. Paths are annotated “within context of move”, “within context of arrive”, and “within context of destination”.]

14 Part II: Adding Contextual Knowledge from the Internet

15 Is WordNet (or a dictionary) sufficient to fully build the context?
Target sentence: “GPS systems are used for hiking.”
QN: Can we relate “GPS” and “hiking” using a dictionary?
From Oxford Dictionary:
– “GPS: a navigation system”
– “Hiking: long walk in the countryside taken for pleasure”
– “Walk: place or track or route for foot passengers”
– “Route: course or way taken from starting point to destination”
But:
– Missing knowledge that hiking involves following/navigating a particular trail, as opposed to just wandering aimlessly

16 Finding and Adding Extra, Contextual Knowledge from the Internet
WordNet doesn’t contain all the background knowledge
So can we add extra knowledge using other texts too?
– run-time, extra elaboration of current graph
– further expansion of WordNet?
Approach:
1. Start with some initial “seed” text
2. Retrieve paragraphs containing relevant words
3. Elaborate their “local and global contexts”
4. Determine relevance using a similarity measure
5. Select “the most appropriate new context”
6. Add its graph (or parts of it?) to the original graph

17 Finding Relevant Documents
Two problems:
– Discovery: Which keywords to search with?
  use words in the original seed text, or closely related words, e.g., “play AND (tennis OR ball OR baseline) AND hit”
– Quality: How relevant are the results?
  measure the degree of overlap of graphs for seed and new texts
Lexical ambiguity is a root problem
– Disambiguation by assuming new words belong to same/close synsets as in the original query (dubious!)
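Both problems on this slide can be sketched in a few lines of Python. The slide does not say how “degree of overlap” is computed, so the Jaccard ratio over edge triples used here is an assumption swapped in for illustration; the function names and the toy candidate graphs are hypothetical as well.

```python
def build_query(terms):
    """Compose a boolean query; a nested list becomes an OR group."""
    parts = []
    for term in terms:
        if isinstance(term, str):
            parts.append(term)
        else:
            parts.append("(" + " OR ".join(term) + ")")
    return " AND ".join(parts)


def graph_overlap(seed_edges, candidate_edges):
    """Jaccard overlap of two edge sets (assumed measure, not the paper's)."""
    seed, candidate = set(seed_edges), set(candidate_edges)
    if not seed and not candidate:
        return 0.0
    return len(seed & candidate) / len(seed | candidate)


def most_relevant(seed_edges, candidates):
    """Pick the name of the candidate graph that overlaps the seed most."""
    return max(candidates, key=lambda name: graph_overlap(seed_edges, candidates[name]))


query = build_query(["play", ["tennis", "ball", "baseline"], "hit"])

# Toy graphs as (head, relation, tail) triples.
seed = {("hit", "agent", "kid"), ("hit", "object", "ball")}
candidates = {
    "tennis_page": {("hit", "object", "ball"), ("player", "agent-of", "hit")},
    "music_page": {("hit", "isa", "song")},
}
best = most_relevant(seed, candidates)
```

Matching on node labels alone is what makes lexical ambiguity bite: “hit” the stroke and “hit” the song share a label, so any overlap measure is only as good as the disambiguation done before it.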

18 A Real Example…
Text: about a player who gets tendinitis from hitting the ball too hard
Build initial graph of sentences (but info missing)
Look for additional information on the Internet:
1. try multiple queries
2. select the best result (= graph most coherent with original text)
3. layer this graph on top of the original text graph
Original text + WordNet:
– hit –isa→ affect ←isa– injure –result→ injury
– hit –purpose→ land –location→ backline
Internet text:
– backline –result→ ace
WordNet:
– ace –isa→ serve –attr→ unreachable –purpose→ win
Hence (!):
– “Winning is the motivation for actions causing tennis injuries”
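The layering step can be sketched as merging edge lists while recording which source contributed each edge, then searching for a chain from hit to win. This is a hedged illustration only: reading each relation chain on the slide left to right as a linear sequence of directed edges is an assumption, and the names `layer_graphs` and `find_chain` are hypothetical.

```python
from collections import deque

# Edges per source, read linearly from the slide (an assumption).
layers = {
    "text+WordNet": [("hit", "purpose", "land"),
                     ("land", "location", "backline")],
    "Internet": [("backline", "result", "ace")],
    "WordNet": [("ace", "isa", "serve"),
                ("serve", "attr", "unreachable"),
                ("unreachable", "purpose", "win")],
}


def layer_graphs(layers):
    """Merge per-source edge lists, remembering where each edge came from."""
    merged = {}
    for source, edges in layers.items():
        for head, relation, tail in edges:
            merged.setdefault(head, []).append((relation, tail, source))
    return merged


def find_chain(merged, start, goal):
    """Breadth-first search for a chain of directed edges from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, chain = queue.popleft()
        if node == goal:
            return chain
        for relation, tail, source in merged.get(node, []):
            if tail not in seen:
                seen.add(tail)
                queue.append((tail, chain + [(node, relation, tail, source)]))
    return None


chain = find_chain(layer_graphs(layers), "hit", "win")
```

Keeping the source on every edge shows how much of the final chain rests on the single Internet-derived link (backline to ace), which is the step the “Hence (!)” invites the reader to question.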

19 Summary
Interesting, ambitious
Right idea (used by others too)
Didn’t work (?); no further publications on TextNet
Critical details not clear from the paper
– Problem ≠ finding good connections, rather = avoiding finding bad connections

