Presentation is loading. Please wait.

Presentation is loading. Please wait.

Three Stories on Automated Reasoning for Natural Language Understanding Johan Bos University of Rome "La Sapienza Dipartimento di Informatica.

Similar presentations


Presentation on theme: "Three Stories on Automated Reasoning for Natural Language Understanding Johan Bos University of Rome "La Sapienza Dipartimento di Informatica."— Presentation transcript:

1 Three Stories on Automated Reasoning for Natural Language Understanding Johan Bos University of Rome "La Sapienza Dipartimento di Informatica

2 Background My work is in between –natural language processing –computational linguistics –formal and computational semantics Aim of my work –implement linguistic theories –use automated reasoning in modeling natural language understanding

3 Applications What kind of applications? –Human-machine dialogue systems –Question answering systems –Textual entailment systems Use of logical inference –Off the shelf systems, FOL theorem provers and finite model builders –Empirically successful?

4 Surprise Perhaps surprisingly, automated reasoning tools rarely make it into NLP applications Why? –Requires interdisciplinary background –Gap between formal semantic theory and practical implementation –It is just not trendy --- statistical approaches dominate the field

5 Three Stories World Wide Computational Semantics –The worlds first serious implementation of Discourse Representation Theory, with the help of the web and theorem proving Godot, the talking robot –The first robot that computes semantic representations and performs inferences using theorem proving and model building Recognising Textual Entailment –Automated deduction applied in wide-coverage natural language processing

6 The First Story World Wide Computational Semantics The first serious implementation of Discourse Representation Theory, with the help of the internet…

7 How it started Implementing tools for the semantic analysis of English Follow linguistic theory as closely as possible –Discourse Representation Theory [DRT] –First-order logic –Presupposition projection Computational Semantics

8 How can we automate the process of associating semantic representations with expressions of natural language? How can we use logical representations of natural language expressions to automate the process of drawing inferences?

9 Basic idea Text: Vincent loves Mia.

10 Basic idea Text: Vincent loves Mia. DRT: x y vincent(x) mia(y) love(x,y)

11 Basic idea Text: Vincent loves Mia. DRT: FOL: x y(vincent(x) & mia(y) & love(x,y)) x y vincent(x) mia(y) love(x,y)

12 Basic idea Text: Vincent loves Mia. DRT: FOL: x y(vincent(x) & mia(y) & love(x,y)) BK: x (vincent(x) man(x)) x (mia(x) woman(x)) x (man(x) woman(x)) x y vincent(x) mia(y) love(x,y)

13 Basic idea Text: Vincent loves Mia. DRT: FOL: x y(vincent(x) & mia(y) & love(x,y)) BK: x (vincent(x) man(x)) x (mia(x) woman(x)) x (man(x) woman(x)) Model: D = {d1,d2} F(vincent)={d1} F(mia)={d2} F(love)={(d1,d2)} x y vincent(x) mia(y) love(x,y)

14 ? Text: Vincent loves Mia. DRT: x y vincent(x) mia(y) love(x,y)

15 Compositional Semantics The Problem Given a natural language expression, how do we convert it into a logical formula? Freges principle The meaning of a compound expression is a function of the meaning of its parts.

16 Lexical semantics CategorySemanticsExample N x. spokesman NP/N p. q.(( ;p(x));q(x)) a NP/N p. q.(( ;p(x)) q(x)) the S\NP y. lied spokesman(x) X e lie(e) agent(e,y) X

17 A derivation NP/N:a N:spokesman S\NP:lied p. q. ;p(x);q(x) z. y. spokesman(z) x e lie(e) agent(e,y)

18 A derivation NP/N:a N:spokesman S\NP:lied p. q. ;p(x);q(x) z. y (FA) NP: a spokesman p. q. ;p(x);q(x)( z. ) spokesman(z) x e lie(e) agent(e,y) x

19 A derivation NP/N:a N:spokesman S\NP:lied p. q. ;p(x);q(x) z. y (FA) NP: a spokesman q. ; ;q(x)) spokesman(z) x spokesman(x) e lie(e) agent(e,y) x

20 A derivation NP/N:a N:spokesman S\NP:lied p. q. ;p(x);q(x) z. y (FA) NP: a spokesman q. ;q(x) spokesman(z) x x spokesman(x) e lie(e) agent(e,y)

21 A derivation NP/N:a N:spokesman S\NP:lied p. q. ;p(x);q(x) x. y (FA) NP: a spokesman q. ;q(x) (BA) S: a spokesman lied q. ;q(x)( y. ) spokesman(z) x x spokesman(x) e lie(e) agent(e,y) e lie(e) agent(e,y) x spokesman(x)

22 A derivation NP/N:a N:spokesman S\NP:lied p. q. ;p(x);q(x) x. y (FA) NP: a spokesman q. ;q(x) (BA) S: a spokesman lied ; spokesman(z) x x spokesman(x) e lie(e) agent(e,y) e lie(e) agent(e,x) x spokesman(x)

23 A derivation NP/N:a N:spokesman S\NP:lied p. q. ;p(x);q(x) x. y (FA) NP: a spokesman q. ;q(x) (BA) S: a spokesman lied spokesman(x) x x e lie(e) agent(e,y) x e spokesman(x) lie(e) agent(e,x)

24 The DORIS System Reasonable grammar coverage Parsed English sentences, followed by resolving ambiguities –Pronouns –Presupposition Generated many different semantic representation for a text

25 Texts and Ambiguity Usually, ambiguities cause many possible interpretations Example: Butch walks into his modest kitchen. He opens the refrigerator. He takes out a milk and drinks it.

26 Texts and Ambiguity Usually, ambiguities cause many possible interpretations Example: Butch walks into his modest kitchen. He opens the refrigerator. He takes out a milk and drinks it.

27 Texts and Ambiguity Usually, ambiguities cause many possible interpretations Example: Butch walks into his modest kitchen. He opens the refrigerator. He takes out a milk and drinks it.

28 Texts and Ambiguity Usually, ambiguities cause many possible interpretations Example: Butch walks into his modest kitchen. He opens the refrigerator. He takes out a milk and drinks it.

29 Basic idea of DORIS Given a text, produce as many different DRSs [semantic interpretations] as possible Filter out `strange` interpretations –Inconsistent interpretations –Uninformative interpretations Applying theorem proving –Use general purpose FOL theorem prover –Bliksem [Hans de Nivelle]

30 Screenshot

31 Consistency checking Inconsistent text: –Mia likes Vincent. –She does not like him. Two interpretations, only one consistent: –Mia likes Jody. –She does not like her.

32 Informativity checking Uninformative text: –Mia likes Vincent. –She likes him. Two interpretations, only one informative: –Mia likes Jody. –She likes her.

33 Local informativity Example: –Mia is the wife of Marsellus. –If Mia is the wife of Marsellus, Vincent will be disappointed. The second sentence is informative with respect to the first. But…

34 x y mia(x) marsellus(y) wife-of(x,y) Local informativity

35 x y z mia(x) marsellus(y) wife-of(x,y) vincent(z) wife-of(x,y) disappointed(z) Local informativity

36 Local consistency Example: –Jules likes big kahuna burgers. –If Jules does not like big kahuna burgers, Vincent will order a whopper. The second sentence is consistent with respect to the first. But…

37 x y jules(x) big-kahuna-burgers(y) like(x,y) Local consistency

38 x y z jules(x) big-kahuna-burgers(y) like(x,y) vincent(z) u order(z,u) whopper(u) Local consistency like(x,y)

39 Studying Presupposition The DORIS system allowed one to study the behaviour of presupposition Examples such as: –If Mia has a husband, then her husband is out of town. –If Mia is married, then her husband is out of town. –If Mia is dating Vincent, then her husband is out of town.

40 Applying Theorem Proving The first version of DORIS sort of worked, but… –Many readings to start with, explosion… –The local constraints added a large number of inference tasks –It could take about 10 minutes for a complex sentence

41 MathWeb MathWeb [by Kohlhase & Franke] came to the rescue Theorem proving services via the internet Interface Doris with MathWeb At the time this was a sensation! What happened exactly?

42 World Wide Computational Semantics Just in order to find out whether Mia was married or not, thousands of computers world wide were used…

43 What can we say about DORIS? Demonstrated that FOL could play an interesting role in natural language understanding Helped to develop presupposition theory Generated a new application area for automated deduction [TPTP] Limitations –Scalability –Coverage

44 Six feet under The DORIS system Why? –Limited grammatical coverage –Unfocussed application domain –It would take at least 20 years to develop a parser that was both robust and accurate [at least that was my belief]

45 The Second Story Godot the talking robot Or how a mobile, talking robot engaged in conversation using theorem provers and model builders for first-order logic…

46 Human-Computer Dialogue Focus on small domains –Grammatical coverage ensured –Background knowledge encoding Spoken Dialogue system = killer app –Godot the robot –Speech recognition and synthesis –People could give Godot directions, ask it questions, tell it new information –Godot was a REAL robot

47 Godot the Robot Godot with Tetsushi Oka

48 Typical conversation with Godot Simple dialogues in beginning: –Human: Robot? –Robot: Yes? –Human: Where are you? –Robot: I am in the hallway. –Human: OK. Go to the rest room!

49 Adding inference to Godot Using the components of DORIS, we added semantics and inference –Using DRT –Inference tasks: Consistency checking Informativity checking –Using FOL theorem provers and finite model builders

50 Advanced conversation with Godot Dealing with inconsistency and informativeness –Human: Robot, where are you? –Robot: I am in the hallway. –Human: You are in my office. –Robot: No, that is not true. –Human: You are in the hallway. –Robot: Yes I know!

51 Parallel inference Given a text T and a translation into FOL –If theorem prover finds a proof for, then T is inconsistent –If model builder finds a model for, then T is consistent In other words, this is the Yin and Yen of inference

52 Why is this relevant to natural language? Testing a discourse for consistency DiscourseTheorem proverModel builder

53 Why is this relevant to natural language? Testing a discourse for consistency DiscourseTheorem proverModel builder Vincent is a man. ??Model

54 Why is this relevant to natural language? Testing a discourse for consistency DiscourseTheorem proverModel builder Vincent is a man. ??Model Mia loves every man. ??Model

55 Why is this relevant to natural language? Testing a discourse for consistency DiscourseTheorem proverModel builder Vincent is a man. ??Model Mia loves every man. ??Model Mia does not love Vincent. Proof??

56 Why is this relevant to natural language? Testing a discourse for informativity DiscourseTheorem proverModel builder

57 Why is this relevant to natural language? Testing a discourse for informativity DiscourseTheorem proverModel builder Vincent is a man. ??Model

58 Why is this relevant to natural language? Testing a discourse for informativity DiscourseTheorem proverModel builder Vincent is a man. ??Model Mia loves every man. ??Model

59 Why is this relevant to natural language? Testing a discourse for informativity DiscourseTheorem proverModel builder Vincent is a man. ??Model Mia loves every man. ??Model Mia loves Vincent. Proof??

60 Minimal Models Model builders normally generate models by iteration over the domain size As a side-effect, the output is a model with a minimal domain size From a linguistic point of view, this is interesting, as there is no redundant information Minimal in extensions

61 Using models Examples: Turn on a light. Turn on every light. Turn on everything except the radio. Turn off the red light or the blue light. Turn on another light.

62 Videos of Godot Video 1: Godot in the basement of Bucceuch Place Video 2: Screenshot of dialogue manager with DRSs and camera view of Godot

63 What can we say about Godot? Demonstrated that FOL could play an interesting role in human machine dialogue systems Also showed a new application of finite model building Domain known means all background knowledge known Limitations –Scalability, only `small` dialogues –Lack of incremental inference –Minimal models required

64 Godot the Robot [later] Godot at the Scottish museum

65 The Third Story 2005-present Recognising Textual Entailment Or how first-order automated deduction is applied to wide- coverage semantic processing of texts

66 Recognising Textual Entailment What is it? –A task for NLP systems to recognise entailment between two (short) texts –Proved to be a difficult, but popular task. Organisation –Introduced in 2004/2005 as part of the PASCAL Network of Excellence, RTE-1 –A second challenge (RTE-2) was held in 2005/2006 –PASCAL provided a development and test set of several hundred examples

67 RTE Example (entailment) RTE 1977 (TRUE) His family has steadfastly denied the charges The charges were denied by his family.

68 RTE Example (no entailment) RTE 2030 (FALSE) Lyon is actually the gastronomical capital of France Lyon is the capital of France.

69 Aristotles Syllogisms All men are mortal. Socrates is a man Socrates is mortal. ARISTOTLE 1 (TRUE)

70 Five methods Five different methods to RTE Ranging in sophistication from very basic to advanced

71 Recognising Textual Entailment Method 1: Flipping a coin

72 Advantages –Easy to implement –Cheap Disadvantages –Just 50% accuracy

73 Recognising Textual Entailment Method 2: Calling a friend

74 Advantages –High accuracy (95%) Disadvantages –Lose friends –High phone bill

75 Recognising Textual Entailment Method 3: Ask the audience

76 RTE 893 (????) The first settlements on the site of Jakarta were established at the mouth of the Ciliwung, perhaps as early as the 5 th century AD The first settlements on the site of Jakarta were established as early as the 5 th century AD.

77 Human Upper Bound RTE 893 (TRUE) The first settlements on the site of Jakarta were established at the mouth of the Ciliwung, perhaps as early as the 5 th century AD The first settlements on the site of Jakarta were established as early as the 5 th century AD.

78 Recognising Textual Entailment Method 4: Word Overlap

79 Word Overlap Approaches Popular approach Ranging in sophistication from simple bag of word to use of WordNet Accuracy rates ca. 55%

80 Word Overlap Advantages –Relatively straightforward algorithm Disadvantages –Hardly better than flipping a coin

81 RTE State-of-the-Art Pascal RTE challenge Hard problem Requires semantics

82 Recognising Textual Entailment Method 5: Semantic Interpretation

83 Basic idea Given a textual entailment pair T/H with text T and hypothesis H: –Produce DRSs for T and H –Translate these DRSs into FOL –Generate Background Knowledge in FOL Use ATPs to determine the likelyhood of entailment

84 Wait a minute This requires that we have the means to produce semantic representations [DRSs] for any kind of English input Recall DORIS experience Do we have English parsers at our disposal that do this?

85 Robust Parsing Rapid developments in statistical parsing the last decades These parsers are trained on large annotated corpora [tree banks] Yet most of these parsers produced syntactic analyses not suitable for systematic semantic work This changed with the development of CCG bank and a fast CCG parser

86 Implementation CCG/DRT Use standard statistical techniques –Robust wide-coverage parser –Clark & Curran (ACL 2004) Grammar derived from CCGbank –409 different categories –Hockenmaier & Steedman (ACL 2002) Compositional Semantics, DRT –Wide-coverage semantics –Bos (IWCS 2005)

87 Example Output Example: Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group. Semantic representation, DRT Complete Wall Street Journal

88 Using Theorem Proving Given a textual entailment pair T/H with text T and hypothesis H: –Produce DRSs for T and H –Translate these DRSs into FOL –Give this to the theorem prover: T H If the theorem prover finds a proof, then we predict that T entails H

89 Vampire (Riazanov & Voronkov 2002) Lets try this. We will use the theorem prover Vampire This gives us good results for: –apposition –relative clauses –coodination –intersective adjectives/complements –passive/active alternations

90 Example (Vampire: proof) On Friday evening, a car bomb exploded outside a Shiite mosque in Iskandariyah, 30 miles south of the capital A bomb exploded outside a mosque. RTE (TRUE)

91 Example (Vampire: proof) Initially, the Bundesbank opposed the introduction of the euro but was compelled to accept it in light of the political pressure of the capitalist politicians who supported its introduction The introduction of the euro has been opposed. RTE (TRUE)

92 Background Knowledge However, it doesnt give us good results for cases requiring additional knowledge –Lexical knowledge –World knowledge We will use WordNet as a start to get additional knowledge All of WordNet is too much, so we create MiniWordNets

93 MiniWordNets –Use hyponym relations from WordNet to build an ontology –Do this only for the relevant symbols –Convert the ontology into first-order axioms

94 MiniWordNet: an example Example text: There is no asbestos in our products now. Neither Lorillard nor the researchers who studied the workers were aware of any research on smokers of the Kent cigarettes.

95 MiniWordNet: an example Example text: There is no asbestos in our products now. Neither Lorillard nor the researchers who studied the workers were aware of any research on smokers of the Kent cigarettes.

96

97 x(user(x) person(x)) x(worker(x) person(x)) x(researcher(x) person(x))

98 x(person(x) risk(x)) x(person(x) cigarette(x)) …….

99 Using Background Knowledge Given a textual entailment pair T/H with text T and hypothesis H: –Produce DRS for T and H –Translate drs(T) and drs(H) into FOL –Create Background Knowledge for T&H –Give this to the theorem prover: (BK & T) H

100 MiniWordNets at work Background Knowledge: x(soar(x) rise(x)) Crude oil prices soared to record levels Crude oil prices rise. RTE 1952 (TRUE)

101 Troubles with theorem proving Theorem provers are extremely precise. They wont tell you when there is almost a proof. Even if there is a little background knowledge missing, Vampire will say: don`t know

102 Vampire: no proof RTE 1049 (TRUE) Four Venezuelan firefighters who were traveling to a training course in Texas were killed when their sport utility vehicle drifted onto the shoulder of a Highway and struck a parked truck Four firefighters were killed in a car accident.

103 Using Model Building Need a robust way of inference Use model builders Mace, Paradox –McCune –Claessen & Sorensson (2003) Use size of (minimal) model –Compare size of model of T and T&H –If the difference is small, then it is likely that T entails H

104 Using Model Building Given a textual entailment pair T/H with text T and hypothesis H: –Produce DRSs for T and H –Translate these DRSs into FOL –Generate Background Knowledge –Give this to the Model Builder: i) BK & T ii) BK & T & H If the models for i) and ii) are similar, then we predict that T entails H

105 Model similarity When are two models similar? –Small difference in domain size –Small difference in predicate extensions

106 Example 1 T: John met Mary in Rome H: John met Mary Model T: 3 entities Model T+H: 3 entities Modelsize difference: 0 Prediction: entailment

107 Example 2 T: John met Mary H: John met Mary in Rome Model T: 2 entities Model T+H: 3 entities Modelsize difference: 1 Prediction: no entailment

108 Model size differences Of course this is a very rough approximation But it turns out to be a useful one Gives us a notion of robustness Negation –Give not T and not [T & H] to model builder Disjunction –Not necessarily one unique minimal model

109 How well does this work? We tried this at the RTE-1 and RTE-2 Using standard machine learning methods to build a decision tree using features –Proof (yes/no) –Domain size difference –Model size difference Better than baseline, still room for improvement

110 RTE Results 2004/5 Accuracy Shallow0.569 Deep0.562 Hybrid (S+D)0.577 Hybrid+Task0.612 Bos & Markert 2005

111 RTE State-of-the-Art Pascal RTE challenge Hard problem Requires semantics

112 What can we say about RTE? We can use FOL inference techniques successfully There might be an interesting role for model building The bottleneck is getting the right background knowledge

113 Lack of Background Knowledge RTE (TRUE) Indonesia says the oil blocks are within its borders, as does Malaysia, which has also sent warships to the area, claiming that its waters and airspace have been violated There is a territorial waters dispute.

114 Winding up Summary Conclusion Shameless Plug Future

115 Summary Use of first order inference tools has a major influence on how computational semantics is perceived today –Implementations used in pioneering work of using first-order inference in NLP –Implementations used in spoken dialogue systems –Now also used in wide-coverage NLP systems

116 Conclusions We have got the tools for doing computational semantics in a principled way using DRT For many applications, success depends on the ability to systematically generate background knowledge –Small restricted domains [dialogue] –Open domain Finite model building has potential Incremental inference

117 Shameless Plug For more on the basic architecture underlying this work on computational semantics, and particular on implementations on the lambda calculus, and parallel use of theorem provers and model builders, see:


Download ppt "Three Stories on Automated Reasoning for Natural Language Understanding Johan Bos University of Rome "La Sapienza Dipartimento di Informatica."

Similar presentations


Ads by Google