1 CS 4700: Foundations of Artificial Intelligence Carla P. Gomes Module: FOL Inference (Reading R&N: Chapter 8)

1 CS 4700: Foundations of Artificial Intelligence Carla P. Gomes gomes@cs.cornell.edu Module: FOL Inference (Reading R&N: Chapter 8)

2 Inference
How do we perform inference in first-order logic (FOL)? How do we derive new information?
Inference is "similar" to propositional logic, but it requires new "tricks" to deal with quantifiers and variables:
Unification: a substitution to match atomic sentences involving variables.
Skolemization: instantiation of existential variables to remove existential quantifiers.

3 Outline
Reducing first-order inference to propositional inference
Unification
Generalized Modus Ponens
Forward chaining
Backward chaining
Resolution

4 Universal instantiation (UI)
Every instantiation of a universally quantified sentence is entailed by it:
    ∀v α
    ───────────────
    Subst({v/g}, α)
for any variable v and ground term g.
E.g., ∀x King(x) ∧ Greedy(x) ⇒ Evil(x) yields:
    King(John) ∧ Greedy(John) ⇒ Evil(John)
    King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
    King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))

5 Existential instantiation (EI)
For any sentence α, variable v, and constant symbol k that does not appear elsewhere in the knowledge base:
    ∃v α
    ───────────────
    Subst({v/k}, α)
E.g., ∃x Crown(x) ∧ OnHead(x,John) yields:
    Crown(C1) ∧ OnHead(C1,John)
provided C1 is a new constant symbol, called a Skolem constant.

6 Reduction to propositional inference
Suppose the KB contains just the following:
    ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
    King(John)
    Greedy(John)
    Brother(Richard,John)
Instantiating the universal sentence in all possible ways, we have:
    King(John) ∧ Greedy(John) ⇒ Evil(John)
    King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
    King(John)
    Greedy(John)
    Brother(Richard,John)
The new KB is propositionalized: the proposition symbols are King(John), Greedy(John), Evil(John), King(Richard), etc.
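
The instantiation step above can be sketched in a few lines of Python. This is only an illustration under assumed conventions that are not part of the slides: an atom is a tuple such as ("King", "x"), lowercase strings are variables, capitalized strings are constants, and the helper names (is_var, subst, propositionalize) are hypothetical. It handles constants only, not function symbols.

```python
from itertools import product

# Assumed encoding: an atom is (predicate, arg1, ...); lowercase strings are
# variables and capitalized strings are constants.
def is_var(term):
    return isinstance(term, str) and term[0].islower()

def subst(theta, atom):
    """Apply the substitution theta (dict: variable -> constant) to one atom."""
    return (atom[0],) + tuple(theta.get(arg, arg) for arg in atom[1:])

def propositionalize(rule, constants):
    """Yield every ground instance of a rule given as (premises, conclusion)."""
    premises, conclusion = rule
    variables = sorted({arg for atom in premises + [conclusion]
                        for arg in atom[1:] if is_var(arg)})
    # Universal instantiation: try every assignment of constants to variables.
    for values in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, values))
        yield [subst(theta, p) for p in premises], subst(theta, conclusion)

# King(x) ∧ Greedy(x) ⇒ Evil(x), instantiated over the constants in the KB.
rule = ([("King", "x"), ("Greedy", "x")], ("Evil", "x"))
ground_rules = list(propositionalize(rule, ["John", "Richard"]))
for premises, conclusion in ground_rules:
    print(premises, "=>", conclusion)
```

With n variables and c constants this produces c^n ground rules, which is one reason the later slides turn to unification instead of blind instantiation.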

7 Reduction to Propositional Inference
Every FOL KB can be propositionalized so as to preserve entailment. (A ground sentence is entailed by the new KB iff it is entailed by the original KB.)
Idea: propositionalize the KB and the query, apply resolution, return the result.
It is often quite effective to propositionalize a theory to take advantage of fast propositional solvers.

8 Reduction to Propositional Inference: but, at a more theoretical level… (theoretical aspects beyond this course)
Problem: with function symbols, there are infinitely many ground terms,
–e.g., Father(Father(Father(John)))
Theorem: Herbrand (1930). If a sentence α is entailed by a FOL KB, it is entailed by a finite subset of the propositionalized KB.
Problem: this works if α is entailed, but loops if α is not entailed.
Theorem: Turing (1936), Church (1936). Entailment for FOL is semidecidable (algorithms exist that say yes to every entailed sentence, but no algorithm exists that also says no to every nonentailed sentence).

9 Unification
Dealing with variables.
Unification: a substitution to match atomic sentences, which makes two clauses resolvable.
UNIFY(P, Q) takes two atomic sentences P and Q and returns a substitution that makes P and Q look the same.
Rules for substitutions:
Can replace a variable by a constant. (v1 → C)
Can replace a variable by a variable. (v2 → v3)
Can replace a variable by a function expression, as long as the function expression does not contain the variable. (v4 → f(…))

10 Unification
Given:
    Knows(John,x) ⇒ Hates(John,x)
    Knows(John,Jim)
To perform resolution we need:
    Unifier θ = {x/Jim}
    Hates(John,Jim)

11 Unification
    ¬Knows(John,x) ∨ Hates(John,x)
and
    Knows(John,Jim)
How to resolve them? First match them.
Solution: UNIFY(Knows(John,x), Knows(John,Jim)) = {x/Jim}
The unifier θ = {x/Jim} gives
    ¬Knows(John,Jim) ∨ Hates(John,Jim)
and
    Knows(John,Jim)
Conclude by resolution: Hates(John,Jim)

12 Unification
General rule:
    Knows(John,x) ⇒ Hates(John,x)
Facts:
    Knows(John, Jim)
    Knows(y, Leo)
    Knows(y, Mother(y))
    Knows(y, Jane)
Matching facts to the general rules.

13 Unification: Standardizing Variables
    UNIFY(Knows(John,x), Knows(John,Jim)) = {x/Jim}
    UNIFY(Knows(John,x), Knows(y,Leo)) = {x/Leo, y/John}
    UNIFY(Knows(John,x), Knows(y,Mother(y))) = {y/John, x/Mother(John)}
    UNIFY(Knows(John,x), Knows(x,Jane)) = fail
The last case fails because x cannot be both John and Jane. But intuitively we know that John hates everyone he knows and that everyone knows Jane, so we should be able to infer that John hates Jane. This is why we require, if possible, that every variable has a separate name.
    UNIFY(Knows(John,x), Knows(y,Jane)) = {y/John, x/Jane}    works!
Standardizing apart eliminates overlap of variables, e.g., by rewriting Knows(x,Jane) as Knows(y,Jane).

14 Unification: Most General Unifier
To unify Knows(John,x) and Knows(y,z):
    θ = {y/John, x/z}
    or θ = {y/John, x/z, z/Mary}
    or θ = {y/John, x/John, z/John}
Choose the substitution that makes the least commitment (most general) about the bindings. The first unifier is more general than the second.
    MGU = {y/John, x/z}
There is a single most general unifier (MGU) that is unique up to renaming of variables.
See pages 277 and 278 for the unification algorithm, O(n²) (n = size of the expressions being checked).

15 The unification algorithm

16 The unification algorithm
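
As a rough companion to the unification slides, here is a minimal Python sketch of unification with an occurs check. It is not the textbook pseudocode; the term encoding (tuples for compound terms, lowercase strings for variables, capitalized strings for constants) and the helper names walk, occurs, and substitute are assumptions made for this example.

```python
def is_var(term):
    # Assumed convention: lowercase strings are variables.
    return isinstance(term, str) and term[:1].islower()

def walk(term, theta):
    """Follow variable bindings in theta until reaching a non-bound term."""
    while is_var(term) and term in theta:
        term = theta[term]
    return term

def occurs(var, term, theta):
    """Occurs check: does var appear inside term (under theta)?"""
    term = walk(term, theta)
    if term == var:
        return True
    return isinstance(term, tuple) and any(occurs(var, a, theta) for a in term[1:])

def unify(x, y, theta=None):
    """Return a substitution (dict) unifying x and y, or None on failure."""
    theta = {} if theta is None else theta
    x, y = walk(x, theta), walk(y, theta)
    if x == y:
        return theta
    if is_var(x):
        return None if occurs(x, y, theta) else {**theta, x: y}
    if is_var(y):
        return None if occurs(y, x, theta) else {**theta, y: x}
    if (isinstance(x, tuple) and isinstance(y, tuple)
            and x[0] == y[0] and len(x) == len(y)):
        for a, b in zip(x[1:], y[1:]):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

def substitute(term, theta):
    """Apply theta to a term, resolving chains of bindings."""
    term = walk(term, theta)
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(a, theta) for a in term[1:])
    return term

print(unify(("Knows", "John", "x"), ("Knows", "John", "Jim")))  # {'x': 'Jim'}
print(unify(("Knows", "John", "x"), ("Knows", "x", "Jane")))    # None
theta = unify(("Knows", "John", "x"), ("Knows", "y", ("Mother", "y")))
print(substitute("x", theta))                                   # ('Mother', 'John')
```

The sketch stores bindings in "triangular" form (x may be bound to Mother(y) while y is bound to John), so substitute is needed to read off the fully resolved term.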

17 Generalized Modus Ponens (GMP)
    p1', p2', …, pn', (p1 ∧ p2 ∧ … ∧ pn ⇒ q)
    ─────────────────────────────────────────
    qθ
where pi'θ = piθ for all i.
Example:
    p1' is King(John)    p1 is King(x)
    p2' is Greedy(y)     p2 is Greedy(x)
    θ is {x/John, y/John}
    q is Evil(x)         qθ is Evil(John)
All variables assumed universally quantified.
Horn definite clauses (exactly one positive literal) are a suitable normal form for use with GMP.

18 Soundness of GMP
Need to show that
    p1', …, pn', (p1 ∧ … ∧ pn ⇒ q) ╞ qθ
provided that pi'θ = piθ for all i.
Lemma: For any sentence p, we have p ╞ pθ by UI.
1. (p1 ∧ … ∧ pn ⇒ q) ╞ (p1 ∧ … ∧ pn ⇒ q)θ = (p1θ ∧ … ∧ pnθ ⇒ qθ)
2. p1', …, pn' ╞ p1' ∧ … ∧ pn' ╞ p1'θ ∧ … ∧ pn'θ
3. From 1 and 2, qθ follows by ordinary Modus Ponens.

19 Example knowledge base
The law says that it is a crime for an American to sell weapons to hostile nations. The country Nono, an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel West, who is American.
Prove that Colonel West is a criminal.

20 Example knowledge base contd.
... it is a crime for an American to sell weapons to hostile nations:
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
Nono … has some missiles, i.e., ∃x Owns(Nono,x) ∧ Missile(x):
    Owns(Nono,M1) and Missile(M1)    (facts added using the Skolem constant M1)
… all of its missiles were sold to it by Colonel West:
    Missile(u) ∧ Owns(Nono,u) ⇒ Sells(West,u,Nono)
Missiles are weapons:
    Missile(v) ⇒ Weapon(v)
An enemy of America counts as "hostile":
    Enemy(t,America) ⇒ Hostile(t)
West, who is American …
    American(West)
The country Nono, an enemy of America …
    Enemy(Nono,America)
All variables assumed universally quantified; quantifiers omitted.

21 Forward chaining algorithm
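
A naive forward chainer for the crime knowledge base can be sketched as follows. This is a simplified illustration, not the textbook algorithm: it assumes a Datalog-style KB (no function symbols, all facts ground), encodes atoms as tuples with lowercase variables, and uses hypothetical helper names (match, match_all, forward_chain).

```python
def is_var(term):
    # Assumed convention: lowercase strings are variables.
    return term[0].islower()

def match(pattern, fact, theta):
    """Extend theta so that pattern equals the ground fact, or return None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.setdefault(p, f) != f:   # conflicting earlier binding
                return None
        elif p != f:
            return None
    return theta

def match_all(premises, facts, theta):
    """Yield every substitution that satisfies all premises against facts."""
    if not premises:
        yield theta
        return
    for fact in facts:
        extended = match(premises[0], fact, theta)
        if extended is not None:
            yield from match_all(premises[1:], facts, extended)

def forward_chain(rules, facts, query):
    """Fire rules until the ground query is derived or nothing new appears."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in match_all(premises, facts, {}):
                derived = (conclusion[0],) + tuple(
                    theta.get(a, a) for a in conclusion[1:])
                if derived not in facts:
                    new.add(derived)
        if query in facts | new:
            return True
        if not new:
            return False
        facts |= new

rules = [
    ([("American", "x"), ("Weapon", "y"), ("Sells", "x", "y", "z"),
      ("Hostile", "z")], ("Criminal", "x")),
    ([("Missile", "u"), ("Owns", "Nono", "u")], ("Sells", "West", "u", "Nono")),
    ([("Missile", "v")], ("Weapon", "v")),
    ([("Enemy", "t", "America")], ("Hostile", "t")),
]
facts = [("Owns", "Nono", "M1"), ("Missile", "M1"),
         ("American", "West"), ("Enemy", "Nono", "America")]

print(forward_chain(rules, facts, ("Criminal", "West")))  # True
```

The first pass derives Sells(West,M1,Nono), Weapon(M1), and Hostile(Nono); the second pass derives Criminal(West). The sketch rematches every rule on every pass, so it does not include the incremental or indexing improvements discussed on later slides.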

22 Forward chaining proof All variables assumed universally quantified; quantifiers omitted.

23 Forward chaining proof
First iteration: each rule fires with the substitution shown.
    Missile(u) ∧ Owns(Nono,u) ⇒ Sells(West,u,Nono)    {u/M1}
    Missile(v) ⇒ Weapon(v)    {v/M1}
    Enemy(t,America) ⇒ Hostile(t)    {t/Nono}
All variables assumed universally quantified; quantifiers omitted.

24 Forward chaining proof
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)    {x/West, y/M1, z/Nono}
Substitutions from the previous step: {u/M1}, {v/M1}, {t/Nono}
All variables assumed universally quantified; quantifiers omitted.

25 Properties of forward chaining
Sound and complete for first-order Horn definite clauses.
Datalog = first-order definite clauses + no functions.
FC terminates for Datalog in a finite number of iterations.
May not terminate in general if α is not entailed.
This is unavoidable: entailment with definite clauses is semidecidable.
Improvements: incremental forward chaining.
Forward chaining is widely used in deductive databases.

26 Efficiency of forward chaining
Incremental forward chaining: no need to match a rule on iteration k if a premise wasn't added on iteration k-1
⇒ match each rule whose premise contains a newly added positive literal.
Matching itself can be expensive:
Database indexing allows O(1) retrieval of known facts
–e.g., query Missile(x) retrieves Missile(M1)
Forward chaining is widely used in deductive databases.

27 Hard matching example
    Diff(wa,nt) ∧ Diff(wa,sa) ∧ Diff(nt,q) ∧ Diff(nt,sa) ∧ Diff(q,nsw) ∧ Diff(q,sa) ∧ Diff(nsw,v) ∧ Diff(nsw,sa) ∧ Diff(v,sa) ⇒ Colorable()
    Diff(Red,Blue)    Diff(Red,Green)
    Diff(Green,Red)   Diff(Green,Blue)
    Diff(Blue,Red)    Diff(Blue,Green)
Colorable() is inferred iff the CSP has a solution.
CSPs include 3SAT as a special case, hence matching is NP-hard.

28 Backward chaining algorithm
    SUBST(COMPOSE(θ1, θ2), p) = SUBST(θ2, SUBST(θ1, p))
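
A depth-first, left-to-right backward chainer can be sketched in Python as below. It is an illustration under stated assumptions rather than the textbook pseudocode: atoms are flat tuples (no function terms), lowercase strings are variables, facts are rules with no premises, unify omits the occurs check, and instead of calling COMPOSE explicitly it threads one accumulated substitution through the search. The names rename and prove are hypothetical.

```python
from itertools import count

def is_var(term):
    # Assumed convention: lowercase strings are variables.
    return isinstance(term, str) and term[:1].islower()

def walk(term, theta):
    """Follow variable bindings in theta."""
    while is_var(term) and term in theta:
        term = theta[term]
    return term

def unify(x, y, theta):
    """Unify two atoms or terms under theta (no occurs check in this sketch)."""
    x, y = walk(x, theta), walk(y, theta)
    if x == y:
        return theta
    if is_var(x):
        return {**theta, x: y}
    if is_var(y):
        return {**theta, y: x}
    if (isinstance(x, tuple) and isinstance(y, tuple)
            and x[0] == y[0] and len(x) == len(y)):
        for a, b in zip(x[1:], y[1:]):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

_fresh = count()

def rename(rule):
    """Standardize apart: give the rule's variables fresh names."""
    n = next(_fresh)
    def fresh_atom(atom):
        return (atom[0],) + tuple(f"{a}_{n}" if is_var(a) else a for a in atom[1:])
    premises, conclusion = rule
    return [fresh_atom(p) for p in premises], fresh_atom(conclusion)

def prove(goals, theta, rules):
    """Yield every substitution under which all goals follow from rules."""
    if not goals:
        yield theta
        return
    first, rest = goals[0], goals[1:]
    for rule in rules:
        premises, conclusion = rename(rule)
        extended = unify(conclusion, first, theta)
        if extended is not None:
            # Subgoals of the chosen rule are proved before the remaining goals.
            yield from prove(premises + rest, extended, rules)

KB = [
    ([("American", "x"), ("Weapon", "y"), ("Sells", "x", "y", "z"),
      ("Hostile", "z")], ("Criminal", "x")),
    ([("Missile", "u"), ("Owns", "Nono", "u")], ("Sells", "West", "u", "Nono")),
    ([("Missile", "v")], ("Weapon", "v")),
    ([("Enemy", "t", "America")], ("Hostile", "t")),
    ([], ("Owns", "Nono", "M1")), ([], ("Missile", "M1")),
    ([], ("American", "West")), ([], ("Enemy", "Nono", "America")),
]

answer = next(prove([("Criminal", "who")], {}, KB))
print(walk("who", answer))  # West
```

Because the search is depth-first with no loop checking, a recursive rule set could make prove run forever; the crime KB has no recursion, so the search here terminates.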

29 Backward chaining example

30 Backward chaining example
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)

31 Backward chaining example
    Missile(x) ⇒ Weapon(x)
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)

32 Backward chaining example
    Missile(x) ⇒ Weapon(x)
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)

33 Backward chaining example
    Missile(x) ⇒ Weapon(x)
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
    Missile(M1)

34 Backward chaining example
    Missile(x) ⇒ Weapon(x)
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
    Missile(M1)
    Owns(Nono,M1)

35 Backward chaining example
    Missile(x) ⇒ Weapon(x)
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
    Missile(M1)
    Owns(Nono,M1)
    Enemy(Nono,America)
Note: once one subgoal in a conjunction succeeds, its substitution is applied to the subsequent subgoals. E.g., y is bound to M1 and z is bound to Nono.

36 Backward chaining example
    Missile(x) ⇒ Weapon(x)
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
    Missile(M1)
    Owns(Nono,M1)
    Enemy(Nono,America)

37 Properties of backward chaining
Depth-first recursive proof search: space is linear in the size of the proof.
Incomplete due to infinite loops
– ⇒ fix by checking the current goal against every goal on the stack
Inefficient due to repeated subgoals (both success and failure)
– ⇒ fix using caching of previous results (extra space)
Widely used for logic programming.

38 Logic programming: Prolog
Algorithm = Logic + Control
Basis: backward chaining with Horn clauses + bells & whistles
Widely used in Europe, Japan (basis of 5th Generation project)
Interpreted or compiled (intermediate language, e.g., Lisp, C)
Program = set of clauses = head :- literal1, … literaln.
    criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).
Depth-first, left-to-right backward chaining
Built-in predicates for arithmetic etc., e.g., X is Y*Z+3
Built-in predicates that have side effects (e.g., input and output predicates, assert/retract predicates)
Closed-world assumption ("negation as failure")
–e.g., given alive(X) :- not dead(X).
–alive(joe) succeeds if dead(joe) fails

39 Prolog (Programming in Logic)
What is Prolog?
–Full-featured programming language
–Programs consist of logical formulas
–Running a program means proving a theorem
Syntax of Prolog
–Predicates, objects, and functions: cat(tuna), append(a,pair(b))
–Variables: X, Y, List (capitalized)
–Facts: university(cornell). prepend(a,pair(a,X)).
–Rules: animal(X) :- cat(X). student(X) :- person(X), enrolled(X,Y), university(Y).
  ⇒ implication ":-" with a single predicate on the left and only non-negated predicates on the right. All variables implicitly "forall" quantified.
–Queries: student(X).
  ⇒ All variables implicitly "exists" quantified.

40 Resolution

41 Resolution
Propositional logic:
    p ∨ q    ¬p ∨ r
    ───────────────
    q ∨ r
FOL: "similar" to propositional logic, but it requires new "tricks" to deal with quantifiers and variables.

42 Resolution
1 – Put in clausal form. All variables universally quantified (standardize names apart).
    Main trick: "Skolemization" to remove existential quantifiers.
    Idea: invent names for unknown objects known to exist. Two cases:
    Constant – existential variable not in the scope of any other variable ⇒ Skolem constant
    Function – existential variable in the scope of other variables ⇒ Skolem function
2 – Use unification to match atomic sentences.
3 – Apply the resolution rule to the clausal set combined with the negated goal. Attempt to derive the empty clause.

43 Eliminate Existential Quantifiers: Skolemization
Existential variable not in the scope of any other variable: replace it with a Skolem constant.
Existential variable in the scope of other variables: replace it with a Skolem function. There is one argument for each universally quantified variable whose scope contains the Skolem function.
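
As a worked illustration of the two cases, the following uses sentences that appear elsewhere in these slides; the Skolem symbols C_1 and G are simply fresh names, and the arrow notation is an informal "becomes", not standard syntax from the course.

```latex
\begin{align*}
\exists x\; \mathit{Crown}(x) \land \mathit{OnHead}(x, \mathit{John})
  &\;\longrightarrow\; \mathit{Crown}(C_1) \land \mathit{OnHead}(C_1, \mathit{John})
  && \text{(Skolem constant)} \\
\forall x\; \exists z\; \mathit{Loves}(z, x)
  &\;\longrightarrow\; \forall x\; \mathit{Loves}(G(x), x)
  && \text{(Skolem function of } x\text{)}
\end{align*}
```

In the second line z depends on x, so a single constant would wrongly assert that one individual loves everyone; the function G(x) keeps the dependence.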

44 Resolution: brief summary
The two clauses are assumed to be standardized apart so that they share no variables.
For example,
    ¬Rich(x) ∨ Unhappy(x)    Rich(Ken)
    ──────────────────────────────────
    Unhappy(Ken)
with θ = {x/Ken}.
Apply resolution steps to CNF(KB ∧ ¬α); refutation complete for FOL.

45 Algorithm: Putting Axioms into Clausal Form
1. Eliminate biconditionals and implications.
2. Move the negations inwards.
3. Eliminate the existential quantifiers.
4. Rename the variables, if necessary.
5. Move the universal quantifiers to the left.
6. Move the disjunctions down to the literals.
7. Eliminate the conjunctions.
8. Rename the variables, if necessary.
9. Eliminate the universal quantifiers.

46 Conversion to CNF
Everyone who loves all animals is loved by someone:
    ∀x { [∀y Animal(y) ⇒ Loves(x,y)] ⇒ [∃y Loves(y,x)] }
1. Eliminate biconditionals and implications
    ∀x { ¬[∀y Animal(y) ⇒ Loves(x,y)] ∨ [∃y Loves(y,x)] }    (eliminate main implication)
    ∀x { ¬[∀y (¬Animal(y) ∨ Loves(x,y))] ∨ [∃y Loves(y,x)] }    (eliminate other implication)
2. Move ¬ inwards: ¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p
    ∀x [∃y ¬(¬Animal(y) ∨ Loves(x,y))] ∨ [∃y Loves(y,x)]
    ∀x [∃y ¬¬Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]    (de Morgan's law)
    ∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]    (double negation)

47 Conversion to CNF contd.
3. Standardize variables: each quantifier should use a different one
    ∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃z Loves(z,x)]
4. Skolemize: a more general form of existential instantiation. Each existential variable is replaced by a Skolem function of the enclosing universally quantified variables:
    ∀x [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
5. Drop universal quantifiers:
    [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
6. Distribute ∨ over ∧:
    [Animal(F(x)) ∨ Loves(G(x),x)] ∧ [¬Loves(x,F(x)) ∨ Loves(G(x),x)]

48 Resolution: Example Jack owns a dog. Every dog owner is an animal lover. No animal lover kills an animal. Either Jack or Curiosity killed the cat, who is named Tuna. Did Curiosity kill the cat?

49 Original Sentences (Plus Background Knowledge)
Jack owns a dog. (The dog is named by a Skolem constant.)
Every dog owner is an animal lover.
No animal lover kills an animal. (Uses ¬∃x p ≡ ∀x ¬p.)
Either Jack or Curiosity killed the cat, who is named Tuna.
Theorem: Kills(Curiosity,Tuna)

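50 Proof by Resolution
Negation of theorem: ¬Kills(Curiosity,Tuna)
1. ¬Kills(Curiosity,Tuna) with Kills(Jack,Tuna) ∨ Kills(Curiosity,Tuna), θ = {}: Kills(Jack,Tuna)
2. Kills(Jack,Tuna) with ¬AnimalLover(w) ∨ ¬Animal(v) ∨ ¬Kills(w,v), θ = {w/Jack, v/Tuna}: ¬AnimalLover(Jack) ∨ ¬Animal(Tuna)
3. ¬AnimalLover(Jack) ∨ ¬Animal(Tuna) with Animal(z) ∨ ¬Cat(z), θ = {z/Tuna}: ¬AnimalLover(Jack) ∨ ¬Cat(Tuna)
4. ¬AnimalLover(Jack) ∨ ¬Cat(Tuna) with Cat(Tuna), θ = {}: ¬AnimalLover(Jack)
5. ¬AnimalLover(Jack) with ¬Dog(y) ∨ ¬Owns(x,y) ∨ AnimalLover(x), θ = {x/Jack}: ¬Dog(y) ∨ ¬Owns(Jack,y)
6. ¬Dog(y) ∨ ¬Owns(Jack,y) with Dog(D), θ = {y/D}: ¬Owns(Jack,D)
7. ¬Owns(Jack,D) with Owns(Jack,D): NIL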
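
The refutation can be checked mechanically at the ground level. The sketch below is a simplification, not a first-order prover: it assumes the unifiers from the proof ({w/Jack, v/Tuna}, {z/Tuna}, {x/Jack, y/D}) have already been applied, so each clause is a set of ground literals written as strings with "~" for negation, and it saturates with propositional resolution. The function names negate, resolvents, and refutes are hypothetical.

```python
from itertools import combinations

def negate(literal):
    return literal[1:] if literal.startswith("~") else "~" + literal

def resolvents(c1, c2):
    """Yield every clause obtained by resolving c1 with c2 on one literal."""
    for literal in c1:
        if negate(literal) in c2:
            yield (c1 - {literal}) | (c2 - {negate(literal)})

def refutes(clauses):
    """Return True iff the empty clause is derivable by resolution."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for resolvent in resolvents(c1, c2):
                if not resolvent:          # empty clause: contradiction found
                    return True
                new.add(resolvent)
        if new <= clauses:                 # saturated without contradiction
            return False
        clauses |= new

# Ground instances of the clauses used in the proof.
kb = [frozenset(c) for c in (
    {"Dog(D)"},
    {"Owns(Jack,D)"},
    {"~Dog(D)", "~Owns(Jack,D)", "AnimalLover(Jack)"},
    {"~AnimalLover(Jack)", "~Animal(Tuna)", "~Kills(Jack,Tuna)"},
    {"Kills(Jack,Tuna)", "Kills(Curiosity,Tuna)"},
    {"Cat(Tuna)"},
    {"~Cat(Tuna)", "Animal(Tuna)"},
)]
negated_goal = frozenset({"~Kills(Curiosity,Tuna)"})

print(refutes(kb + [negated_goal]))  # True: Curiosity killed the cat
print(refutes(kb))                   # False: the KB alone is consistent
```

Adding the negated theorem makes the clause set unsatisfiable, which is what the derivation of NIL above shows; without it, saturation ends with no empty clause.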

51 Resolution proof: definite clauses

52 Resolution is Refutation Complete
Resolution with unification, applied to clausal form, is refutation complete!
Interesting proof! It is based on building an "artificial" domain of interpretation called the Herbrand universe. See R&N pages 300-303.

53 Proofs can be Lengthy
A relatively straightforward KB can quickly overwhelm general resolution methods.
Theorem provers are in general based on resolution strategies that can reduce the problem somewhat, but not completely.
As a consequence, many practical Knowledge Representation formalisms in AI use a restricted form and specialized inference.
–Logic programming (Prolog)
–Datalog – definite clauses, no functions
–Production or expert systems (rule-based systems)
–Frame systems and semantic networks (chapter 10)
–Description logics (chapter 10)

54 Successes in Rule-Based Reasoning
Expert systems
DENDRAL (Buchanan et al., 1969)
MYCIN (Feigenbaum, Buchanan, Shortliffe)
PROSPECTOR (Duda et al., 1979)
R1 (McDermott, 1982)

55 Successes in Rule-Based Reasoning
DENDRAL (Buchanan et al., 1969)
–Infers molecular structure from the information provided by a mass spectrometer
–Generate-and-test method

56 Successes in Rule-Based Reasoning
MYCIN (Feigenbaum, Buchanan, Shortliffe)
–Diagnosis of blood infections
–450 rules; performs as well as experts
–Incorporated certainty factors

57 Successes in Rule-Based Reasoning
PROSPECTOR (Duda et al., 1979)
–Correctly recommended exploratory drilling at geological site
–Rule-based system founded on probability theory
R1 (McDermott, 1982)
–Designs configurations of computer components
–About 10,000 rules
–Uses meta-rules to change context

58 Prominent expert systems
CADUCEUS (expert system) - Blood-borne infectious bacteria
Dendral - Analysis of mass spectra
Jess - Java Expert System Shell. A CLIPS engine implemented in Java used in the development of expert systems
LogicNets - Web-based expert system modeling environment to create expert systems (in collaboration with NASA)
Mycin - Diagnose infectious blood diseases and recommend antibiotics (by Stanford University)
NEXPERT Object - An early general-purpose commercial backward-chaining inference engine used in the development of expert systems
Prolog - Programming language used in the development of expert systems
R1 (expert system)/XCon - Order processing
STD Wizard - Expert system for recommending medical screening tests
PyKe - Pyke is a knowledge-based inference engine (expert system)
