Resolution Theorem Prover in First-Order Logic

Resolution Theorem Prover in First-Order Logic
Based on the slides of Thorsten Joachims

Outline Last lecture: Today’s lecture:
Some definitions for model-based diagnosis Reiter’s MBD algorithm using HS-trees Today’s lecture: First order logic and knowledge representation Resolution proof Unification Resolution as search

First-Order Logic Objects: cs472, fred, ph219, emptylist …
Relations/Predicates: is_Man(fred), Located(cs472, ph219) … Note: Relations typically correspond to verbs Connectives: , , , ,  Quantifiers: Universal: x: ( is_Man(x)  is_Mortal(x) ) Existential: y: ( is_Father(y, fred) ) Why is propositional logic not expressive enough? Example: A=FredIsFatherOfLaura, B=KarlIsFatherOfFred C=JohnIsFatherOfKelly, D=KellyIsMotherOfTodd  Cannot define general rule for parent like “x is parent of y if x is father of y or x is mother of y”.  Cannot define general rule for grandfather like “x Grandfather of z if x is father of y and y is father of z or y is mother of z”. Propositional logic: assume that the world consists only of facts/propositions FOL (predicate calculus): recognizes that the world consists of object with properties that distinguish them from one another; allows the use of predicates to define the relations that hold among objects. Functions are a way to automatically construct object names (we will mostly ignore functions here). Propositions represent facts about the world or objects. Operators or connectives relate those facts to one another. A predicate describes a relation or operation on a variable. is-father-of(John, Kelly). What about representing the classic sentence: “All men are mortal”. Need variables, quantification. Need 1st-order predicate logic. With objects, we’ll want a way to express properties of collections of objects. Hence, the need for quantifiers. Socrates is a object/constant. Is_Man is a predicate. Example: forall x forall y: is_Father(x, y)  is_Parent(x, y) forall x forall y: is_Mother(x, y)  is_Parent(x, y) forall x forall y forall z : is_Father(x, y) and is_Father/is_Parent(y, z)  is_Grandfather(x, z)

Example: Representing Facts in FOL
Lucy is a professor All professors are people. Fuchs is the dean. Deans are professors. All professors consider the dean a friend or don’t know him. Everyone is a friend of someone. People only criticize people that are not their friends. Lucy criticized Fuchs. On board. Rich, p. 134 To solve problems in logic framework: (1) represent “world” (2) proof methods. Focus on computational aspects – how would we get a machine to do it? Look at a system commissioned by the dean’s office for deciding who (which new faculty) will back Fuch’s policies and who won’t... is-prof(lucy) Vx: ( is-prof(x) → is_person(x) ) Ignored the fact that proper names may refer to more than 1 individual. Same name. Sometimes deciding which of several people of same name is being referred to in a statement is difficult (need to do this in a semantic net as well – lucy can’t both be a cartoon character and a CS Prof – so there must be two individuals named lucy). is-dean(fuchs) Vx: (is-dean(x)  is-prof(x)) Or can be inclusive or exclusive. (inclusive representation shown here) Seem like the exclusive or meaning is what’s really meant. (show that statement as well.) Vx ( Vy: ( is-prof(x) and is-dean(y) → is-friend-of(y,x) V \neg know(x, y) ) ) Note: read is-friend-of(y,x) as “y is a friend of x” 5. Ambiguous statement. Big problem for NLP system in determining the scope of quantifiers. (1) For each person, there exists someone to who he/she is a friend. Vx: \exists y: is-friend-of (y,x) (2) exists y: for all x: is-friend-of(y,x)…exists someone whom everybody considers a friend. Vx: ( \exists y: ( is-friend-of (y, x) ) ) 6. Ambiguous sentence again. Vx: Vy: is-person(x) ^ is-person (y) ^ criticize (x,y) → \neg is-friend-of (y,x) 7. criticize(lucy,fuchs)  Should be clear how difficult it is to convert sentences into logical statements. 9. Suppose we want to use the statements to answer the question: Is lucy friend of Fuchs? is-friend-of(fuchs,lucy) ?

Example: knowledge base
is-prof(lucy) x ( is-prof(x)  is-person(x) ) is-dean(fuchs) x (is-dean(x)  is-prof(x)) x (y ( is-prof(x)  is-dean(y)  is-friend-of(y,x)  knows(x, y) ) ) x (y ( is-friend-of (y, x) ) ) x (y (is-person(x)  is-person(y)  criticize (x,y)  is-friend-of (y,x))) criticize(lucy,fuchs) Question: Is Fuchs is friend of Lucy? is-friend-of(fuchs,lucy) Proof by Backward Chaining: Goal: \neg is-friend-of(fuchs,lucy) Modus Ponens: is-person(lucy) \wegde is-person(fuchs) \wedge criticize(lucy,fuchs) Known: is-person(lucy) \wegde is-person(fuchs) Modus Ponens: is-prof(lucy) \wegde is-person(fuchs) Known: is-person(fuchs) Modus Ponens: is-prof(fuchs) Modus Ponens: is-dean(fuchs) Known:

Some definitions for model-based diagnosis Reiter’s MBD algorithm using HS-trees Today’s lecture: First order logic and knowledge representation Resolution proof Unification Resolution as search The goal of this lecture is to present a practical voting for coordination in MAS. I emphasize practical, since most of the works in voting focus on theoretical aspects, while this work tries to examine voting with minimal communication. We will open the talk by presenting the importance of voting in MAS. The we will discuss how to find winners where the agents send incomplete set of preferences. We will present two algorithms, the first finds the optimal winner and save approximately 50% of the communications. The second algorithm is greedy and so not optimal, but saves approximately 90% of the communication but finds a winner which is very close to the winner in complete setting.

Resolution Rule of Inference
General Rule: Note: Eij can be negated. Example: Resolution – proof procedure that is complete (for refutation proofs). It exploits the resolution rule listed here to figure out what’s left when you consider a literal and its negation. Show resolution rule. Is this really true? Suppose e2 is true…Suppose e2 is false. So e1 v e3 is true as long as we know that both of the other expressions are true. e1 v e3 is called the resolvent. It’s easy to generalize resolution s.t. there can be any number of disjuncts (including just one), in either of the two resolving expressions. The only demand is that one resolving expression must have a disjunct that is the negation of a disjunct in the other resolving expression.

Algorithm: Resolution Proof
Negate the theorem to be proved, and add to the KB. Convert KB into conjunctive normal form (CNF) CNF: conjunctions of disjunctions Each disjunction is called a clause. Until there is no resolvable pair of clauses, Find resolvable clauses and resolve them. Add the results of resolution to the knowledge base. If NIL (empty clause) is produced, stop and report that the (original) theorem is true. Report that the (original) theorem is false. Resolution proofs perform a proof by contradiction/refutation. Conjunctive normal form (also sometimes called clause form). Every sentence is a disjunction of literals; axioms in KB have an implicit AND between sentences.

Resolution Example: FOL
Example: Prove bird(tweety) Axioms: Regular 1: 2: 3: 4: CNF NB Add negation Resolution: more complicated in FOL. You can use resolution to reach the same conclusions as you would with modeus ponens (and substitution). Shows axiom 1 and 2. Rewrite 1 in clause form: $\forall x: \neg feathers(x) \vee bird (x)$ 3: Add negation of sentence to prove: $\neg bird(tweety)$ 4: Resolve 3 and 1, specializing Tweety for x. Get $\neg feathers(tweety)$ 5: Resolve 4 and 2. Get NIL. kjyt So conclude bird (tweety). Modus ponens can be viewed as a special case of resolution. Resolution Proof Resolve 3 and 1, specializing (i.e. “unifying”) tweety for x. Add :feathers(tweety) Resolve 4 and 2. Add NIL: conclude bird(tweety)

Resolution Theorem Proving
Properties of Resolution Theorem Proving: sound (for propositional and FOL) (refutation) complete (for propositional and FOL) Procedure may seem heavy but note that can be easily automated. Just “smash” clauses until empty clause or no more new clauses. Proof procedure is sound and complete (for propositional and FOL). Why is method with multiple rules of inference more difficult to implement? What about length of resolution proof? Consider Pigeon-Hole (PH) problem: Formula encodes that you cannot place n+1 pigeons in n holes (one per hole). Cook/Karp around 1971/1972. “Resolved” by Armin Haken Related to NP vs. co-NP questions. PH takes exponentially many steps! (no matter in what order.) PH hidden in many practical problems. Makes theorem-proving expensive. Partly, led to recent move to model-based methods (NP-complete). Resolution proof of inconsistency requires at least an exponential number of clauses, no matter in what order how you resolve things! “Method can’t count.” Some inference rules for propositional logic: modus ponens, and-elimination (from a conjunction, can infer any conjuncts), or-elimination (from a sentence, can infer its disjunction w/ anything else, double negative elimination, universal elimination, existential elimination and introduction.

Unification Unify procedure: Unify(P,Q) takes two atomic (i.e. single predicates) sentences P and Q and returns a substitution that makes P and Q identical. Rules for substitutions: Can replace a variable by a constant. Can replace a variable by a variable. Can replace a variable by a function expression, as long as the function expression does not contain the variable. Unifier: a substitution that makes two clauses resolvable. To resolve two clauses, two literals must match exactly, except that one is negated. Sometimes the literals match exactly as they are, but other times one can be made to match the other by an appropriate substitution. This requires unification. (We’ve actually seen how to perform unification when we looked at backward chaining in rule-based systems.) Denote substitutions as shown. Variable v1 is replaced by the constant c; variable v4 is replaced by the function f and its arguments.

Unification - Purpose Given: Knows (John, x)  Hates (John, x)
Knows (John, Jim) Derive: Hates (John, Jim) Unification: Need unifier {x/Jim} for resolution to work. Add to knowledge base: We’ll look at unification in the context of some examples. Given: John hates everyone he knows; and John knows Jim. Need: substitution of Jim for x

Unification (example)
Use KB to find “who does John hate?” x: Hates(John, x) Knowledge base (in clause form): Knows (John, v)  Hates (John, v) Knows (John, Jim) DB Knows (y, Leo) Knows (z, Mother(z)) Hates (John, x) (since : x: Hates (John, x)  x: Hates (John,x)) Resolution with 5 and 1: unify(Hates(John, x), Hates(John, v)) = {x/v} Knows (John, v) Resolution with 6 and 2: unify(Knows(John, v), Knows(John, Jim))= {v/Jim} or resolution with 6 and 3: unify(Knows (John, v), Knows (y, Leo)) = {y/John, v/Leo} or Resolution with 6 and 4: unify(Knows (John, v), Knows (z, Mother(z))) = {z/John, v/Mother(z)} A more complicated example: Want to use the KB to find out who John hates. Need to find all sentences in the KB that unify with Knows (John,x) and then apply the unifier to Hates (John,x) Remember that x, y, and z are universally quantified NOTE: note that unify(knows(John,x),knows(x,Jane)) fails, because x can’t take on both the value John and the value Jane. But intuitively we know that everyone John knows he hates and everyone knows Jane so we should be able to infer that John hates Jane. This is why we require (in conversion to clausal form) that every variable, if possible, have a separate name. Knows(John,x) and Knows(w, Jane) works.

Unification (example) cont.
Answers: Hates(John,x) with {x/v, v/Jim} (i.e. John hates Jim) Hates(John,x) with {x/v, y/John, v/Leo} (i.e. John hates Leo) Hates(John,x) with {x/v, v/Mother(z), z/John} (i.e. John hates his mother) A more complicated example: Want to use the KB to find out who John hates. Need to find all sentences in the KB that unify with Knows (John,x) and then apply the unifier to Hates (John,x) Remember that x, y, and z are universally quantified NOTE: note that unify(knows(John,x),knows(x,Jane)) fails, because x can’t take on both the value John and the value Jane. But intuitively we know that everyone John knows he hates and everyone knows Jane so we should be able to infer that John hates Jane. This is why we require (in conversion to clausal form) that every variable, if possible, have a separate name. Knows(John,x) and Knows(w, Jane) works.

Converting More Complicated Sentences to CNF
Example we’ve seen is really simple. To handle harder proofs, we need to understand various manipulations that transform arbitrary logic expressions into a form that enables resolution. Resolution only deals with axioms in clause form axioms that are all disjunctions of literals. An axiom involving bricks will illustrate some of the manipulations. Meaning of Sentence: First, bricks are on something else that is not a pyramid; Second, there is nothing that a brick is on and that is on the brick as well. Third, there is nothing that is not a brick and also is the same thing as the brick. As given, the axiom can’t be used to produce resolvents because it’s not in clause form. Convert it into one or more equivalent axioms in clause form. Will lead to 4 new axioms and requires the introduction of another function, support.

Algorithm: Putting Axioms into Clause Form
Eliminate the implications. Move the negations to the atomic formulas. Eliminate the existential quantifiers. Rename the variables, if necessary. Move the universal quantifiers to the left. Move the disjunctions down to the literals. Eliminate the conjunctions. Eliminate the universal quantifiers. Still need to deal with variables during resolution. Unification: needed to match variables and terms between clauses that look similar. See R&N p

1. Eliminate Implications
Substitute (E1  E2) for (E1 → E2) We’ve seen this one.

2. Move negations down to the atomic formulas
Equivalence Transformations: Result: Doing this step requires a number of identities, one for dealing with the negation of & expressions, one for V expressions, one for \neg expressions and one each for \forall and \exists. For the example, we need the third identity, which eliminates the double negations. We also need the final identity, which eliminates and \exists and introduces another \forall

3. Eliminate Existential Quantifiers: Skolemization
Harder cases: There is one argument for each universally quantified variable whose scope contains the Skolem function. Easy case: This step requires eliminating existential quantifiers. It’s the most complicated step. A little obscure. Think about what the expressions mean. A formula that contains an existentially quantified variable asserts that there is a value that can be substituted for the variable that makes the formula true. First examples. Given person x, you’ll be able to identify a person y that makes farther-of (y,x). So, there is a function that takes X as an argument and returns a proper Y. You may not know how the function works, but such a function must exist. We can eliminate the quantifier by substituting for the variable a reference to a function that produces the desired value. Using the new function, we don’t have to say that y exists because we have a way of producing the proper Y in any circumstance. Can rewrite expressions. These generated functions are called Skolem functions; We make no assertions about these functions except that they must exist. Skolem function depends on a particular X. The general rule is that the universal quantifiers determine which arguments Skolem functions need: There is one argument… If existential quantifiers occur within the scope of universal quantifiers, then the value that satisfies the predicate may depend on the values of the universally quantified variables. For example: \forall x, \exists y: father-of(y,x) the value of y that satisfies father-of depends on the particular value of x. Thus we must generate functions with the same number of arguments as the number of universal quantifiers in whose scope the expression occurs. Easier case. Skolem constants (skolem function of no arguments). In this case, the function S1 has no arguments and we assume that it returns the name of presidents.

4. Rename variables as necessary
We want no two variables of the same name. Quantifiers don’t care what their variable names are. So we can rename any duplicates so that each quantifier has a unique name. Why do we want to do this? Because we’ll want to move all the universal quantifiers together at the left of each expression in the next step and we don’t want that to cause any confusion.

5. Move the universal quantifiers to the left
This works because each quantifier uses a unique variable name. 1. This works because each quantifier uses a unique variable name.

6. Move disjunctions down to the literals
1. Uses one of the distributive laws.

7. Eliminate the conjunctions
Well…don’t really eliminate them. Just write each part as though it were a separate axiom. This makes sense because each part of a conjunction must be true if the whole conjunction is true.

8. Rename all variables, as necessary, so no two have the same name
We’ve done this kind of thing before. It won’t cause a problem here because we’re just renaming the universally quantified variables in each part of a conjunction. Because each of the conjoined parts must be true for any variable values, it doesn’t matter whether the variables have different names for each part.

9. Eliminate the universal quantifiers
Again, we don’t really eliminate them. We just adopt the convention that all variables are presumed to be universally quantified. Voila! Clause form. Each is a disjunction of literals. Taking the whole set together, we have an implied AND on the top level, literals on the bottom level and ORs in between. Each clause’s variables are different, and all vars are implicitly universally quantified.

Algorithm: Putting Axioms into Clausal Form
Eliminate the implications. Move the negations to the atomic formulas. Eliminate the existential quantifiers. Rename the variables, if necessary. Move the universal quantifiers to the left. Move the disjunctions down to the literals. Eliminate the conjunctions. Eliminate the universal quantifiers. Still need to deal with variables during resolution. Unification: needed to match variables and terms between clauses that look similar. See R&N p

Resolution Proofs as Search
Search Problem States: Content of knowledge base in CNF Initial state: KB with negated theorem to prove Successor: Resolution inference rule with unify Goal test: Does knowledge base contain the empty clause ’nil’ Search Algorithm Depth first search (used in PROLOG) Note: Possibly infinite state space

Strategies for Selecting Clauses
unit-preference strategy: Give preference to resolutions involving the clauses with the smallest number of literals. set-of-support strategy: Try to resolve with the negated theorem or a clause generated by resolution from that clause. subsumption: Eliminates all sentences that are subsumed (i.e., more specific than) an existing sentence in the KB. May still require exponential time. We know that repeated applications of the resolution inference rule will find a proof if one exists, but still some practical issues…computational complexity. No guarantee of the efficiency of this process. Obviously, we have many choices in resolution theorem proving. The wrong choice can make the proof quite long. These are strategies/heuristics for helping choose which clauses to unify. There are a few strategies for guiding the proof. An obvious strategy: Only resolve pairs containing complementary literals. Index the clauses by predicates that they contain and whether they are negated. Then possible complementary clauses can be indexed directly. Give preference to…Try to reduce the number of literals because ultimately you’re trying for zero terms in the contradiction. Unit-preference strategy Try to resolve…This will encourage progress toward the goal of contradicting that clause. This is called the set-of-support strategy. Any contradiction other than the one being refuted would indicate an inconsistency in the axioms. Eliminate all… These clauses are not usefully resolved. E.g., if P9x) is in the KB, then there is no sense in adding P(A) and less sense in adding P(A) \vee Q(B). Subsumption helps to keep the KB (and hence, the search space) small. Breadth-first strategy: resolves all possible pairs of the initial clauses, then resolves all possible pairs of the resulting set together w/ the initial set, etc. All of the above strategies are considered complete. Guaranteed to find a proof if the theorem logically follows from the axioms. All resolution search strategies are subject to the exponential-explosion problem and to a version of the halting problem. Still require exponential amount of search.

Example Jack owns a dog. Every dog owner is an animal lover.
No animal lover kills an animal. Either Jack or Curiosity killed the cat, who is named Tuna. Did Curiosity kill the cat?

Original Sentences (Plus Background Knowledge)

Conjunctive Normal Form
D is a placeholder for the dogs unknown name (i.e. Skolem symbol/function). Add negation of theorem to KB: Kills (Curiosity, Tuna). Proof by resolution on board Add negation of theorem to KB

Proof by Resolution kills(Curiosity,Tuna)
kills(Jack,Tuna)  kills(Curiosity,Tuna) {} kills(Jack,Tuna) AnimalLover(w)  Animal(y)  kills(w,y) {w/Jack, y/Tuna} Animal(z)  Cat(z) AnimalLover(Jack)  Animal(Tuna) {z/Tuna} AnimalLover(Jack)  Cat(Tuna) Cat(Tuna) {} 1. Not in CNF AnimalLover(Jack) Dog(y)  Owns(x,y)  AnimalLover(x) {x/Jack} Dog(D) Dog(y)  Owns(Jack,y) Owns(Jack,D) {y/D} Owns(Jack,D) NIL

Proofs can be Lengthy A relatively straightforward KB can quickly overwhelm general resolution methods. Resolution strategies reduce the problem somewhat, but not completely. As a consequence, many practical Knowledge Representation formalisms in AI use a restricted form and specialized inference. Logic programming (Prolog) Production systems Frame systems and semantic networks Description logics Schubert Steamroller: Wolves, foxes, birds, caterpillars, and snails are animals and there are some of each of them. Also there are some grains, and grains are plants. Every animal either likes to eat all plants or all animals much smaller than itself that like to eat some plants. Caterpillars and snails are much smaller than birds, which are much smaller than foxes, which are much smaller than wolves. Wolves do not like to eat foxes or grains, while birds like to eat caterpillars but not snails. Caterpillars and snails like to eat some plants. Prove: there is an animal that likes to eat a grain-eating animal. 3. Some of the necessary logical forms: $\forall x (Wolf(x) → animal(x)) $\forall x \forall y ((Caterpillar(x) \wedge Bird(y)) → Smaller(x,y). $\exists x bird(x) 4. Requires almost 150 resolution steps (minimal). Significant challenge for early systems. 5. Open for about 15 years; solved in late 80s 6. To slide 7. Can often understand them in terms of standard first-order logic! (clear syntax & semantics).

Back to diagnosis Conflict: SD  OBS  {¬AB(C1)…¬AB(Ck)} ⊢⊥
The motivation to use resolution prover is to find conflicts Conflict: SD  OBS  {¬AB(C1)…¬AB(Ck)} ⊢⊥ By resolution prover we can find which components make the inconsistence Challenge: minimal conflict

Resolution Theorem Prover in First-Order Logic

Similar presentations

Presentation on theme: "Resolution Theorem Prover in First-Order Logic"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Resolution Theorem Prover in First-Order Logic

Similar presentations

Presentation on theme: "Resolution Theorem Prover in First-Order Logic"— Presentation transcript:

Similar presentations

About project

Feedback