17.5 Rule Learning

Given the importance of rule-based systems and the human effort required to elicit good rules from experts, it is natural to ask whether expert-system rules could be learned automatically. There are two major types of learning, inductive and deductive, and both can be used in learning rules. Neural-network learning, for example, is inductive because the functions learned are hypotheses about some underlying and unknown function. In successful learning, the hypotheses typically give correct outputs for most inputs, but they might also err. Inductive rule-learning methods create new rules about a domain, rules that are not derivable from any previous rules. I present methods for inductively learning rules in both the propositional and the predicate calculus. Deductive rule learning enhances the efficiency of a system's performance by deducing additional rules from previously known domain rules and facts. The conclusions that the system can derive using these additional rules could also have been derived without them, but with the additional rules the system might perform more efficiently. I will explain a technique called explanation-based generalization (EBG) for deducing additional rules.

Learning Propositional Calculus Rules

Several methods for inductive rule learning have been proposed; I describe one of them here. I first describe the general idea for propositional Horn clause logic. Then, I show how a similar technique can be used to learn first-order logic Horn clause rules. To frame my discussion, I again use my simple example of approving a bank loan. Instead of being given rules for this problem, suppose we are given a training set consisting of the values of attributes for a large number of individuals. To illustrate, consider the data given in Table 17.1 (I use 1 for True and 0 for False). This table might be compiled, for example, from records of loan applications and the decisions made by human loan officers. Members of the training set for which the value of OK is 1 are called positive instances; members for which the value of OK is 0 are called negative instances. From the training set, we desire to induce rules of the form

α1 ∧ α2 ∧ … ∧ αn ⊃ OK

where the αi are propositional parameters from the set {APP, RATING, INC, BAL}. If the antecedent of a rule has value True for an instance in the training set, we say that the rule covers that instance. We can change any existing rule to make it cover fewer instances by adding a parameter to its antecedent; such a change makes the rule more specific. Two rules can cover more instances than can one alone; adding a rule makes the system using these rules more general. We seek a set of rules that covers all and only the positive instances in the training set.

Searching for a set of rules can be computationally difficult. I describe a "greedy" method, which I call separate and conquer. We first attempt to find a single rule that covers only positive instances, even if it doesn't cover all the positive instances. We search for such a rule by starting with a rule that covers all instances (positive and negative), and we gradually make it more specific by adding parameters to its antecedent. Since a single rule might not cover all the positive instances, we gradually add rules (making them as specific as needed as we go) until the entire set of rules covers all and only the positive instances.

Here is how the method works for our example. We start with the provisional rule

T ⊃ OK

which covers all instances. Now we must add a parameter to make it cover fewer negative instances, working toward covering only positive ones. Which parameter (from the set {APP, RATING, INC, BAL}) should we add? Several criteria have been used for making the selection. To keep my discussion simple, I will base our decision on an easy-to-calculate ratio:

rα = nα⁺ / nα

where nα is the total number of (positive and negative) instances covered by the α parameter, and nα⁺ is the total number of positive instances covered by the α parameter. We select that α yielding the largest value of rα.
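The covering test and the selection ratio are easy to express in code. Here is a minimal sketch in Python; the representation (an instance as a dict of 0/1 attribute values together with its OK label) and the helper names covers and ratio are my own choices for illustration, not anything given in the text.

    def covers(rule, instance):
        # A rule is a list of attribute names (its antecedent). The rule
        # covers an instance when every one of those attributes is 1.
        return all(instance[a] == 1 for a in rule)

    def ratio(rule, alpha, data):
        # r_alpha: of the instances covered by the rule with alpha added
        # to its antecedent, the fraction that are positive. Following
        # the text's convention, 0/0 is taken to be 0.
        covered = [x for x in data if covers(rule + [alpha], x)]
        if not covered:
            return 0.0
        return sum(x['OK'] for x in covered) / len(covered)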

In our case, the values are

rAPP = 3/6 = 0.5
rRATING = 4/6 ≈ 0.67
rINC = 3/6 = 0.5
rBAL = 3/4 = 0.75

So, we select BAL, yielding the provisional rule

BAL ⊃ OK

This rule covers the positive instances 3, 4, and 7, but it also covers the negative instance 1, so we must specialize it further.
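The transcript does not preserve the entries of Table 17.1, but the ratios quoted throughout the walkthrough constrain the table almost completely. The dataset below is one reconstruction that is consistent with every number in the text; the few entries the text leaves undetermined are filled in arbitrarily, so treat it as an illustration rather than as the original table. With the ratio helper sketched above, it reproduces the first-round selection of BAL:

    # One reconstruction of the training set (hypothetical where the text
    # leaves entries undetermined); OK = 1 marks the positive instances.
    DATA = [
        {'id': 1, 'APP': 1, 'RATING': 0, 'INC': 0, 'BAL': 1, 'OK': 0},
        {'id': 2, 'APP': 0, 'RATING': 0, 'INC': 1, 'BAL': 0, 'OK': 0},
        {'id': 3, 'APP': 1, 'RATING': 1, 'INC': 0, 'BAL': 1, 'OK': 1},
        {'id': 4, 'APP': 1, 'RATING': 1, 'INC': 1, 'BAL': 1, 'OK': 1},
        {'id': 5, 'APP': 0, 'RATING': 1, 'INC': 1, 'BAL': 0, 'OK': 0},
        {'id': 6, 'APP': 1, 'RATING': 1, 'INC': 1, 'BAL': 0, 'OK': 1},
        {'id': 7, 'APP': 0, 'RATING': 1, 'INC': 1, 'BAL': 1, 'OK': 1},
        {'id': 8, 'APP': 1, 'RATING': 0, 'INC': 1, 'BAL': 0, 'OK': 0},
        {'id': 9, 'APP': 1, 'RATING': 1, 'INC': 0, 'BAL': 0, 'OK': 0},
    ]

    # First-round ratios, starting from the empty antecedent (T ⊃ OK):
    for a in ['APP', 'RATING', 'INC', 'BAL']:
        print(a, ratio([], a, DATA))
    # APP 0.5, RATING 0.667, INC 0.5, BAL 0.75 -> select BAL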

We use the same technique to select another parameter. The calculations of the rα's must now take into account the fact that we have already decided that the first component in the antecedent is BAL:

rAPP = 2/3 ≈ 0.67
rRATING = 3/3 = 1.0
rINC = 2/2 = 1.0

Here we have a tie between RATING and INC. We might select RATING because rRATING is based on a larger sample. (You should explore the consequences of selecting INC instead.) The rule

BAL ∧ RATING ⊃ OK

covers only positive instances, so we do not need to add further parameters to the antecedent of this rule. But this rule does not cover all of the positive instances; specifically, it does not cover positive instance 6. So, we must add another rule.

To learn the next rule, we first eliminate from the table all of the positive instances already covered by the first rule, to obtain the reduced data shown in Table 17.2.

[Table 17.2: the reduced training set, with columns Individual, APP, RATING, INC, BAL, OK; the table's entries were not preserved in this transcript.]

We begin the process all over again with this reduced table, starting with the rule T ⊃ OK. This rule covers some negative instances, namely, 1, 2, 5, 8, and 9. To select a parameter to add to the antecedent, we calculate

rAPP = 1/4 = 0.25
rRATING = 1/3 ≈ 0.33
rINC = 1/4 = 0.25
rBAL = 0/1 = 0.0

RATING yields the largest ratio, so we select it, giving the rule

RATING ⊃ OK

This rule covers negative instances 5 and 9, so we must add another parameter to the antecedent:

rAPP = 1/2 = 0.5
rINC = 1/2 = 0.5
rBAL = 0/0 = 0.0

Again we have a tie, this time between APP and INC.

We arbitrarily select APP, giving us the rule

APP ∧ RATING ⊃ OK

This rule still covers negative instance 9, so we must add another parameter to the antecedent. The reduced data is shown in Table 17.5 below:

[Table 17.5: columns Individual, APP, RATING, INC, BAL, OK; the table's entries were not preserved in this transcript.]

We calculate

rINC = 1/1 = 1.0
rBAL = 0/0 = 0.0

We select INC; making the rule more specific one final time results in the rule

APP ∧ RATING ∧ INC ⊃ OK

These two rules, namely,

BAL ∧ RATING ⊃ OK

and

APP ∧ RATING ∧ INC ⊃ OK

cover all and only the positive instances, so we are finished.
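Putting the pieces together, the whole separate-and-conquer procedure is only a few lines. This is a sketch under the same assumptions as the snippets above (the covers/ratio helpers and the reconstructed DATA); the tie-breaking key (prefer the larger sample, then the candidate listed first) mirrors the choices made in the walkthrough, and learn_rules is a name of my own.

    ATTRS = ['APP', 'RATING', 'INC', 'BAL']

    def learn_rules(data):
        rules, remaining = [], list(data)
        while any(x['OK'] == 1 for x in remaining):      # positives remain uncovered
            rule = []                                    # start from T ⊃ OK
            while any(covers(rule, x) and x['OK'] == 0 for x in remaining):
                # Specialize: add the parameter with the largest ratio,
                # breaking ties in favor of the larger sample.
                best = max((a for a in ATTRS if a not in rule),
                           key=lambda a: (ratio(rule, a, remaining),
                                          sum(covers(rule + [a], x) for x in remaining)))
                rule.append(best)
            rules.append(rule)
            remaining = [x for x in remaining            # separate: drop covered positives
                         if not (covers(rule, x) and x['OK'] == 1)]
        return rules

    print(learn_rules(DATA))
    # [['BAL', 'RATING'], ['RATING', 'APP', 'INC']], i.e., the two rules
    # BAL ∧ RATING ⊃ OK and APP ∧ RATING ∧ INC ⊃ OK found above.

Note that the method is greedy: once a parameter has been added to an antecedent it is never retracted, so a poor early choice is never repaired by backtracking.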