More Symbolic Learning CPSC 386 Artificial Intelligence Ellen Walker Hiram College

Ensemble Learning
– Instead of learning exactly one hypothesis, learn several and combine them
– Example:
  – Learn 5 hypotheses (classification rules) from the same training set
  – For each element in the test set, let all hypotheses vote on the classification
  – Result: for an item to be misclassified, it must be misclassified (in the same way) by at least 3 of the 5 hypotheses!
– Assumption: errors made by each hypothesis are independent (or at least, different)
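As a concrete illustration of the voting step (not from the original slides), here is a minimal Python sketch; the five toy hypotheses and the attribute names are invented for the example.

```python
from collections import Counter

def majority_vote(hypotheses, example):
    """Classify `example` by letting every learned hypothesis vote."""
    votes = Counter(h(example) for h in hypotheses)
    label, _count = votes.most_common(1)[0]
    return label

# Five toy hypotheses, each looking at a single attribute.
hypotheses = [
    lambda x: '+' if x['size'] == 'small' else '-',
    lambda x: '+' if x['color'] == 'red' else '-',
    lambda x: '+' if x['origin'] == 'domestic' else '-',
    lambda x: '+' if x['size'] == 'small' else '-',
    lambda x: '+' if x['color'] == 'red' else '-',
]
print(majority_vote(hypotheses, {'size': 'small', 'color': 'red', 'origin': 'foreign'}))  # '+'
```

An item is misclassified only when a majority of the individual hypotheses err on it in the same way.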

Multiple Simple Hypotheses
– Simple hypotheses
  – Require less time to compute
  – Require less space to represent
  – Limit the “expressiveness” of what can be learned
– Multiple simple hypotheses (fixed number)
  – Don’t significantly increase time or space needs
  – Can significantly increase expressiveness
– Example (next slide): multiple linear thresholds

Multiple Linear Threshold Hypotheses
– A point must be on the correct side of all lines to be classified + (blue)
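A rough sketch of this idea, assuming made-up line coefficients rather than the ones in the original figure: a point is labeled + only when it lies on the positive side of every line.

```python
def side(line, point):
    """Signed value of the linear threshold a*x + b*y + c at `point`."""
    a, b, c = line
    x, y = point
    return a * x + b * y + c

def classify(lines, point):
    """Label a point '+' only if it is on the positive side of ALL lines."""
    return '+' if all(side(line, point) > 0 for line in lines) else '-'

# Hypothetical thresholds forming a triangular positive region:
lines = [(1, 0, 0),    # x > 0
         (0, 1, 0),    # y > 0
         (-1, -1, 2)]  # x + y < 2
print(classify(lines, (0.5, 0.5)))  # '+'
print(classify(lines, (3.0, 0.5)))  # '-'
```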

Boosting
– Generate M hypotheses sequentially, from “weighted training sets”
– The first training set has equal weights
– The second set weights the examples misclassified by the first hypothesis higher and the correctly classified examples lower
– The third set revises the weights according to which examples the second hypothesis classified correctly or incorrectly, and so on
– The result is a weighted majority vote over all M hypotheses, each weighted according to how well it did on its training set

Boosting Training Data
1. Foreign, small, red: +
2. Domestic, large, green: –
3. Foreign, small, blue: –
4. Domestic, small, red: +
5. Foreign, large, green: –
6. Foreign, large, red: –

Boosting Example
(+ = the hypothesis classifies the example correctly, – = it misclassifies it)

              Ex1    Ex2    Ex3    Ex4    Ex5    Ex6    Error   Z (hypothesis weight)
Weight        1/6    1/6    1/6    1/6    1/6    1/6
H1: small     +      +      –      +      +      +      1/6     log 5
Weight        1/10   1/10   1/2    1/10   1/10   1/10
H2: red       +      +      +      +      +      –      1/10    log 9
Weight        1/18   1/18   5/18   1/18   1/18   1/2
H3: dom       –      –      +      +      +      +      1/9     log 8
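The numbers in this table can be reproduced with a short computation. This is a sketch under one assumption (it matches every value above): after each hypothesis, the misclassified examples share half of the total weight and the correctly classified examples share the other half, each in proportion to its previous weight. The code and names are ours, not from the slides.

```python
from fractions import Fraction

# The six training examples: (origin, size, color) and label.
examples = [
    (('foreign',  'small', 'red'),   '+'),
    (('domestic', 'large', 'green'), '-'),
    (('foreign',  'small', 'blue'),  '-'),
    (('domestic', 'small', 'red'),   '+'),
    (('foreign',  'large', 'green'), '-'),
    (('foreign',  'large', 'red'),   '-'),
]

# Each weak hypothesis says '+' when one attribute has a particular value.
hypotheses = [
    ('small', lambda a: '+' if a[1] == 'small'    else '-'),
    ('red',   lambda a: '+' if a[2] == 'red'      else '-'),
    ('dom',   lambda a: '+' if a[0] == 'domestic' else '-'),
]

weights = [Fraction(1, 6)] * 6    # start with equal example weights

for name, h in hypotheses:
    wrong = [h(attrs) != label for attrs, label in examples]
    error = sum(w for w, bad in zip(weights, wrong) if bad)
    print(name, [str(w) for w in weights],
          'error', error, f'Z = log {(1 - error) / error}')
    # Reweight: misclassified examples share half the total weight,
    # correctly classified ones share the other half, each in
    # proportion to its previous weight.
    weights = [w / (error if bad else 1 - error) / 2
               for w, bad in zip(weights, wrong)]
```

Running this prints the weight rows 1/6, 1/10, 1/18 and the errors 1/6, 1/10, 1/9 with hypothesis weights log 5, log 9, log 8, as in the table.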

Test Classifications
– Rules: small (log 5), red (log 9), dom (log 8)
– Domestic, large, blue: NO (log 5 + log 9) vs. YES (log 8) --> NO
– Domestic, large, red: NO (log 5) vs. YES (log 9 + log 8) --> YES
– Foreign, small, green: NO (log 8 + log 9) vs. YES (log 5) --> NO
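The same weighted vote can be computed directly. The weights log 5, log 9, and log 8 come from the boosting run above; the helper function is a sketch, not part of the original slides.

```python
from math import log

# Weak rules and their weights from the boosting run above.
rules = [
    (lambda o, s, c: s == 'small',    log(5)),   # H1: small -> YES
    (lambda o, s, c: c == 'red',      log(9)),   # H2: red -> YES
    (lambda o, s, c: o == 'domestic', log(8)),   # H3: domestic -> YES
]

def weighted_vote(origin, size, color):
    yes = sum(w for rule, w in rules if rule(origin, size, color))
    no  = sum(w for rule, w in rules if not rule(origin, size, color))
    return 'YES' if yes > no else 'NO'

print(weighted_vote('domestic', 'large', 'blue'))   # NO  (log 8 < log 5 + log 9)
print(weighted_vote('domestic', 'large', 'red'))    # YES (log 9 + log 8 > log 5)
print(weighted_vote('foreign',  'small', 'green'))  # NO  (log 5 < log 8 + log 9)
```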

Knowledge in Learning
– What you learn depends on what you know
  – Attributes for classification
  – Organizational structure
    – Relationships in structural learning (e.g. above, next to, touching)
  – Relevance
    – Copper wire {conducts electricity / is 2m long}

Explanation Based Learning
– “Learning” from one example
– Extract a general rule from an example
  – A penny conducts electricity
  – (Pennies are made of copper)
  – Therefore, copper conducts electricity
– Generalization of an optimization technique called memoization
  – Build a table of results from prior computations
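Memoization itself is easy to illustrate; here is a generic sketch (not from the slides) of the “table of results from prior computations”.

```python
def memoize(f):
    """Cache results of prior computations in a table keyed by arguments."""
    table = {}
    def wrapper(*args):
        if args not in table:
            table[args] = f(*args)
        return table[args]
    return wrapper

@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))  # each fib(n) is computed only once thanks to the table
```

EBL generalizes this idea: instead of caching a specific answer, it caches a generalized rule extracted from the explanation of that answer.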

Steps in EBL
1. Build an explanation of the observation. For example, prove that the instance is a member of a query class (goal predicate).
2. In parallel, construct a generalized proof tree, using the same inference steps (rules).
3. Transform the generalized proof tree into a new rule. This new rule generalizes the explanation.

Operationality
– Define an “easy” subgoal as operational
– Use operationality to know when to stop the proof
– Examples:
  – An operational concept is a structural constraint that can be easily observed
  – An operational concept is a LISP primitive that can be easily applied

EBL Example: Learn “cup”
– Goal: a cup is stable, liftable, and holds liquid
– Operationality: only structural criteria (geometry, topology) may be used
– Example: a coffee cup

Explain the Observation
– Prove why the object is a cup:
  – Goal: stable(x) & liftable(x) & holds-liquid(x) -> cup(x)
  – Facts: flatbottom(cup1) & red(cup1) & attach(cup1,handle) & weight(cup1,2oz) & concave(cup1) & ceramic(cup1)
  – Proof: cup1 is stable because it has a flat bottom; liftable because it has a handle and weighs 2 oz (less than 16 oz); and holds liquid because it is concave.

Generalize
– Replace all constants in the proof (e.g. cup1) with variables:
  cup(x) <- flatbottom(x) & attach(x,y) & weight(x,z) & concave(x)
– Re-prove to find any variables that should really be constants (and any numeric bounds actually used in the proof):
  cup(x) <- flatbottom(x) & attach(x,handle) & weight(x,z) & (< z 16oz) & concave(x)
– A cup is an object that has a flat bottom and a handle, weighs less than 1 lb (16 oz), and is concave
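Read as a predicate, the learned rule is easy to apply; here is a toy rendering in Python (the attribute names are ours, not the slides’):

```python
def is_cup(obj):
    """Learned rule: flat bottom, a handle attached, weight under 16 oz, concave."""
    return (obj.get('flat_bottom')
            and obj.get('attached') == 'handle'
            and obj.get('weight_oz', float('inf')) < 16
            and obj.get('concave'))

coffee_cup = {'flat_bottom': True, 'attached': 'handle',
              'weight_oz': 2, 'concave': True, 'color': 'red'}
print(is_cup(coffee_cup))  # True; color and material were never part of the proof
```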

Relevance Based Learning
– Find the simplest determination consistent with the observations
  – A determination is a set of features P, Q such that if examples match on P, they will also match on Q
  – Here, Q is the target predicate (the classification to be learned)
  – “Simplest” means the fewest attributes
– Combine with decision tree learning for a large improvement: use RBL to select a subset of attributes, then run DTL on that subset (see the sketch below)
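A minimal sketch of the RBL step, assuming a brute-force search for the smallest determining attribute set; the data and names are invented for illustration.

```python
from itertools import combinations

def is_determination(examples, attrs, target):
    """P determines Q if examples that agree on attrs P never disagree on target Q."""
    seen = {}
    for ex in examples:
        key = tuple(ex[a] for a in attrs)
        if key in seen and seen[key] != ex[target]:
            return False
        seen[key] = ex[target]
    return True

def minimal_determination(examples, attributes, target):
    """Return the smallest attribute subset that determines the target."""
    for size in range(len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if is_determination(examples, attrs, target):
                return list(attrs)
    return list(attributes)

# Toy data: only 'material' is relevant to conductivity, 'length' is not.
examples = [
    {'material': 'copper', 'length': '1m', 'conducts': True},
    {'material': 'copper', 'length': '2m', 'conducts': True},
    {'material': 'rubber', 'length': '1m', 'conducts': False},
    {'material': 'rubber', 'length': '2m', 'conducts': False},
]
print(minimal_determination(examples, ['material', 'length'], 'conducts'))  # ['material']
```

A decision-tree learner would then be run using only the selected attributes.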

Inductive Logic Programming
– Combines inductive learning (from examples) with first-order logical representations
– Allows relational concepts (like grandparent) to be learned, which attribute-based methods cannot express
– Provides a rigorous, well-defined framework for knowledge-based inductive learning

Top-Down ILP
– Start with a very general rule
  – Analogy: the empty decision tree
  – Actual: an empty left-hand side
    (empty) => grandfather(X,Y)
– Gradually specialize the rule until it fits the data
  – Analogy: growing the decision tree
  – Actual: add predicates to the left-hand side until all examples are correctly classified
    father(X,Z) => grandfather(X,Y)
    father(X,Z) and parent(Z,Y) => grandfather(X,Y)
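A rough sketch of the core of the specialization loop: check how each candidate clause covers the positive and negative examples over a tiny hypothetical family database (the facts and names are invented for illustration).

```python
# Hypothetical ground facts.
father = {('tom', 'bob'), ('bob', 'ann')}
parent = {('tom', 'bob'), ('bob', 'ann'), ('mary', 'bob')}
people = {'tom', 'bob', 'ann', 'mary'}

def covers_father_only(x, y):
    """Candidate 1: father(X,Z) => grandfather(X,Y) -- too general."""
    return any((x, z) in father for z in people)

def covers_father_parent(x, y):
    """Candidate 2: father(X,Z) and parent(Z,Y) => grandfather(X,Y)."""
    return any((x, z) in father and (z, y) in parent for z in people)

positives = {('tom', 'ann')}                    # tom is ann's grandfather
negatives = {('tom', 'bob'), ('mary', 'ann')}   # not grandfather pairs

for name, rule in [('father only', covers_father_only),
                   ('father + parent', covers_father_parent)]:
    covered_pos = sum(rule(x, y) for x, y in positives)
    covered_neg = sum(rule(x, y) for x, y in negatives)
    print(name, 'covers', covered_pos, 'positive and', covered_neg, 'negative examples')
```

The first clause still covers a negative example, so the learner keeps adding predicates; the second covers all the positives and none of the negatives.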

Inverse Resolution
– We know that resolution takes two clauses c1 and c2 and generates a result c
– Inverse resolution starts with c (and possibly c1) and determines c2
– Working “backwards” through a proof, it generates the necessary rules
  – This can be used for discovery, i.e. unsupervised symbolic learning!

Some Final Comments
– Learning requires prior knowledge
  – E.g., which attributes are available?
  – E.g., the current rules for EBL
– Prior knowledge can bias a learning system
  – This can be good or bad
– Existing learning systems are focused; we are a long way from creating a system that learns like an infant.