
Covering Algorithms

Trees vs. rules

From trees to rules: easy. Converting a tree into a set of rules:
– One rule for each leaf
– The antecedent contains a condition for every node on the path from the root to the leaf
– The consequent is the class assigned by the leaf

From rules to trees: more difficult. Transforming a rule set into a tree:
– A tree cannot easily express disjunction between rules
– Example:
    If a and b then x
    If c and d then x
– The corresponding tree contains identical subtrees (the "replicated subtree problem")
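The tree-to-rules direction is mechanical enough to sketch in a few lines. A minimal Python sketch, assuming a simple recursive tree representation (the Leaf/Node classes and the tree_to_rules name are illustrative, not from the slides):

class Leaf:
    """A leaf holding the class label it assigns."""
    def __init__(self, label):
        self.label = label

class Node:
    """An internal node: a test on one attribute, one child per value."""
    def __init__(self, attribute, children):
        self.attribute = attribute
        self.children = children  # dict: attribute value -> subtree

def tree_to_rules(tree, path=()):
    # One rule per leaf: the antecedent is every (attribute, value) test
    # on the root-to-leaf path, the consequent is the leaf's class.
    if isinstance(tree, Leaf):
        return [(list(path), tree.label)]
    rules = []
    for value, subtree in tree.children.items():
        rules += tree_to_rules(subtree, path + ((tree.attribute, value),))
    return rules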

A tree for a simple disjunction
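The slide's figure is not reproduced in this transcript, but the tree can be reconstructed from the two rules above. Assuming binary yes/no attributes and a default class ("not x") for instances neither rule covers, one such tree is:

a?
├─ yes: b?
│       ├─ yes: x
│       └─ no:  c?
│               ├─ yes: d?
│               │       ├─ yes: x
│               │       └─ no:  not x
│               └─ no:  not x
└─ no:  c?
        ├─ yes: d?
        │       ├─ yes: x
        │       └─ no:  not x
        └─ no:  not x

The c?/d? subtree appears twice: this is the replicated subtree problem from the previous slide.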

Covering algorithms

Strategy for generating a rule set directly:
– For each class in turn, find a rule set that covers all instances in it (excluding instances not in the class)

This is called a covering approach because at each stage a rule is identified that covers some of the instances.

Example: generating a rule

Possible rule set for class "b":

    If x ≤ 1.2 then class = b
    If x > 1.2 and y ≤ 2.6 then class = b

More rules could be added for a "perfect" rule set.
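Written as code, this rule set is just two ordered tests. A minimal sketch, assuming everything the rules leave uncovered belongs to the other class "a" (an assumption; the slide's figure is not reproduced here):

def classify(x, y):
    # Possible rule set for class "b" from the slide above
    if x <= 1.2:
        return "b"
    if x > 1.2 and y <= 2.6:
        return "b"
    return "a"  # assumed default for instances no rule covers

print(classify(1.0, 3.0))  # "b": the first rule fires
print(classify(2.0, 1.0))  # "b": the second rule fires
print(classify(2.0, 3.0))  # "a": no rule covers this point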

A simple covering algorithm

Generates a rule by adding tests that maximize the rule's accuracy.
– Similar to the situation in decision trees: the problem of selecting an attribute to split on
– But: the decision tree inducer maximizes overall purity
– Each new test reduces the rule's coverage

Selecting a test

Goal: maximize accuracy
– t: total number of instances covered by the rule
– p: positive examples of the class covered by the rule
– t − p: number of errors made by the rule
⇒ Select the test that maximizes the ratio p/t

We are finished when p/t = 1 or the set of instances can't be split any further.
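Scoring one candidate condition A = v reduces to two counts. A small sketch, reusing the dict-per-instance representation assumed earlier (the score name and class_key parameter are illustrative):

def score(instances, conditions, target_class, class_key="class"):
    """Return (p, t) for a rule whose left-hand side is `conditions`."""
    covered = [inst for inst in instances
               if all(inst[a] == v for a, v in conditions)]
    t = len(covered)
    p = sum(1 for inst in covered if inst[class_key] == target_class)
    return p, t

The candidate with the largest p/t wins; PRISM (below) breaks ties by preferring the condition with the larger p.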

Example: contact lenses data

For each candidate test, the accompanying fraction shows the proportion of "correct" instances in the set it singles out. In this case, correct means that the recommendation is "hard."

Modified rule and resulting data

The rule isn't very accurate, getting only 4 right out of the 12 instances it covers. So, it needs further refinement.

Further refinement

Modified rule and resulting data

Should we stop here? Perhaps. But let's say we are going for exact rules, no matter how complex they become. So, let's refine further.

Further refinement

The result

Pseudo-code for PRISM

For each class C
    Initialize E to the instance set
    While E contains instances in class C
        Create a rule R with an empty left-hand side that predicts class C
        Until R is perfect (or there are no more attributes to use) do
            For each attribute A not mentioned in R, and each value v,
                Consider adding the condition A = v to the left-hand side of R
            Select A and v to maximize the accuracy p/t
                (break ties by choosing the condition with the largest p)
            Add A = v to R
        Remove the instances covered by R from E
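The pseudo-code translates almost line for line into Python. A runnable sketch under the same assumptions as the earlier snippets (instances as dicts, the class label stored under class_key; an illustration, not a reference implementation):

def prism(instances, attributes, class_key="class"):
    """Learn a list of (conditions, class) rules, one class at a time."""
    rules = []
    for c in sorted(set(inst[class_key] for inst in instances)):
        e = list(instances)          # initialize E to the instance set
        while any(inst[class_key] == c for inst in e):
            conditions = []          # rule R with an empty left-hand side
            covered = list(e)
            # until R is perfect (or there are no more attributes to use)
            while (any(inst[class_key] != c for inst in covered)
                   and len(conditions) < len(attributes)):
                used = {a for a, _ in conditions}
                best, best_ratio, best_p = None, -1.0, -1
                for a in attributes:
                    if a in used:
                        continue
                    for v in set(inst[a] for inst in covered):
                        sub = [inst for inst in covered if inst[a] == v]
                        t = len(sub)
                        p = sum(1 for inst in sub if inst[class_key] == c)
                        # maximize p/t; break ties by the largest p
                        if p / t > best_ratio or (p / t == best_ratio and p > best_p):
                            best, best_ratio, best_p = (a, v), p / t, p
                conditions.append(best)
                covered = [inst for inst in covered if inst[best[0]] == best[1]]
            rules.append((conditions, c))
            # remove the instances covered by R from E (the "separate" step)
            covered_ids = {id(inst) for inst in covered}
            e = [inst for inst in e if id(inst) not in covered_ids]
    return rules

Each learned rule reads "If A1 = v1 and ... then class = c"; removing the covered instances before growing the next rule is the "separate" step described on the next slide.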

Separate and conquer

Methods like PRISM (for dealing with one class at a time) are separate-and-conquer algorithms:
– First, a rule is identified
– Then, all instances covered by the rule are separated out
– Finally, the remaining instances are "conquered"

Difference to divide-and-conquer methods:
– The subset covered by a rule doesn't need to be explored any further