1 Classification and Prediction
Classification: predicts categorical class labels (discrete or nominal). Constructs a model from the training set and the values in a classifying attribute, then uses it to classify new data.
Prediction: models continuous-valued functions, i.e., predicts unknown or missing values.
Typical applications:
- credit approval
- target marketing
- medical diagnosis
- treatment effectiveness analysis

2 Classification: A Two-Step Process
Two steps: model construction and model usage.
Model construction: describing a set of predetermined classes
- Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
- The set of tuples used for model construction is the training set
- The model is represented as classification rules, decision trees, or mathematical formulas

3 Classification: A Two-Step Process
Model usage: classifying future or unknown objects
- Estimate the accuracy of the model: the known label of each test sample is compared with the classified result from the model
- Accuracy rate: the percentage of test-set samples that are correctly classified by the model
- The test set is independent of the training set
- If the accuracy is acceptable, use the model to classify data tuples whose class labels are not known
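A minimal sketch of the accuracy computation described above, assuming `model` is any callable mapping a sample to a predicted label and `test_set` is a list of (sample, known_label) pairs (both names are illustrative, not from the slides):

```python
def accuracy(model, test_set):
    """Accuracy rate: fraction of test samples whose predicted label
    matches the known label."""
    correct = sum(1 for sample, known_label in test_set
                  if model(sample) == known_label)
    return correct / len(test_set)
```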

4 Classification Process: Model Construction and Using the Model in Prediction

5 Classification Process (1): Model Construction
[Diagram: training data is fed to a classification algorithm, which produces the classifier (model); the example model is the rule IF rank = 'professor' OR years > 6 THEN tenured = 'yes']

6 Classification Process (2): Use the Model in Prediction
[Diagram: the classifier is applied first to test data and then to new data; e.g. the unseen tuple (Jeff, Professor, 4) is classified to answer 'Tenured?']
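For illustration, the learned model from slides 5 and 6 can be written directly as a function; `tenured_classifier` is a hypothetical name, but the rule itself is the one on the slide:

```python
def tenured_classifier(name, rank, years):
    """The slide's learned rule:
    IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    return "yes" if rank == "professor" or years > 6 else "no"

# Classifying the new tuple (Jeff, Professor, 4) from slide 6:
print(tenured_classifier("Jeff", "professor", 4))  # -> yes
```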

7 Supervised vs. Unsupervised Learning
Supervised learning (classification)
- Supervision: the training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations
- New data is classified based on the training set
Unsupervised learning (clustering)
- The class labels of the training data are unknown
- Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data

8 Issues Regarding Classification and Prediction: Data Preparation
Data cleaning
- Preprocess data in order to reduce noise and handle missing values
Relevance analysis (feature selection)
- Remove irrelevant or redundant attributes
Data transformation
- Generalize and/or normalize data
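As one concrete example of the transformation step, here is a sketch of min-max normalization (just one common normalization scheme; the slides do not prescribe a particular one):

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale numeric values into [new_min, new_max].
    Assumes the values are not all identical."""
    lo, hi = min(values), max(values)
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo)
            for v in values]

print(min_max_normalize([20000, 50000, 100000]))  # [0.0, 0.375, 1.0]
```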

9 Issues Regarding Classification and Prediction: Evaluating Classification Methods
Predictive accuracy
Speed and scalability
- time to construct the model
- time to use the model
Robustness
- handling noise and missing values
Scalability
- efficiency with disk-resident databases
Interpretability
- understanding and insight provided by the model
Goodness of rules
- decision tree size
- compactness of classification rules

10 Training Dataset
[The slide shows a 14-tuple training set (the AllElectronics data from Han & Kamber) with attributes age, income, student, and credit_rating, and class label buys_computer; the table itself was an image and is encoded in the snippet below.]
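A plain-Python encoding of that standard training set (reconstructed from Han & Kamber, ch. 6), in the (attribute-dict, label) representation the later snippets reuse:

```python
# 14 training tuples; class label is buys_computer ("yes"/"no").
TRAINING_SET = [
    ({"age": "<=30",   "income": "high",   "student": "no",  "credit_rating": "fair"},      "no"),
    ({"age": "<=30",   "income": "high",   "student": "no",  "credit_rating": "excellent"}, "no"),
    ({"age": "31..40", "income": "high",   "student": "no",  "credit_rating": "fair"},      "yes"),
    ({"age": ">40",    "income": "medium", "student": "no",  "credit_rating": "fair"},      "yes"),
    ({"age": ">40",    "income": "low",    "student": "yes", "credit_rating": "fair"},      "yes"),
    ({"age": ">40",    "income": "low",    "student": "yes", "credit_rating": "excellent"}, "no"),
    ({"age": "31..40", "income": "low",    "student": "yes", "credit_rating": "excellent"}, "yes"),
    ({"age": "<=30",   "income": "medium", "student": "no",  "credit_rating": "fair"},      "no"),
    ({"age": "<=30",   "income": "low",    "student": "yes", "credit_rating": "fair"},      "yes"),
    ({"age": ">40",    "income": "medium", "student": "yes", "credit_rating": "fair"},      "yes"),
    ({"age": "<=30",   "income": "medium", "student": "yes", "credit_rating": "excellent"}, "yes"),
    ({"age": "31..40", "income": "medium", "student": "no",  "credit_rating": "excellent"}, "yes"),
    ({"age": "31..40", "income": "high",   "student": "yes", "credit_rating": "fair"},      "yes"),
    ({"age": ">40",    "income": "medium", "student": "no",  "credit_rating": "excellent"}, "no"),
]
```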

11 Output: A Decision Tree for "buys_computer"
age = '<=30'   -> student?        (no -> no; yes -> yes)
age = '31..40' -> yes
age = '>40'    -> credit_rating?  (excellent -> no; fair -> yes)

12 Algorithm for Decision Tree Induction
Basic algorithm (a greedy algorithm)
- The tree is constructed in a top-down, recursive, divide-and-conquer manner
- At the start, all the training examples are at the root
- Attributes are categorical (continuous-valued attributes are discretized in advance)
- Examples are partitioned recursively based on selected attributes
- Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain)

13 Algorithm for Decision Tree Induction
Conditions for stopping partitioning:
- All samples for a given node belong to the same class
- There are no remaining attributes for further partitioning
- There are no samples left

14 Decision-Tree Classification
(1) create a node N;
(2) if samples are all of the same class, C, then
(3)   return N as a leaf node labeled with the class C;
(4) if attribute-list is empty then
(5)   return N as a leaf node labeled with the most common class in samples;
(6) select test-attribute, the attribute among attribute-list with the highest information gain;
(7) label node N with test-attribute;

15 Decision-Tree Classification
(8) for each known value a_i of test-attribute
(9)   grow a branch from node N for the condition test-attribute = a_i;
(10)  let s_i be the set of samples in samples for which test-attribute = a_i; // a partition
(11)  if s_i is empty then
(12)    attach a leaf labeled with the most common class in samples;
(13)  else attach the node returned by Generate_decision_tree(s_i, attribute-list minus test-attribute);
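A runnable Python sketch of this pseudocode, using the (attribute-dict, label) sample representation from the slide-10 snippet; the attribute-selection heuristic of step (6) is passed in as `best_attribute` and is defined after slide 22:

```python
from collections import Counter

def majority_class(samples):
    """Most common class label among (attrs, label) pairs (steps 5 and 12)."""
    return Counter(label for _, label in samples).most_common(1)[0][0]

def generate_decision_tree(samples, attribute_list, best_attribute):
    """Recursive divide-and-conquer induction following the pseudocode.
    Branches only on attribute values actually present in `samples`, so
    the empty-partition case of steps (11)-(12) cannot arise here."""
    labels = {label for _, label in samples}
    if len(labels) == 1:                        # steps (2)-(3)
        return labels.pop()
    if not attribute_list:                      # steps (4)-(5)
        return majority_class(samples)
    test_attr = best_attribute(samples, attribute_list)    # step (6)
    node = {"test": test_attr, "branches": {}}             # step (7)
    remaining = [a for a in attribute_list if a != test_attr]
    for value in {attrs[test_attr] for attrs, _ in samples}:  # steps (8)-(10)
        subset = [(attrs, label) for attrs, label in samples
                  if attrs[test_attr] == value]
        node["branches"][value] = generate_decision_tree(      # step (13)
            subset, remaining, best_attribute)
    return node
```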

16 Decision-Tree Classification

17 Choose Split Attribute
The attribute selection measure is also called a goodness function. Different algorithms may use different goodness functions:
- information gain
- gini index
- inference power
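As a sketch of one alternative goodness function, the Gini index of a node's class distribution (information gain is worked out in detail on slides 20-22):

```python
def gini(counts):
    """Gini index: 1 minus the sum of squared class proportions;
    0 for a pure node, larger for more mixed nodes."""
    total = sum(counts)
    return 1.0 - sum((s / total) ** 2 for s in counts)

print(round(gini([9, 5]), 3))  # 0.459 for a 9-yes / 5-no node
```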

18 Primary Issues in Tree Construction
Branching scheme: determining the tree branch to which a sample belongs
Stopping rule: when to stop the further splitting of a node
Labeling rule: a node is labeled as the class to which most samples at the node belong

19 How to Use a Tree?
Directly
- test the attribute values of the unknown sample against the tree
- a path is traced from the root to a leaf, which holds the label
Indirectly
- the decision tree is converted to classification rules
- one rule is created for each path from the root to a leaf
- IF-THEN rules are easier for humans to understand
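Both uses can be made concrete with the slide-11 tree. Below, the "direct" use is the tree written out as nested tests, and the "indirect" use is the same tree flattened into one IF-THEN rule per root-to-leaf path:

```python
def buys_computer(attrs):
    """Slide-11 tree applied directly: trace a path from root to leaf."""
    if attrs["age"] == "<=30":
        return "yes" if attrs["student"] == "yes" else "no"
    if attrs["age"] == "31..40":
        return "yes"
    return "yes" if attrs["credit_rating"] == "fair" else "no"  # age > 40

# The same tree as classification rules, one per root-to-leaf path:
#   IF age = '<=30'   AND student = 'no'              THEN buys_computer = 'no'
#   IF age = '<=30'   AND student = 'yes'             THEN buys_computer = 'yes'
#   IF age = '31..40'                                 THEN buys_computer = 'yes'
#   IF age = '>40'    AND credit_rating = 'excellent' THEN buys_computer = 'no'
#   IF age = '>40'    AND credit_rating = 'fair'      THEN buys_computer = 'yes'
```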

20 Information Gain
Select the attribute with the highest information gain.
Let S contain s_i tuples of class C_i for i = 1, ..., m, with s tuples in total.
Information required to classify an arbitrary tuple:
$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} \frac{s_i}{s}\,\log_2 \frac{s_i}{s}$$
Entropy of attribute A with values {a_1, a_2, ..., a_v}:
$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s}\, I(s_{1j}, \ldots, s_{mj})$$
Information gained by branching on attribute A:
$$\mathrm{Gain}(A) = I(s_1, s_2, \ldots, s_m) - E(A)$$
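These three formulas translate directly into Python (a sketch, using the (attribute-dict, label) samples from the earlier snippets):

```python
from collections import Counter
from math import log2

def info(counts):
    """I(s1, ..., sm): expected information needed to classify a tuple,
    given the class counts."""
    total = sum(counts)
    return -sum((s / total) * log2(s / total) for s in counts if s > 0)

def class_counts(samples):
    """Class-frequency counts [s1, ..., sm] of (attrs, label) pairs."""
    return list(Counter(label for _, label in samples).values())

def entropy_of_attribute(samples, attr):
    """E(A): information weighted over the partitions induced by A."""
    total, e = len(samples), 0.0
    for value in {attrs[attr] for attrs, _ in samples}:
        subset = [s for s in samples if s[0][attr] == value]
        e += len(subset) / total * info(class_counts(subset))
    return e

def gain(samples, attr):
    """Gain(A) = I(s1, ..., sm) - E(A)."""
    return info(class_counts(samples)) - entropy_of_attribute(samples, attr)
```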

21 Attribute Selection by Information Gain Computation
- Class P: buys_computer = "yes" (9 samples)
- Class N: buys_computer = "no" (5 samples)
- Information: $I(p, n) = I(9, 5) = 0.940$

22 Attribute Selection by Information Gain Computation
Compute the entropy for age:
$$E(\mathit{age}) = \frac{5}{14} I(2,3) + \frac{4}{14} I(4,0) + \frac{5}{14} I(3,2) = 0.694$$
where the term $\frac{5}{14} I(2,3)$ means "age <= 30" has 5 out of 14 samples, with 2 yes'es and 3 no's. Hence
$$\mathrm{Gain}(\mathit{age}) = I(9,5) - E(\mathit{age}) = 0.940 - 0.694 = 0.246$$
Similarly, Gain(income) = 0.029, Gain(student) = 0.151, Gain(credit_rating) = 0.048.
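Running the functions from slide 20 on the slide-10 training set reproduces these numbers, and plugging them into the slide-15 skeleton as the step-(6) heuristic builds the tree:

```python
print(round(info(class_counts(TRAINING_SET)), 3))   # 0.94  (I(9,5))
for attr in ("age", "income", "student", "credit_rating"):
    print(attr, round(gain(TRAINING_SET, attr), 3))
# age 0.246, income 0.029, student 0.151, credit_rating 0.048

def best_attribute(samples, attribute_list):
    """Step (6): pick the attribute with the highest information gain."""
    return max(attribute_list, key=lambda a: gain(samples, a))

tree = generate_decision_tree(
    TRAINING_SET, ["age", "income", "student", "credit_rating"], best_attribute)
```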

23 Attribute Selection by Information Gain Computation
Since age has the highest information gain, it becomes the root test, yielding the tree from slide 11:
age = '<=30'   -> student?        (no -> no; yes -> yes)
age = '31..40' -> yes
age = '>40'    -> credit_rating?  (excellent -> no; fair -> yes)