
Spatial and Temporal Data Mining, V. Megalooikonomou: Introduction to Decision Trees (based on notes by Jiawei Han and Micheline Kamber and on notes by Christos Faloutsos)

More details… Decision Trees
Decision trees:
- Problem
- Approach: classification through trees
  - Building phase: splitting policies
  - Pruning phase (to avoid over-fitting)

Decision Trees
Problem: classification, i.e., given a training set (N tuples, with M attributes, plus a label attribute), find rules to predict the label for newcomers. Pictorially:

Decision trees
(figure: the training set, pictorially)

Decision trees
Issues:
- missing values
- noise
- 'rare' events

Decision trees
Types of attributes:
- numerical (= continuous), e.g., 'salary'
- ordinal (= integer), e.g., '# of children'
- nominal (= categorical), e.g., 'car-type'
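To make the three attribute types concrete, a minimal sketch of what the training tuples might look like in Python; the attribute names and values are hypothetical, chosen to match the slide's examples:

training_set = [
    # salary (numerical), num_children (ordinal), car_type (nominal), label
    {"salary": 52000.0, "num_children": 2, "car_type": "sedan",   "label": "+"},
    {"salary": 31000.0, "num_children": 0, "car_type": "sports",  "label": "-"},
    {"salary": 78000.0, "num_children": 3, "car_type": "minivan", "label": "+"},
]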

Decision trees
Pictorially, we have:
(figure: training points plotted with num. attr#1, e.g., 'age', on one axis and num. attr#2, e.g., chol-level, on the other)

Decision trees
and we want to label '?':
(figure: the same plot, with an unlabeled point '?' to classify)

Decision trees
so we build a decision tree:
(figure: the same plot, recursively partitioned by splits at age = 50 and chol-level = 40)

Decision trees
so we build a decision tree:
(figure: the tree itself, one reading: the root tests 'age < 50'; its Y branch leads to a '+' leaf; its N branch tests 'chol. < 40', with a '-' leaf on one branch and further tests, elided as '...', on the other)
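Read as code, one plausible rendering of this tree (a sketch: the slide leaves one subtree as '...', so it is left unresolved here too):

def classify(age, chol_level):
    # One plausible reading of the slide's tree.
    if age < 50:
        return "+"
    if chol_level < 40:
        return "-"
    raise NotImplementedError("subtree elided ('...') on the slide")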

Decision trees
Typically, two steps:
- tree building
- tree pruning (for over-training/over-fitting)

Tree building
How?
(figure: the training points again, in the num. attr#1 ('age') vs. num. attr#2 (chol-level) plane)

Tree building
How? A: Partition, recursively. Pseudocode:

Partition(dataset S):
  if all points in S have the same label, then return
  evaluate splits along each attribute A_i
  pick the best split, to divide S into S1 and S2
  Partition(S1); Partition(S2)

Q1: how to introduce splits along attribute A_i?
Q2: how to evaluate a split?
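A runnable Python sketch of this recursion, under simplifying assumptions: numeric attributes only, binary splits at observed values, and splits scored by weighted entropy (Q2, below); the tuple-based tree format is an illustration choice, not the slide's:

from collections import Counter
import math

def entropy(labels):
    # H over the label distribution, in bits
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def partition(S):
    # S: nonempty list of (attribute_value_tuple, label) pairs
    labels = [y for _, y in S]
    if len(set(labels)) == 1:                     # all points have same label
        return ("leaf", labels[0])
    best = None
    for a in range(len(S[0][0])):                 # each attribute A_i
        for t in sorted({x[a] for x, _ in S}):    # candidate thresholds
            S1 = [(x, y) for x, y in S if x[a] < t]
            S2 = [(x, y) for x, y in S if x[a] >= t]
            if not S1 or not S2:
                continue
            cost = (len(S1) * entropy([y for _, y in S1])
                    + len(S2) * entropy([y for _, y in S2]))
            if best is None or cost < best[0]:    # keep the best split
                best = (cost, a, t, S1, S2)
    if best is None:                              # no useful split remains
        return ("leaf", Counter(labels).most_common(1)[0][0])
    _, a, t, S1, S2 = best
    return ("node", a, t, partition(S1), partition(S2))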

Tree building
Q1: how to introduce splits along attribute A_i?
A1:
- for numerical attributes: binary split, or multiple split
- for categorical attributes: compute all subsets (expensive!), or use a greedy approach
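A sketch of the expensive option for categorical attributes, enumerating every binary split of a k-value domain (2^(k-1) - 1 of them); a greedy approach would instead grow one side a value at a time:

from itertools import combinations

def binary_splits(values):
    # Fix the first value on the left side so each split is listed once.
    vals = sorted(values)
    first, rest = vals[0], vals[1:]
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            left = {first, *combo}
            right = set(vals) - left
            if right:                  # skip the trivial split (all, empty)
                yield left, right

for left, right in binary_splits({"sedan", "sports", "minivan"}):
    print(left, "vs", right)           # 3 splits for a 3-value domain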

Tree building
Q2: how to evaluate a split?
A: by how close to uniform each subset is, i.e., we need a measure of uniformity:

Tree building
entropy: H(p+, p-) = -p+ log2(p+) - p- log2(p-)
where p+ is the relative frequency of class + in S
(figure: H plotted against p+)
Any other measure?

Tree building
entropy: H(p+, p-)
'gini' index: 1 - p+^2 - p-^2
where p+ is the relative frequency of class + in S
(figure: both measures plotted against p+)

Tree building
Intuition:
- entropy: # bits to encode the class label
- gini: classification error, if we randomly guess '+' with probability p+
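Both measures, written out for the two-class case (the function names entropy2 and gini2 are illustration choices); both peak at p+ = 0.5, the most mixed set, and vanish at 0 and 1, a pure set:

import math

def entropy2(p_pos):
    # H(p+, p-) in bits; 0 * log(0) is taken as 0
    p_neg = 1.0 - p_pos
    return -sum(p * math.log2(p) for p in (p_pos, p_neg) if p > 0)

def gini2(p_pos):
    # 1 - p+^2 - p-^2, i.e., the error of guessing '+' with prob. p+
    return 1.0 - p_pos**2 - (1.0 - p_pos)**2

for p in (0.0, 0.25, 0.5, 1.0):
    print(p, round(entropy2(p), 4), round(gini2(p), 4))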

Tree building
Thus, we choose the split that reduces entropy/classification-error the most. E.g.:
(figure: the training points with a candidate split drawn in the num. attr#1 ('age') vs. num. attr#2 (chol-level) plane)

Tree building
Before the split: we need (n+ + n-) * H(p+, p-) = (7+6) * H(7/13, 6/13) bits in total, to encode all the class labels.
After the split we need: 0 bits for the first half, and (2+6) * H(2/8, 6/8) bits for the second half.
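Checking the slide's numbers: the split roughly halves the encoding cost, from about 12.94 bits to about 6.49 bits:

import math

def H(p):
    # binary entropy in bits; 0 * log(0) taken as 0
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

before = 13 * H(7 / 13)       # (7+6) * H(7/13, 6/13)  ->  ~12.94 bits
after = 0 + 8 * H(2 / 8)      # pure first half: 0 bits; plus (2+6) * H(2/8, 6/8)
print(round(before, 2), round(after, 2))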

Tree pruning
What for?
(figure: the training points in the num. attr#1 ('age') vs. num. attr#2 (chol-level) plane, over-split into many small regions)

Tree pruning
Q: How to do it?
(figure: the same over-split partitioning)

Tree pruning
Q: How to do it?
A1: use a 'training' and a 'testing' set; prune a node if doing so improves classification accuracy on the 'testing' set. (Drawbacks?)
A2: or, rely on MDL (= Minimum Description Length). In detail:
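A sketch of one reading of A1 (reduced-error pruning), reusing the tree format from the partition() sketch above; using the majority of a subtree's leaf labels as the candidate leaf is a simplifying assumption:

from collections import Counter

def predict(node, x):
    # classify tuple x with a tree built by partition()
    if node[0] == "leaf":
        return node[1]
    _, a, t, left, right = node
    return predict(left if x[a] < t else right, x)

def accuracy(node, data):
    return sum(predict(node, x) == y for x, y in data) / len(data)

def leaf_labels(node):
    if node[0] == "leaf":
        return [node[1]]
    return leaf_labels(node[3]) + leaf_labels(node[4])

def prune(node, test_set):
    # Bottom-up: replace a subtree by a leaf whenever that does not hurt
    # accuracy on the held-out 'testing' set.
    if node[0] == "leaf" or not test_set:
        return node
    _, a, t, left, right = node
    node = ("node", a, t,
            prune(left,  [(x, y) for x, y in test_set if x[a] < t]),
            prune(right, [(x, y) for x, y in test_set if x[a] >= t]))
    candidate = ("leaf", Counter(leaf_labels(node)).most_common(1)[0][0])
    if accuracy(candidate, test_set) >= accuracy(node, test_set):
        return candidate
    return node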

Tree pruning
Envision the problem as compression, and try to minimize the # bits to compress (a) the class labels AND (b) the representation of the decision tree.

(MDL) A brilliant idea. E.g.: the best n-degree polynomial to compress these points (figure: noisy data points along a curve) is the one that minimizes (sum of errors + n).
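To make the polynomial example concrete, a small sketch on synthetic data; taking the slide's '(sum of errors + n)' literally with squared errors, and the exact trade-off between error bits and model bits, are assumptions:

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = 1 + 2 * x - 3 * x**2 + rng.normal(0.0, 0.05, x.size)  # synthetic points

def mdl_cost(n):
    # description cost: residual error plus a charge of n for the model
    coeffs = np.polyfit(x, y, n)
    residuals = y - np.polyval(coeffs, x)
    return float(np.sum(residuals**2)) + n

best_degree = min(range(1, 8), key=mdl_cost)
print("degree chosen by (error + n):", best_degree)  # 2 for this data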

Major Issues in Data Mining
Issues relating to the diversity of data types:
- handling relational as well as complex types of data
- mining information from heterogeneous databases and global information systems (WWW)
Issues related to applications and social impacts:
- application of discovered knowledge
- domain-specific data mining tools
- intelligent query answering
- process control and decision making
- integration of the discovered knowledge with existing knowledge: a knowledge fusion problem
- protection of data security, integrity, and privacy

Summary
- Data mining: discovering interesting patterns from large amounts of data
- A natural evolution of database technology, in great demand, with wide applications
- A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation
- Mining can be performed in a variety of information repositories
- Data mining functionalities: characterization, discrimination, association, classification, clustering, outlier and trend analysis, etc.
- Classification of data mining systems
- Major issues in data mining