1
Data Analytics UNIT-IV: Classification
2
Chapter Sections Decision trees – overview, general algorithm, decision tree algorithms, evaluating a decision tree. Naïve Bayes – Bayes' theorem, naïve Bayes classifier, smoothing, diagnostics. Diagnostics of classifiers. Additional classification methods.
3
Classification Classification is widely used for prediction
Most classification methods are supervised
This chapter focuses on two fundamental classification methods: decision trees and naïve Bayes
4
Decision Trees Tree structure specifies sequence of decisions
Given input X = {x1, x2, …, xn}, predict output Y
Input attributes/features can be categorical or continuous
Node = tests a particular input variable
Root node and internal nodes test attributes; leaf nodes return class labels
Depth of a node = minimum number of steps required to reach it
Branch (connects two nodes) = specifies a decision
Two varieties of decision trees
Classification trees: categorical output, often binary
Regression trees: numeric output
5
Decision Trees Overview of a Decision Tree
Example of a decision tree: predicts whether customers will buy a product
6
Decision Trees Overview of a Decision Tree
Example: will a bank client subscribe to a term deposit?
7
Decision Trees The General Algorithm
Construct a tree T from training set S
Requires a measure of attribute information
Simplistic method (data from the previous figure)
Purity = probability of the corresponding class
E.g., P(no) = 1789/2000 = 89.45%, P(yes) = 10.55%
Entropy methods
Entropy measures the impurity of an attribute
Information gain measures the purity of an attribute
8
Decision Trees The General Algorithm
Entropy methods of attribute information
Entropy of X: H(X) = −Σx P(x) log2 P(x)
Information gain of an attribute A = base entropy − conditional entropy = H(Y) − H(Y|A)
(a computational sketch follows below)
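A minimal R sketch of these formulas, using the class counts from the previous slide (1789 no, 211 yes); the counts for the candidate attribute's split are assumed for illustration:

# Entropy of a discrete distribution, given class counts
entropy <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]                        # treat 0 * log2(0) as 0
  -sum(p * log2(p))
}

base <- c(no = 1789, yes = 211)        # counts from the 2000-record example
h_base <- entropy(base)                # base entropy H(Y), about 0.49 bits

# Hypothetical binary attribute A that splits S into two subsets
split1 <- c(no = 1000, yes = 40)       # assumed counts for A = a1
split2 <- c(no = 789, yes = 171)       # assumed counts for A = a2
h_cond <- (sum(split1) / 2000) * entropy(split1) +
  (sum(split2) / 2000) * entropy(split2)

info_gain <- h_base - h_cond           # information gain of attribute A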
9
Decision Trees The General Algorithm
Construct a tree T from training set S Choose root node = most informative attribute A Partition S according to A’s values Construct subtrees T1, T2… for the subsets of S recursively until one of following occurs All leaf nodes satisfy minimum purity threshold Tree cannot be further split with min purity threshold Other stopping criterion satisfied – e.g., max depth
10
Decision Trees Decision Tree Algorithms
ID3 algorithm [pseudocode], where T = training set, P = output variable, A = attribute
11
Decision Trees Decision Tree Algorithms
C4.5 algorithm
Handles missing data
Handles both categorical and continuous variables
Uses bottom-up pruning to address overfitting
CART (Classification And Regression Trees)
Also handles continuous variables
Uses the Gini diversity index as its information measure (see the R sketch below)
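CART is the algorithm implemented by R's rpart package, so a minimal sketch can reuse the bank data introduced later in this deck (the file and column names are taken from that example):

library(rpart)

# Same training file used in the ROC example later in this deck
banktrain <- read.table("bank-sample.csv", header = TRUE, sep = ",")

# Fit a CART classification tree; rpart splits on the Gini index by default
fit <- rpart(subscribed ~ ., data = banktrain, method = "class")

print(fit)                                      # the tree as a sequence of tests
predict(fit, banktrain[1:5, ], type = "class")  # predicted labels for five records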
12
Decision Trees Evaluating a Decision Tree
Decision trees are greedy algorithms
They take the best option at each step, which may not be best overall
Addressed by ensemble methods such as random forest
The model might overfit the data
[Figure: error curves as the tree grows; blue = training set, red = test set]
Ways to overcome overfitting:
Stop growing the tree early
Grow the full tree, then prune (a sketch follows below)
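A sketch of the grow-then-prune approach with rpart, continuing the previous example; the complexity-parameter settings here are illustrative:

# Grow a deliberately large tree by lowering the complexity parameter cp
fit_full <- rpart(subscribed ~ ., data = banktrain, method = "class",
                  control = rpart.control(cp = 0.001))

printcp(fit_full)   # cross-validated error (xerror) for each subtree size

# Prune back to the subtree with the lowest cross-validated error
best_cp <- fit_full$cptable[which.min(fit_full$cptable[, "xerror"]), "CP"]
fit_pruned <- prune(fit_full, cp = best_cp)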
13
Decision Trees Evaluating a Decision Tree
Decision trees produce rectangular decision regions
14
Decision Trees Evaluating a Decision Tree
Advantages of decision trees
Computationally inexpensive
Outputs are easy to interpret: a sequence of tests
Show the importance of each input variable
Decision trees handle:
Both numerical and categorical attributes
Categorical attributes with many distinct values
Variables with nonlinear effects on the outcome
Variable interactions
15
Decision Trees Evaluating a Decision Tree
Disadvantages of decision trees
Sensitive to small variations in the training data
Overfitting can occur because each split reduces the training data available for subsequent splits
Poor performance if the dataset contains many irrelevant variables
16
Chapter Sections Decision trees – overview, general algorithm, decision tree algorithms, evaluating a decision tree. Naïve Bayes – Bayes' theorem, naïve Bayes classifier, smoothing, diagnostics. Diagnostics of classifiers. Additional classification methods.
17
Naïve Bayes The naïve Bayes classifier
Based on Bayes' theorem (also called Bayes' law)
Assumes the features contribute independently
Features (variables) are generally categorical
Continuous variables are handled by discretization: converting them into categorical ones (see the sketch below)
Output is usually a class label plus a probability score
Log probabilities are often used instead of raw probabilities
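A minimal sketch of discretization in R with cut(); the variable and breakpoints are assumed for illustration:

age <- c(23, 35, 47, 58, 62, 71)   # assumed continuous attribute values

# Convert to categories so naïve Bayes can tabulate conditional probabilities
age_band <- cut(age,
                breaks = c(0, 30, 50, 65, Inf),
                labels = c("young", "middle", "senior", "retired"))
table(age_band)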
18
Naïve Bayes Bayes Theorem
Bayes' theorem: P(C|A) = P(A|C) P(C) / P(A)
where C = class, A = observed attributes
A typical medical example (interpreting a diagnostic test result) is used because doctors frequently get this wrong (a worked sketch follows below)
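A worked sketch of the medical example; the prevalence, sensitivity, and false positive rate below are assumed for illustration, not taken from the deck:

prior <- 0.01   # assumed disease prevalence, P(C)
sens  <- 0.95   # assumed sensitivity, P(A|C): positive test given disease
fpr   <- 0.05   # assumed false positive rate, P(A|not C)

# Bayes' theorem: P(C|A) = P(A|C) * P(C) / P(A)
evidence  <- sens * prior + fpr * (1 - prior)   # total probability P(A)
posterior <- sens * prior / evidence            # about 0.16

Even with a 95%-sensitive test, a positive result here implies only about a 16% chance of disease, because the disease is rare.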
19
Naïve Bayes Naïve Bayes Classifier
Conditional independence assumption: P(a1, a2, …, am | cj) = P(a1|cj) P(a2|cj) ··· P(am|cj)
Dropping the common denominator P(A), we get: P(cj|A) ∝ P(cj) Πi P(ai|cj)
Find the cj that maximizes P(cj|A)
20
Naïve Bayes Naïve Bayes Classifier
Example: client subscribes to term deposit? The following record is from a bank client. Is this client likely to subscribe to the term deposit?
21
Naïve Bayes Naïve Bayes Classifier
Compute probabilities for this record
22
Naïve Bayes Naïve Bayes Classifier
Compute the naïve Bayes classifier outputs: yes/no
The client is assigned the label subscribed = yes
The scores are small, but the ratio is what counts
Using logarithms helps avoid numerical underflow (see the sketch below)
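A minimal sketch of this scoring in R; the prior and conditional probabilities below are assumed for illustration:

p_yes <- 0.11; cond_yes <- c(0.35, 0.60, 0.30)   # assumed P(yes) and P(ai|yes)
p_no  <- 0.89; cond_no  <- c(0.10, 0.20, 0.15)   # assumed P(no) and P(ai|no)

score_yes <- p_yes * prod(cond_yes)   # proportional to P(yes|A)
score_no  <- p_no * prod(cond_no)     # proportional to P(no|A)

# The same comparison in log space, which avoids underflow when
# many small conditional probabilities are multiplied together
log_yes <- log(p_yes) + sum(log(cond_yes))
log_no  <- log(p_no) + sum(log(cond_no))
ifelse(log_yes > log_no, "yes", "no")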
23
Naïve Bayes Smoothing A smoothing technique assigns a small nonzero probability to rare events that are missing from the training data
E.g., Laplace smoothing adds one to every count, so each attribute value is treated as occurring once more often than it does in the dataset
Smoothing is essential: without it, a single zero conditional probability forces P(cj|A) = 0 (see the sketch below)
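A sketch of Laplace smoothing on assumed counts; the last comment notes the corresponding argument in e1071's naiveBayes:

counts <- c(blue = 0, green = 12, red = 8)   # assumed value counts within a class

# Unsmoothed estimates: the unseen value 'blue' gets probability 0
counts / sum(counts)

# Laplace smoothing: add 1 to every count before normalizing
(counts + 1) / (sum(counts) + length(counts))

# e1071's naiveBayes exposes the same idea through its 'laplace' argument:
# naiveBayes(subscribed ~ ., data = banktrain, laplace = 1)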
24
Naïve Bayes Diagnostics
Naïve Bayes advantages
Handles missing values
Robust to irrelevant variables
Simple to implement
Computationally efficient
Handles high-dimensional data efficiently
Often competitive with other learning algorithms
Reasonably resistant to overfitting
Naïve Bayes disadvantages
Assumes variables are conditionally independent
Therefore sensitive to double counting correlated variables
In its simplest form, used only for categorical variables
25
Naïve Bayes Naïve Bayes in R
This section explores two methods of using the naïve Bayes classifier
Manually compute the probabilities from scratch: tedious, with many R calculations
Use the naiveBayes function from the e1071 package: much easier (starts on page 222; a minimal sketch follows below)
Example: subscribing to a term deposit
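A minimal sketch of the e1071 route, using the same files and column names as the ROC example later in this deck:

library(e1071)

banktrain <- read.table("bank-sample.csv", header = TRUE, sep = ",")
banktest  <- read.table("bank-sample-test.csv", header = TRUE, sep = ",")

# Fit the naïve Bayes model: tabulates P(cj) and P(ai|cj) from the training set
nb_model <- naiveBayes(subscribed ~ ., data = banktrain)

predict(nb_model, banktest, type = "class")   # predicted labels
predict(nb_model, banktest, type = "raw")     # posterior probability scores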
26
Chapter Sections Decision trees – overview, general algorithm, decision tree algorithms, evaluating a decision tree. Naïve Bayes – Bayes' theorem, naïve Bayes classifier, smoothing, diagnostics. Diagnostics of classifiers. Additional classification methods.
27
Diagnostics of Classifiers
The book covered three classifiers
Logistic regression, decision trees, and naïve Bayes
Tools to evaluate classifier performance
Confusion matrix
28
Diagnostics of Classifiers
Bank marketing example
Training set of 2000 records
Test set of 100 records, evaluated below
29
Diagnostics of Classifiers
Evaluation metrics, computed from the confusion-matrix counts TP, TN, FP, FN:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall (true positive rate) = TP / (TP + FN)
False positive rate = FP / (FP + TN)
False negative rate = FN / (TP + FN)
(a computational sketch follows below)
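A sketch computing these metrics in R; the confusion-matrix counts below are assumed for a 100-record test set, not the deck's actual numbers:

TP <- 3; FN <- 8; FP <- 2; TN <- 87   # assumed counts; they sum to 100

accuracy  <- (TP + TN) / (TP + TN + FP + FN)
precision <- TP / (TP + FP)
recall    <- TP / (TP + FN)   # true positive rate
fpr       <- FP / (FP + TN)   # false positive rate
fnr       <- FN / (TP + FN)   # false negative rate
c(accuracy = accuracy, precision = precision, recall = recall,
  fpr = fpr, fnr = fnr)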
30
Diagnostics of Classifiers
Evaluation metrics on the 100-record bank marketing test set
[Table: the computed metrics, two of which were marked poor]
31
Diagnostics of Classifiers
ROC curve: good for evaluating binary classifiers
Bank marketing: 2000-record training set, 100-record test set

library(e1071)   # provides naiveBayes
library(ROCR)    # provides prediction and performance

banktrain <- read.table("bank-sample.csv", header = TRUE, sep = ",")
drops <- c("balance", "day", "campaign", "pdays", "previous", "month")
banktrain <- banktrain[, !(names(banktrain) %in% drops)]
banktest <- read.table("bank-sample-test.csv", header = TRUE, sep = ",")
banktest <- banktest[, !(names(banktest) %in% drops)]

nb_model <- naiveBayes(subscribed ~ ., data = banktrain)

# Posterior probability of 'yes' for each test record
nb_prediction <- predict(nb_model, banktest[, -ncol(banktest)], type = "raw")
score <- nb_prediction[, "yes"]
actual_class <- banktest$subscribed == "yes"

# Pair scores with true labels; this call needs library(ROCR) loaded above
pred <- prediction(score, actual_class)
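Continuing the sketch, ROCR's performance function yields the ROC curve and the area under it:

# True positive rate vs. false positive rate across all score thresholds
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf, main = "ROC curve, naive Bayes on bank marketing")

# Area under the ROC curve (1.0 = perfect, 0.5 = random guessing)
auc <- performance(pred, measure = "auc")
auc@y.values[[1]]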
32
Diagnostics of Classifiers
ROC curve: good for evaluating binary classifiers
[Figure: the resulting ROC curve for the bank marketing test set]
33
Chapter Sections Decision trees – overview, general algorithm, decision tree algorithms, evaluating a decision tree. Naïve Bayes – Bayes' theorem, naïve Bayes classifier, smoothing, diagnostics. Diagnostics of classifiers. Additional classification methods.
34
Additional Classification Methods
Ensemble methods use multiple models
Bagging: bootstrap method that uses repeated sampling with replacement
Boosting: similar to bagging, but an iterative procedure
Random forest: uses an ensemble of decision trees (see the sketch below)
These models usually perform better than a single decision tree
Support Vector Machine (SVM)
Linear model defined by a small number of support vectors
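A minimal random forest sketch with the randomForest package, reusing the bank data from the earlier examples; the ntree and mtry values are illustrative:

library(randomForest)

# Classification requires a factor response
banktrain$subscribed <- as.factor(banktrain$subscribed)

rf_model <- randomForest(subscribed ~ ., data = banktrain,
                         ntree = 500,   # number of bagged trees
                         mtry = 3)      # attributes tried at each split
rf_model                                # out-of-bag error and confusion matrix
predict(rf_model, banktest)             # labels for the test records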
35
Summary How to choose a suitable classifier among
Decision trees, naïve Bayes, and logistic regression