Data Mining: Practical Machine Learning Tools and Techniques
Chapter 4: Algorithms: The Basic Methods
Section 4.2: Statistical Modeling
Rodney Nielsen
Many of these slides were adapted from: I. H. Witten, E. Frank and M. A. Hall
Rodney Nielsen, Human Intelligence & Language Technologies Lab
Algorithms: The Basic Methods
● Inferring rudimentary rules
● Naïve Bayes, probabilistic model
● Constructing decision trees
● Constructing rules
● Association rule learning
● Linear models
● Instance-based learning
● Clustering
Probabilistic Modeling
● “Opposite” of 1R: use all the attributes
● Two assumptions: attributes are
  ● Equally important
  ● Statistically independent (given the class value)
    ● I.e., knowing the value of one attribute says nothing about the value of another (if the class is known)
● The independence assumption is never correct!
● But … this scheme works well in practice
Probabilities for Weather Data

Counts and relative frequencies from the 14 training instances:

Outlook    Yes  No  P(·|yes)  P(·|no)
Sunny       2    3    2/9      3/5
Overcast    4    0    4/9      0/5
Rainy       3    2    3/9      2/5

Temperature  Yes  No  P(·|yes)  P(·|no)
Hot           2    2    2/9      2/5
Mild          4    2    4/9      2/5
Cool          3    1    3/9      1/5

Humidity   Yes  No  P(·|yes)  P(·|no)
High        3    4    3/9      4/5
Normal      6    1    6/9      1/5

Windy      Yes  No  P(·|yes)  P(·|no)
False       6    2    6/9      2/5
True        3    3    3/9      3/5

Play: Yes = 9 (9/14), No = 5 (5/14)

The weather data:

Outlook   Temp.  Humidity  Windy  Play
Sunny     Hot    High      False  No
Sunny     Hot    High      True   No
Overcast  Hot    High      False  Yes
Rainy     Mild   High      False  Yes
Rainy     Cool   Normal    False  Yes
Rainy     Cool   Normal    True   No
Overcast  Cool   Normal    True   Yes
Sunny     Mild   High      False  No
Sunny     Cool   Normal    False  Yes
Rainy     Mild   Normal    False  Yes
Sunny     Mild   Normal    True   Yes
Overcast  Mild   High      True   Yes
Overcast  Hot    Normal    False  Yes
Rainy     Mild   High      True   No
Probabilities for Weather Data
● A new day:

  Outlook  Temp.  Humidity  Windy  Play
  Sunny    Cool   High      True   ?

Likelihood of the two classes:
For “yes” = 2/9 × 3/9 × 3/9 × 3/9 × 9/14 = 0.0053
For “no” = 3/5 × 1/5 × 4/5 × 3/5 × 5/14 = 0.0206
Conversion into a probability by normalization:
P(“yes”) = 0.0053 / (0.0053 + 0.0206) = 0.205
P(“no”) = 0.0206 / (0.0053 + 0.0206) = 0.795
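The arithmetic above can be checked with a short Python sketch, using the fractions from the probability table:

```python
# Likelihoods for the new day (Sunny, Cool, High, True), taken
# straight from the per-class fractions in the probability table.
p_yes = (2/9) * (3/9) * (3/9) * (3/9) * (9/14)   # likelihood of "yes"
p_no  = (3/5) * (1/5) * (4/5) * (3/5) * (5/14)   # likelihood of "no"

# Normalize so the two scores sum to 1.
total = p_yes + p_no
print(round(p_yes / total, 3))  # 0.205
print(round(p_no / total, 3))   # 0.795
```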
Bayes’ Rule
● Probability of hypothesis h given evidence e:
  P(h | e) = P(e | h) P(h) / P(e)
● A priori probability of h: P(h)
  ● Probability of the hypothesis before evidence is seen
● A posteriori probability of h: P(h | e)
  ● Probability of the hypothesis after evidence is seen

Thomas Bayes
Born: 1702 in London, England
Died: 1761 in Tunbridge Wells, Kent, England
Bayes’ Rule
● Probability of hypothesis h given evidence e:
  P(h | e) = P(e | h) P(h) / P(e)
● Independence assumption (evidence splits into parts e1, …, en):
  P(h | e) = P(e1 | h) × P(e2 | h) × … × P(en | h) × P(h) / P(e)
Naïve Bayes for Classification
● Classification learning: what’s the probability of the class given an instance?
  ● Evidence e = instance
  ● Hypothesis h = class value for the instance
● Naïve assumption: evidence splits into parts (i.e., attributes) that are independent given the class
Weather Data Example
● Evidence e (a new day):

  Outlook  Temp.  Humidity  Windy  Play
  Sunny    Cool   High      True   ?

● Probability of class “yes”:
  P(yes | e) = P(Sunny | yes) × P(Cool | yes) × P(High | yes) × P(True | yes) × P(yes) / P(e)
             = (2/9 × 3/9 × 3/9 × 3/9 × 9/14) / P(e)
The “Zero-frequency Problem”
● What if an attribute value doesn’t occur with every class value? (e.g. “Humidity = high” for class “yes”) Probability will be zero! A posteriori probability will also be zero! (No matter how likely the other values are!)
● Remedy: add 1 to the count for every attribute value-class combination (Laplace estimator)
● Result: probabilities will never be zero! (also: stabilizes probability estimates)
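A minimal sketch of the Laplace estimator (smoothed_prob is a hypothetical helper; the observations are the outlook counts for class “yes” from the weather data):

```python
from collections import Counter

def smoothed_prob(value, observations, num_values):
    """P(value | class) with add-one (Laplace) smoothing."""
    counts = Counter(observations)
    return (counts[value] + 1) / (len(observations) + num_values)

# Outlook values seen for class "yes": sunny x2, overcast x4, rainy x3.
outlook_yes = ["sunny"] * 2 + ["overcast"] * 4 + ["rainy"] * 3

print(smoothed_prob("sunny", outlook_yes, 3))  # (2+1)/(9+3) = 0.25
```

Note that even a value never observed with this class now gets a small nonzero probability, so the product over attributes can never collapse to zero.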
Modified Probability Estimates
● In some cases adding a constant different from 1 might be more appropriate
● Example: attribute outlook for class yes, with constant μ and prior weights p1 + p2 + p3 = 1:
  Sunny: (2 + μ·p1) / (9 + μ)   Overcast: (4 + μ·p2) / (9 + μ)   Rainy: (3 + μ·p3) / (9 + μ)
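This generalization can be sketched as follows (m_estimate is a hypothetical helper; with μ = 3 and a uniform prior of 1/3 it reduces to the Laplace estimator):

```python
def m_estimate(count, n, mu, prior):
    """Blend the observed frequency count/n with a prior, weighted by mu."""
    return (count + mu * prior) / (n + mu)

# Outlook for class "yes": sunny was seen 2 times out of n = 9.
print(m_estimate(2, 9, 3, 1/3))   # (2 + 1) / (9 + 3) = 0.25
print(m_estimate(2, 9, 10, 1/3))  # heavier prior pulls the estimate toward 1/3
```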
Missing Values
● Training: the instance is not included in the frequency count for that attribute value-class combination
● Classification: the attribute is omitted from the calculation
● Example:

  Outlook  Temp.  Humidity  Windy  Play
  ?        Cool   High      True   ?

Likelihood of “yes” = 3/9 × 3/9 × 3/9 × 9/14 = 0.0238
Likelihood of “no” = 1/5 × 4/5 × 3/5 × 5/14 = 0.0343
P(“yes”) = 0.0238 / (0.0238 + 0.0343) = 41%
P(“no”) = 0.0343 / (0.0238 + 0.0343) = 59%
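A sketch of this calculation: the unknown Outlook simply contributes no factor to either product.

```python
# Outlook is missing, so its factor is dropped from both likelihoods;
# the remaining factors are Cool, High, True, and the class prior.
p_yes = (3/9) * (3/9) * (3/9) * (9/14)
p_no  = (1/5) * (4/5) * (3/5) * (5/14)

total = p_yes + p_no
print(round(100 * p_yes / total))  # 41 (%)
print(round(100 * p_no / total))   # 59 (%)
```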
Numeric Attributes
● Usual assumption: attributes have a normal or Gaussian probability distribution (given the class)
● The probability density function for the normal distribution is defined by two parameters:
  ● Mean μ
  ● Standard deviation σ
  f(x) = 1 / (√(2π) σ) · e^(−(x − μ)² / (2σ²))
Statistics for Weather Data
● Class-conditional statistics for the numeric attributes (mean, standard deviation):
  Temperature: yes (73, 6.2), no (74.6, 7.9)
  Humidity:    yes (79.1, 10.2), no (86.2, 9.7)
● Example density value:
  f(temperature = 66 | yes) = 1 / (√(2π) · 6.2) · e^(−(66 − 73)² / (2 · 6.2²)) = 0.0340
Classifying a New Day
● A new day:

  Outlook  Temp.  Humidity  Windy  Play
  Sunny    66     90        True   ?

● Missing values during training are not included in the calculation of mean and standard deviation
Likelihood of “yes” = 2/9 × 0.0340 × 0.0221 × 3/9 × 9/14 = 0.000036
Likelihood of “no” = 3/5 × 0.0279 × 0.0381 × 3/5 × 5/14 = 0.000137
P(“yes”) = 0.000036 / (0.000036 + 0.000137) = 21%
P(“no”) = 0.000137 / (0.000036 + 0.000137) = 79%
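The density lookups and the full classification can be sketched in Python, assuming the class-conditional statistics computed from the weather data (temperature: yes mean 73, std 6.2, no mean 74.6, std 7.9; humidity: yes mean 79.1, std 10.2, no mean 86.2, std 9.7):

```python
import math

def gaussian(x, mean, std):
    """Normal probability density, used for numeric attributes."""
    return math.exp(-(x - mean) ** 2 / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)

# Example density value from the statistics slide:
print(f"{gaussian(66, 73, 6.2):.4f}")  # 0.0340

# Likelihoods for the new day (Sunny, temperature 66, humidity 90, windy):
p_yes = (2/9) * gaussian(66, 73, 6.2) * gaussian(90, 79.1, 10.2) * (3/9) * (9/14)
p_no  = (3/5) * gaussian(66, 74.6, 7.9) * gaussian(90, 86.2, 9.7) * (3/5) * (5/14)

print(f"P(yes) = {p_yes / (p_yes + p_no):.2f}")  # 0.21
```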
Naïve Bayes: Discussion
● Naïve Bayes works surprisingly well (even if independence assumption is clearly violated)
● Why? Because classification doesn’t require accurate probability estimates as long as maximum probability is assigned to correct class
● However: adding too many redundant attributes will cause problems (e.g. identical attributes)
● Note also: many numeric attributes are not normally distributed