THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY CSIT 5220: Reasoning and Decision under Uncertainty L10: Model-Based Classification and Clustering Nevin L. Zhang


THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY CSIT 5220: Reasoning and Decision under Uncertainty L10: Model-Based Classification and Clustering Nevin L. Zhang, Room 3504

L10: Model-Based Classification and Clustering
- Probabilistic Models (PMs) for Classification
- PMs for Clustering

Classification
- The problem:
  - Given data
  - Find a mapping (A1, A2, …, An) → C
- Possible solutions:
  - Artificial neural networks (ANN)
  - Decision trees (Quinlan)
  - …
  - (SVM, for continuous data)

Probabilistic Approach to Classification

Will Boss Play Tennis? [worked example; slide figures not in the transcript]

Bayesian Networks for Classification
- The Naïve Bayes model often performs well in practice.
- Drawbacks of Naïve Bayes:
  - It assumes attributes are mutually independent given the class variable.
  - This assumption is often violated, leading to double counting of evidence.
- Fixes:
  - General BN classifiers
  - Tree-augmented Naïve Bayes (TAN) models
  - …
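A minimal sketch of the Naïve Bayes computation on a tennis-style dataset. The rows and attribute names below are illustrative, not the table from the slides; Laplace smoothing is one common choice, not necessarily the one used in the lecture.

```python
from collections import Counter, defaultdict

# Toy "will boss play tennis?"-style data; the rows here are illustrative,
# not the table from the slides.  Each row is (outlook, windy, play).
data = [
    ("sunny", "false", "no"), ("sunny", "true", "no"),
    ("overcast", "false", "yes"), ("rain", "false", "yes"),
    ("rain", "true", "no"), ("overcast", "true", "yes"),
    ("sunny", "false", "yes"), ("rain", "false", "yes"),
]

def train_nb(rows):
    """Collect the counts needed for P(C) and P(A_i | C)."""
    n_attrs = len(rows[0]) - 1
    class_counts = Counter(r[-1] for r in rows)
    attr_counts = [defaultdict(int) for _ in range(n_attrs)]
    n_vals = [len({r[i] for r in rows}) for i in range(n_attrs)]
    for r in rows:
        for i, v in enumerate(r[:-1]):
            attr_counts[i][(v, r[-1])] += 1
    return class_counts, attr_counts, n_vals

def classify_nb(instance, class_counts, attr_counts, n_vals):
    """Return argmax_c P(C=c) * prod_i P(A_i=a_i | C=c)."""
    n = sum(class_counts.values())
    best, best_score = None, -1.0
    for cls, cc in class_counts.items():
        score = cc / n                      # prior P(C=c)
        for i, v in enumerate(instance):    # likelihood terms, Laplace-smoothed
            score *= (attr_counts[i][(v, cls)] + 1) / (cc + n_vals[i])
        if score > best_score:
            best, best_score = cls, score
    return best

model = train_nb(data)
print(classify_nb(("overcast", "false"), *model))  # -> "yes"
```

The smoothing term `(count + 1) / (cc + n_vals[i])` keeps unseen attribute values from zeroing out an entire class score.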

Bayesian Networks for Classification
- General BN classifier:
  - Treat the class variable just as another variable.
  - Learn a BN from the data.
  - Classify a new instance based on the values of the variables in the Markov blanket of the class variable.
  - Often performs poorly: because classification depends only on the Markov blanket (boundary), information from attributes outside it goes unused.

Bayesian Networks for Classification
- Tree-Augmented Naïve Bayes (TAN) model:
  - Captures dependence among attributes using a tree structure.
  - Learning:
    - First learn a tree among the attributes using the Chow–Liu algorithm.
    - This is a special structure-learning problem, and an easy one.
    - Then add the class variable and estimate the parameters.
  - Classification:
    - argmax_c P(C=c | A1=a1, …, An=an)
    - Computed by BN inference.
  - Many other methods exist.

Chow–Liu Trees
- Task: find a tree model over the observed variables that has maximum likelihood given the data.
- Maximized log-likelihood
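The slide's formula is not in the transcript. For completeness, the standard Chow–Liu result (stated here from the literature, not taken from the slides) is that for a fixed tree $T$ with maximum-likelihood parameters $\hat\theta$ and $N$ data cases,

```latex
\max_{\theta} \log L(T, \theta \mid \mathcal{D})
  \;=\; N \sum_{(u,v) \in T} I(X_u; X_v) \;-\; N \sum_{u} H(X_u),
```

where $I$ is the empirical mutual information and $H$ the empirical entropy. The second sum does not depend on the tree structure, so maximizing the likelihood over trees reduces to maximizing the sum of edge mutual informations.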


Chow–Liu Trees
- Mutual information I(X; Y) measures the dependence between a pair of attributes X and Y.
- The task is equivalent to finding a maximum spanning tree of the weighted undirected graph over the attributes in which edge (u, v) carries weight I(X_u; X_v).
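A small sketch of computing the empirical mutual information that serves as an edge weight; the data rows below are hypothetical.

```python
import math
from collections import Counter

def mutual_information(rows, i, j):
    """Empirical mutual information I(X_i; X_j), in nats, from a list of tuples."""
    n = len(rows)
    pij = Counter((r[i], r[j]) for r in rows)  # joint counts
    pi = Counter(r[i] for r in rows)           # marginal counts of X_i
    pj = Counter(r[j] for r in rows)           # marginal counts of X_j
    mi = 0.0
    for (a, b), c in pij.items():
        # sum over observed (a, b): P(a,b) * log[ P(a,b) / (P(a) P(b)) ]
        mi += (c / n) * math.log((c / n) / ((pi[a] / n) * (pj[b] / n)))
    return mi

# Hypothetical data: X1 copies X0 exactly; X2 is an independent coin.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
print(mutual_information(rows, 0, 1))  # log 2 (full dependence)
print(mutual_information(rows, 0, 2))  # 0.0 (independence)
```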

Maximum Spanning Trees

Illustration of Kruskal's Algorithm [figures not in the transcript]
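The illustration itself is lost, so here is a compact sketch of Kruskal's algorithm adapted to the maximum (rather than minimum) spanning-tree case; the edge weights below are hypothetical mutual-information values, not from the slides.

```python
def maximum_spanning_tree(n, edges):
    """Kruskal's algorithm over nodes 0..n-1 with edges [(weight, u, v)]:
    scan edges in decreasing weight order, keeping each edge that joins
    two different components (tracked with a union-find structure)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:                       # edge creates no cycle: keep it
            parent[ru] = rv
            tree.append((u, v))
            if len(tree) == n - 1:
                break
    return tree

# Hypothetical mutual-information weights between 4 attributes.
edges = [(0.9, 0, 1), (0.1, 0, 2), (0.8, 1, 2), (0.7, 2, 3), (0.2, 1, 3)]
print(maximum_spanning_tree(4, edges))  # -> [(0, 1), (1, 2), (2, 3)]
```

Running Kruskal with the mutual-information weights from the previous slide yields exactly the Chow–Liu tree.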

L10: Probabilistic Models (PMs) for Classification and Clustering
- Probabilistic Models (PMs) for Classification
- PMs for Clustering


A Medical Application
- In medical diagnosis, a gold standard sometimes exists.
- Example: Lung Cancer
  - Symptoms: persistent cough, hemoptysis (coughing up blood), constant chest pain, shortness of breath, fatigue, etc.
  - Information for diagnosis: symptoms, medical history, smoking history, X-ray, sputum.
  - Gold standard: biopsy, the removal of a small sample of tissue for examination under a microscope by a pathologist.

A Medical Application
- Sometimes no gold standard exists.
- Example: Rheumatoid Arthritis (RA)
  - Symptoms: back pain, neck pain, joint pain, joint swelling, morning joint stiffness, etc.
  - Information for diagnosis:
    - Symptoms, medical history, physical exam.
    - Lab tests, including a test for rheumatoid factor (an antibody found in the blood of about 80 percent of adults with RA).
  - No gold standard:
    - None of the symptoms, alone or in combination, is a clear-cut indicator of RA.
    - The presence or absence of rheumatoid factor does not by itself establish whether one has RA.

LC Analysis of Hannover Rheumatoid Arthritis Data
- Class-specific probabilities [figure not in the transcript]
- Cluster 1: "disease-free"
- Cluster 2: "back-pain type"
- Cluster 3: "joint type"
- Cluster 4: "severe type"
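To make the latent class (LC) idea concrete, here is a small EM sketch for an LC model: one latent class variable C with k values, and binary symptom attributes assumed mutually independent given C. This is a hypothetical illustration under simplifying assumptions, not the analysis code used for the Hannover data.

```python
import random

def posterior(row, pc, theta):
    """P(C=c | row): prior times class-conditional likelihood, normalized."""
    post = []
    for c in range(len(pc)):
        p = pc[c]
        for j, x in enumerate(row):
            p *= theta[c][j] if x else 1.0 - theta[c][j]
        post.append(p)
    z = sum(post)
    return [p / z for p in post]

def lc_em(data, k, n_iter=50, seed=0):
    """EM for a latent class model with binary attributes.
    pc[c] = P(C=c); theta[c][j] = P(A_j = 1 | C=c)."""
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    pc = [1.0 / k] * k
    theta = [[rng.uniform(0.25, 0.75) for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # E-step: responsibilities P(C=c | row) for every data row.
        resp = [posterior(row, pc, theta) for row in data]
        # M-step: re-estimate P(C) and P(A_j=1 | C) from expected counts.
        for c in range(k):
            nc = sum(r[c] for r in resp) + 1e-12
            pc[c] = nc / n
            for j in range(m):
                theta[c][j] = sum(resp[i][c] for i in range(n) if data[i][j]) / nc
    return pc, theta

# Two obvious clusters: all-ones rows and all-zeros rows.
data = [(1, 1, 1)] * 4 + [(0, 0, 0)] * 4
pc, theta = lc_em(data, k=2)
print([round(p, 2) for p in pc])
```

After convergence, the two latent classes separate the all-ones rows from the all-zeros rows, in the same spirit as the four RA clusters above being read off the class-specific probabilities.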