Institute of Systems and Robotics (ISR) – Coimbra, Mobile Robotics Lab. Bayesian Approaches. jrett



Retrospective: Bayesian Multimodal Perception by J. F. Ferreira. Bayes' theorem (Bayes rule). Knowledge of past behavior and state forms a prediction of the current state. Non-Gaussian likelihood functions. Multimodal sensing in human perception.

Retrospective: The prior distribution of the object position is unknown => flat (uniform). The noise in each modality is independent, so the bimodal posterior distribution is the product of the unimodal distributions. Simplification: the probability distributions are Gaussian.
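With the Gaussian simplification, the product of two unimodal likelihoods can be computed in closed form: the fused estimate is a precision-weighted average of the two cues. A minimal sketch (the cue means and variances below are made-up illustrative values, not numbers from the slides):

```python
# Fusing two independent Gaussian sensory cues about an object position.
# With a flat prior, the posterior is proportional to the product of the
# two unimodal likelihoods; for Gaussians the result is again Gaussian
# with a precision-weighted mean and a smaller variance.

def fuse_gaussian_cues(mu1, var1, mu2, var2):
    """Product of two Gaussian likelihoods -> posterior mean and variance."""
    prec1, prec2 = 1.0 / var1, 1.0 / var2   # precisions (inverse variances)
    var_post = 1.0 / (prec1 + prec2)        # fused variance shrinks
    mu_post = var_post * (prec1 * mu1 + prec2 * mu2)
    return mu_post, var_post

# Example: a precise cue (e.g. vision) and a noisy cue (e.g. audition).
mu, var = fuse_gaussian_cues(mu1=0.0, var1=1.0, mu2=4.0, var2=4.0)
print(mu, var)  # fused estimate lies closer to the more reliable cue
```

Note how the fused mean ends up nearer the low-variance cue, which is exactly the behavior reported for multimodal human perception.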

1. Introduction to Pattern Recognition. Example: “Sorting incoming fish on a conveyor according to species using optical sensing”.

1. Introduction to Pattern Recognition. Example: fish classifier. Selecting the length feature; selecting the lightness feature.

1. Introduction to Pattern Recognition. Example: fish classifier. Selecting two features and defining a simple straight line as the decision boundary. A highly complex boundary gives the best performance on the training set but a complicated classifier that will not perform well on novel patterns. Search for the optimal tradeoff between performance on the training set and simplicity.

1. Introduction to Pattern Recognition. Pattern recognition system: input → sensing → segmentation → feature extraction → classification → post-processing → decision. Issues along the pipeline: invariant features (translation, rotation, scale, occlusion, projective distortion, rate, deformation), feature selection, noise, missing features, error rate, risk, context, multiple classifiers.

1. Introduction to Pattern Recognition. Design cycle: start → collect data → choose features → choose model → train classifier → evaluate classifier → end. Prior knowledge informs the choice of features and model; overfitting is a risk when training the classifier.

2. Continuous Features. State of nature ω. Finite set of c states of nature (‘categories’) {ω1, …, ωc}. Prior P(ωj). If the set of states of nature is finite, a decision rule using the priors alone (for c = 2) is: Decide ω1 if P(ω1) > P(ω2); otherwise decide ω2.
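The prior-only rule above can be sketched in a few lines; the prior values are illustrative:

```python
# Prior-only decision rule: with no observation, always choose the
# state of nature with the largest prior probability.

def decide_by_prior(priors):
    """Return the index of the state of nature with the largest prior."""
    return max(range(len(priors)), key=lambda j: priors[j])

priors = [0.7, 0.3]             # P(w1) = 0.7, P(w2) = 0.3 (illustrative)
print(decide_by_prior(priors))  # -> 0, i.e. always decide w1
```

This rule is sensible only before any feature is observed; the next slides refine it with class-conditional densities.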

2. Continuous Features. Feature vector x: x ∈ ℝ^d, the feature space. For d = 1, x is a continuous random variable. Class(state)-conditional probability density function: p(x|ωj) expresses the distribution of x depending on the state of nature.

2. Continuous Features. Bayes formula: P(ωj|x) = p(x|ωj) P(ωj) / p(x) (the posterior), where the evidence is p(x) = Σj p(x|ωj) P(ωj). Bayes decision rule (for c = 2): Decide ω1 if P(ω1|x) > P(ω2|x); otherwise decide ω2. Expressed in terms of likelihoods and priors: Decide ω1 if p(x|ω1) P(ω1) > p(x|ω2) P(ω2); otherwise decide ω2.
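The Bayes formula and decision rule can be illustrated with one continuous feature and Gaussian class-conditional densities; all means, standard deviations and priors below are assumptions for illustration:

```python
# Bayes decision rule with one continuous feature: Gaussian
# class-conditional densities p(x|w_j) combined with priors P(w_j).
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density N(x; mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posteriors(x, params, priors):
    """params: list of (mu, sigma) per class. Returns normalized posteriors."""
    joint = [gaussian_pdf(x, mu, s) * p for (mu, s), p in zip(params, priors)]
    evidence = sum(joint)                # p(x) = sum_j p(x|w_j) P(w_j)
    return [j / evidence for j in joint]

params = [(0.0, 1.0), (3.0, 1.0)]        # hypothetical class-conditional means/stds
priors = [0.5, 0.5]
post = posteriors(2.0, params, priors)
decision = post.index(max(post))         # decide the class with max posterior
print(post, decision)
```

Since x = 2.0 lies closer to the second class mean and the priors are equal, the posterior for ω2 dominates and the rule decides ω2.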

2. Continuous Features. Conditional risk: R(αi|x) = Σj λ(αi|ωj) P(ωj|x). We can minimize our expected loss by selecting the action that minimizes the conditional risk. This Bayes decision procedure provides the optimal performance; the resulting minimum overall risk is the Bayes risk. Two-category classification, with λij = λ(αi|ωj): Decide ω1 if (λ21 − λ11) P(ω1|x) > (λ12 − λ22) P(ω2|x); otherwise decide ω2.
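A minimal sketch of the minimum-risk rule with a hypothetical asymmetric loss matrix; note how an asymmetric loss can flip the decision away from the class with the larger posterior:

```python
# Minimum-risk decision for two categories: compute the conditional risk
# R(a_i|x) = sum_j loss[i][j] * P(w_j|x) and take the action that
# minimizes it. The loss matrix is illustrative.

def min_risk_action(posteriors, loss):
    """loss[i][j] = cost of taking action i when the true state is j."""
    risks = [sum(l_ij * p for l_ij, p in zip(row, posteriors)) for row in loss]
    return risks.index(min(risks)), risks

loss = [[0.0, 10.0],   # deciding w1: free if truly w1, very costly if w2
        [1.0,  0.0]]   # deciding w2: mild cost if truly w1
action, risks = min_risk_action([0.8, 0.2], loss)
print(action, risks)
```

Even though P(ω1|x) = 0.8, the large penalty for mistaking ω2 makes deciding ω2 the lower-risk action.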

Discrete Features. The feature vector x can assume one of m discrete values, so we use probabilities rather than probability densities. Posterior: P(ωj|x) = P(x|ωj) P(ωj) / P(x), with evidence P(x) = Σj P(x|ωj) P(ωj). Bayes decision rule: to minimize the overall risk, select the action αi for which the conditional risk R(αi|x) is minimum.

Discrete Features. Example: independent binary features. Two-category problem; feature vector x = (x1, …, xd)^T where xi ∈ {0, 1}, with the components conditionally independent given the class.
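Under conditional independence the class-conditional probability factorizes over the binary components, so classification reduces to comparing log-likelihoods (plus log-priors, omitted here by assuming equal priors). The per-feature probabilities below are hypothetical:

```python
# Independent binary features, two categories: P(x|w) factorizes over
# components, and the log-likelihood ratio decides the class.
import math

def log_likelihood(x, probs):
    """log P(x|w) for binary features with per-feature P(x_i = 1|w)."""
    return sum(math.log(p if xi == 1 else 1.0 - p) for xi, p in zip(x, probs))

p = [0.8, 0.7, 0.6]   # P(x_i = 1 | w1), hypothetical
q = [0.3, 0.4, 0.5]   # P(x_i = 1 | w2), hypothetical
x = [1, 1, 0]

g = log_likelihood(x, p) - log_likelihood(x, q)  # log-likelihood ratio
print('decide w1' if g > 0 else 'decide w2')
```

The discriminant g is linear in the xi, which is the classic result for independent binary features.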

Bayesian Belief Networks. A belief net represents knowledge about a distribution: statistical dependencies and causal relations among the component variables, obtained e.g. from structural information. Graphical representation: each node holds a variable with its probability P(a); links connect parents (e.g. of C) to children (e.g. of E).

Bayesian Belief Networks. Apply Bayes rule to determine the probability of any configuration of variables in the joint distribution. Discrete case: each variable A takes a discrete number of possible values (e.g. two: a ∈ {a1, a2}) with continuous-valued probabilities. The priors P(a1), P(a2) and the conditional probabilities P(c1|ak), P(c2|ak) are stored in conditional probability tables, each row of which sums to 1 (Σi P(ci|ak) = 1).

Bayesian Belief Networks. Determining the probabilities of the variables in a chain A → B → C → D: by conditional independence the joint factorizes as P(a, b, c, d) = P(a) P(b|a) P(c|b) P(d|c). E.g., the probability distribution over the values d1, d2, … at D is obtained by summing the full joint distribution P(a, b, c, d) over all variables other than d. A simple split of the sums gives P(b) = Σa P(b|a) P(a), then P(c) = Σb P(c|b) P(b), then P(d) = Σc P(d|c) P(c). Simple interpretation: the probability of a particular value of D.
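The summation above can be carried out directly by enumerating the joint distribution; the conditional probability tables below are hypothetical binary-valued examples:

```python
# Marginalizing a chain-structured belief net A -> B -> C -> D:
# P(d) = sum_{a,b,c} P(a) P(b|a) P(c|b) P(d|c). All variables binary.
from itertools import product

P_a = [0.6, 0.4]                        # P(a)
P_b_a = [[0.7, 0.3], [0.2, 0.8]]        # P(b|a): rows indexed by a
P_c_b = [[0.9, 0.1], [0.4, 0.6]]        # P(c|b)
P_d_c = [[0.5, 0.5], [0.1, 0.9]]        # P(d|c)

def marginal_d():
    """Sum the full joint over a, b, c to get the distribution of D."""
    P_d = [0.0, 0.0]
    for a, b, c, d in product(range(2), repeat=4):
        P_d[d] += P_a[a] * P_b_a[a][b] * P_c_b[b][c] * P_d_c[c][d]
    return P_d

P_d = marginal_d()
print(P_d)  # the two entries sum to 1
```

The "simple split" version would push each sum inward (compute P(b), then P(c), then P(d)), giving the same answer with far fewer operations on long chains.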

Bayesian Belief Networks. Given the values of some variables (the evidence e), we search for the most probable configuration of the other variables x.

Bayesian Belief Networks. Example: belief network for fish. As usual: compute P(x1 = salmon) and P(x2 = sea bass), then decide for the minimum expected classification error. Ex. 2, classify the fish. Known: the fish is light (c1) and was caught in the South Atlantic (b2). Unknown: time of year (a) and thickness (d). In this case D does not affect our result.

Bayesian Belief Networks. Example: Belief Network for Fish.

Bayesian Belief Networks. Example: belief network for fish. And if the dependency relations are unknown? Naïve Bayes (also called “idiot Bayes”): assume the features are conditionally independent given the class, so that, after normalization, P(ω|x1, …, xd) ∝ P(ω) Πi P(xi|ω).
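A naïve Bayes sketch in the spirit of the fish example; the priors and conditional probabilities below are hypothetical, not the slides' actual table values:

```python
# Naive Bayes: when the dependency structure is unknown, assume the
# features are conditionally independent given the class, so
# P(w|x) ∝ P(w) * prod_i P(x_i|w), normalized over the classes.

priors = {'salmon': 0.6, 'sea bass': 0.4}            # hypothetical P(w)
cond = {                                             # hypothetical P(x_i|w)
    'salmon':   {'light': 0.3, 'south-atlantic': 0.7},
    'sea bass': {'light': 0.6, 'south-atlantic': 0.2},
}

def naive_bayes_posterior(observed):
    """Posterior over classes given a list of observed feature values."""
    joint = {}
    for cls, prior in priors.items():
        p = prior
        for feat in observed:
            p *= cond[cls][feat]                      # independence assumption
        joint[cls] = p
    z = sum(joint.values())                           # normalization
    return {cls: p / z for cls, p in joint.items()}

post = naive_bayes_posterior(['light', 'south-atlantic'])
print(post)
```

With these made-up numbers the location evidence outweighs the lightness evidence and salmon gets the larger posterior.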

Compound Bayesian Decision Theory. Consecutive ω's are not statistically independent => exploit the dependence => improved performance. Wait for n states to emerge and make all n decisions jointly: the compound decision problem. States of nature ω = (ω(1), …, ω(n))^T, each taking one of the c values {ω1, …, ωc}. Prior P(ω) for the n states of nature. Feature matrix X = (x1, …, xn): n observations, where xi was obtained when the state of nature was ω(i).

Compound Bayesian Decision Theory. Define a loss matrix for the compound decision problem and seek the decision rule that minimizes the compound risk (the optimal procedure). Assumption: correct decisions incur no loss and errors are equally costly => simply calculate P(ω|X) for all ω and select the ω for which P(ω|X) is maximum. In practice, calculating P(ω|X) for every ω is time-expensive. Under the assumption that xi depends only on ω(i), not on the other x's or ω's, the conditional probability density function p(X|ω) for X given the true set of ω factorizes, and the posterior joint density follows from Bayes rule.
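A brute-force sketch of the compound rule under zero-one loss: score every joint sequence ω by its prior times the factorized likelihood p(X|ω) = Πi p(xi|ω(i)) and keep the maximizer. The Markov-style prior coupling consecutive states, and all numbers, are assumptions for illustration:

```python
# Compound Bayesian decision: enumerate all joint state sequences,
# score each by P(w) * prod_i p(x_i|w(i)), pick the maximum-posterior
# sequence. Feasible only for tiny n and c, which is exactly why the
# slides call the exact computation time-expensive.
from itertools import product

states = [0, 1]
P_first = [0.5, 0.5]
P_trans = [[0.9, 0.1], [0.1, 0.9]]      # consecutive states tend to repeat
likelihood = [[0.8, 0.3],               # p(x_1 | state)
              [0.6, 0.5],               # p(x_2 | state)
              [0.2, 0.7]]               # p(x_3 | state)

def best_sequence():
    best, best_score = None, -1.0
    for w in product(states, repeat=3):
        score = P_first[w[0]]
        for i in range(1, 3):
            score *= P_trans[w[i - 1]][w[i]]    # prior over the sequence
        for i, wi in enumerate(w):
            score *= likelihood[i][wi]          # p(x_i | w(i))
        if score > best_score:
            best, best_score = w, score
    return best, best_score

w_star, score = best_sequence()
print(w_star, score)
```

Here the first observation alone favors state 0, but the strong coupling between consecutive states pulls the joint decision to the all-ones sequence, illustrating how exploiting the dependence changes the answer.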

Obrigado! (Thank you!)

Annex

Book: Pattern Cl. Preface. Ch 1: Introduction. Ch 2: Bayesian Decision Theory. Ch 3: Maximum Likelihood and Bayesian Estimation. Ch 4: Nonparametric Techniques. Ch 5: Linear Discriminant Functions. Ch 6: Multilayer Neural Networks. Ch 7: Stochastic Methods. Ch 8: Nonmetric Methods. Ch 9: Algorithm-Independent Machine Learning. Ch 10: Unsupervised Learning and Clustering. App A: Mathematical Foundations.

Book: Principles of... Ch 2: Bug Algorithms. Ch 3: Configuration Space. Ch 4: Potential Functions. Ch 5: Roadmaps. Ch 6: Cell Decompositions. Ch 7: Sampling-Based Algorithms. Ch 8: Kalman Filtering. Ch 9: Bayesian Methods. Ch 10: Robot Dynamics. Ch 11: Trajectory Planning. Ch 12: Nonholonomic and Underactuated Systems.

Book: Artificial... Preface. Part I: Artificial Intelligence. Part II: Problem Solving. Part III: Knowledge and Reasoning. Part IV: Planning. Part V: Uncertain Knowledge and Reasoning. Part VI: Learning. Part VII: Communicating, Perceiving, and Acting. Part VIII: Conclusions.

Book: Bayesian... Preface. Part I (Fundamentals of Bayesian Inference): Background; Single-parameter models; Introduction to multiparameter models; Large-sample inference and frequency properties of Bayesian inference. Part II (Fundamentals of Bayesian Data Analysis): Hierarchical models; Model checking and improvement; Modeling accounting for data collection; Connections and challenges; General advice. Part III (Advanced Computation): Overview of computation; Posterior simulation; Approximations based on posterior modes; Special topics in computation. Part IV (Regression Models): Introduction to regression models; Hierarchical linear models; Generalized linear models; Models for robust inference; Mixture models; Multivariate models; Nonlinear models; Models for missing data; Decision analysis. Appendixes.

Book: Classification... Preface. Foreword. 1. Introduction. 2. Detection and Classification. 3. Parameter Estimation. 4. State Estimation. 5. Supervised Learning. 6. Feature Extraction and Selection. 7. Unsupervised Learning. 8. State Estimation in Practice. 9. Worked Out Examples. Appendix.

Images

2. Simple Example. Designing a simple classifier for gesture recognition: the observer tries to predict which gesture will be performed next, and the sequence of gestures appears to be random. Ten types of gestures: 1. Big circle, 2. Small circle, 3. Vertical line, 4. Horizontal line, 5. Pointing north-west, 6. Pointing west, 7. Talk louder, 8. Talk more quietly, 9. Wave bye-bye, 10. I am hungry. State of nature ω: the type of gesture (ω1 … ω10). Since the gesture lexicon is finite, we assume there is some a priori probability (prior) P(ω1) that the next gesture is ‘Big circle’, P(ω2) that it is ‘Small circle’, and so on.

Missing and noisy features. Missing features, example: x1 is missing and the measured value of x2 is x̂2. Plugging in the mean value of x1 would point to ω3, but ω2 is the better decision once the posterior is marginalized over the missing feature: P(ωi|x̂2) ∝ ∫ p(x1, x̂2|ωi) P(ωi) dx1.
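The difference between plugging in a guessed value for the missing x1 and marginalizing it out can be shown numerically. The two classes, their densities, and the plug-in value below are all hypothetical, chosen (with equal priors and class-wise independent features) so that the plug-in value falls on the sharp x1 peak of one class:

```python
# Missing-feature decision: compare (a) naive plug-in of a guessed x1
# with (b) marginalizing x1 out of the class-conditional density.
import math

def gpdf(x, mu, sigma):
    """Univariate Gaussian density N(x; mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Per-class (mu1, s1, mu2, s2); features independent within each class.
classes = {
    'w2': (0.0, 3.0, 2.0, 0.5),   # broad in x1, sharp around x2 = 2
    'w3': (1.0, 0.3, 2.0, 1.0),   # sharply peaked in x1 near its mean
}
x2_obs = 2.0

# (a) naive plug-in: substitute a guessed value for the missing x1
x1_guess = 1.0
plugin = {c: gpdf(x1_guess, m1, s1) * gpdf(x2_obs, m2, s2)
          for c, (m1, s1, m2, s2) in classes.items()}

# (b) marginalize over x1: its density integrates to 1, so with
# independent features only p(x2|w) remains in the comparison
marginal = {c: gpdf(x2_obs, m2, s2) for c, (m1, s1, m2, s2) in classes.items()}

print(max(plugin, key=plugin.get), max(marginal, key=marginal.get))
```

The plug-in score favors the class whose x1 density happens to peak at the guessed value, while the marginalized posterior correctly ranks the classes using only the observed x2, mirroring the ω3-versus-ω2 situation on the slide.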