Cognitive Computer Vision Kingsley Sage and Hilary Buxton Prepared under ECVision Specific Action 8-3


Lecture 4: A family of graphical models
We will see some example models:
– Mixture models
– Factor analysis
– Hidden Markov Models
– Dynamic Bayesian Networks
– Coupled Hidden Markov Models
Inference and Learning

So why are graphical models relevant to Cognitive CV?
– Precisely because they allow us to see different methods as instances of a broader probabilistic framework
– These methods are the basis for our model of perception guided by expectation
– We can put our model of expectation on a solid theoretical foundation
– We can develop well-founded methods of learning rather than being stuck with hand-coded models

Reminder from the previous lecture …
– A probabilistic graphical model is a type of probabilistic network that has roots in AI, statistics and neural networks
– It provides a clean mathematical formalism that makes it possible to understand the relationships between a wide variety of network-based approaches to computation
– It allows us to see different methods as instances of a broader probabilistic framework

A taxonomy of graphical models [figure © Julie Vogel]

Notation - reminder
– Squares denote discrete nodes
– Circles denote continuous-valued nodes
– Clear denotes a hidden node
– Shaded denotes an observed node
[Example graph with nodes A, B and C]

Mixture model as a graphical model
– Each data point Y is drawn from one of a fixed set of classes, but the class label for each data point is 'missing'
– X = class label x (hidden)
– Y = observed data point
– P(X=x, Y=y) = P(X=x) · P(Y=y | X=x)
– The learning problem is to find P(Y|X)
– The inference problem is P(X=x | Y=y)
Let's try to put this in a vision context …
[Graph: hidden node X with an arrow to observed node Y]

Mixture model as a graphical model
– Let's say our class label X has 4 possible values (wall, bush, sign, road)
– Each pixel in the image has a grey level y
– But we don't know which class each pixel belongs to
– P(X=x, Y=y) = P(X=x) · P(Y=y | X=x)
– The learning problem is to find P(Y|X)
– The inference problem is P(X=x | Y=y)
– We haven't said how to learn P(Y|X) yet

Mixture model as a graphical model
– But we might have a specification for P(Y|X)
[Plot: p(y|x) against grey level y, with one curve each for x = sign, x = bush, x = road and x = wall]

Mixture model as a graphical model
How is this a generative model? From lecture 2:
– Estimate P(X) somehow
– Calculate P(X,Y) over the training data (the set of pixels in our image) using P(X,Y) = P(Y|X) · P(X) from the graphical model
– P(Y|X) is the previous slide – we didn't say how we learned it
– Calculate P(X|Y) using Bayes' Rule. This would assign each pixel to a class based on its grey level (classifier mode)
To use the model generatively, use P(X) and sample from P(X,Y), although this would produce a set of pixels without any spatial structure! (see the sketch below)
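As a minimal sketch of the classifier and generative modes, assume each class-conditional density P(y|x) is a Gaussian over grey level; the means, variances and prior used here are illustrative placeholders, not values from the lecture.

```python
import numpy as np

# Hypothetical class-conditional densities P(y|x): one Gaussian per class.
# Means, standard deviations and the prior P(X) are made up for illustration.
classes = ["wall", "bush", "sign", "road"]
means   = np.array([ 60.0, 100.0, 180.0, 230.0])   # grey-level means
stds    = np.array([ 15.0,  20.0,  10.0,  12.0])   # grey-level std devs
prior   = np.array([0.25, 0.25, 0.25, 0.25])       # P(X=x)

def gaussian(y, mu, sigma):
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def posterior(y):
    """P(X=x | Y=y) via Bayes' Rule: P(x|y) proportional to P(y|x) P(x)."""
    joint = gaussian(y, means, stds) * prior        # P(y|x) P(x) for every class
    return joint / joint.sum()

# Classifier mode: assign a pixel with grey level 175 to its most probable class.
p = posterior(175.0)
print(dict(zip(classes, p.round(3))), "->", classes[int(np.argmax(p))])

# Generative mode: sample (x, y) pairs from P(X, Y) = P(X) P(Y|X).
rng = np.random.default_rng(0)
x = rng.choice(len(classes), p=prior)
y = rng.normal(means[x], stds[x])
```

Replacing the hand-picked P(y|x) with densities learned from data is exactly the learning problem addressed later under inference and learning.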

Factor analysis as a graphical model
In fact, this same two-node model with the underlying variable X continuous (e.g. a Gaussian or normal distribution) rather than discrete is actually factor analysis

Factor analysis as a graphical model
[Figure: p(y|x) as a linear mapping from x to y with weights w]
Factor analysis represents a high-dimensional vector Y as a linear combination of low-dimensional features X (a sketch follows below)
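For a concrete feel, here is a small sketch of the factor-analysis generative model y = Wx + mu + noise, with made-up dimensions and parameters; the loadings W are assumed known here rather than learned.

```python
import numpy as np

# A minimal factor-analysis sketch (illustrative dimensions and parameters).
# High-dimensional observation Y (dim 10) as a linear combination of
# low-dimensional hidden factors X (dim 3) plus independent noise:
#   y = W x + mu + eps,  x ~ N(0, I),  eps ~ N(0, Psi) with Psi diagonal.
rng = np.random.default_rng(0)
d_obs, d_hid, n = 10, 3, 5000

W   = rng.normal(size=(d_obs, d_hid))        # factor loadings (assumed known here)
mu  = rng.normal(size=d_obs)                 # mean of Y
psi = 0.1 * np.ones(d_obs)                   # diagonal noise variances

X = rng.normal(size=(n, d_hid))                                # hidden factors
Y = X @ W.T + mu + rng.normal(size=(n, d_obs)) * np.sqrt(psi)  # observations

# The model implies cov(Y) = W W^T + Psi; the sample covariance should be close.
print(np.abs(np.cov(Y.T) - (W @ W.T + np.diag(psi))).max())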

State space models
– State space models "roll out" their structure over time
– The graphical model shows which variable sets dictate how the model parameters will change over time
Examples:
– Hidden Markov Models
– Dynamic Bayesian Networks
– Kalman Filter Models
– Coupled Hidden Markov Models

Hidden Markov Models
[Diagram: a chain of hidden states X1 → X2 → X3 → … → XT, each emitting an observation Y1, Y2, Y3, …, YT]
(a numerical sketch of inference in this model follows below)
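As a concrete sketch of inference in an HMM, the forward algorithm below computes the likelihood P(Y1 … YT) of an observation sequence; the transition matrix, emission matrix and initial distribution are made-up illustrative values.

```python
import numpy as np

# A minimal HMM forward-algorithm sketch with made-up parameters:
# two hidden states, three discrete observation symbols.
A  = np.array([[0.7, 0.3],            # A[i, j] = P(X_t = j | X_{t-1} = i)
               [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1],       # B[i, k] = P(Y_t = k | X_t = i)
               [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])             # initial distribution P(X_1)

def forward(obs):
    """Return P(Y_1..Y_T) by summing the joint over all hidden state paths."""
    alpha = pi * B[:, obs[0]]                 # alpha_1(i) = P(Y_1, X_1 = i)
    for y in obs[1:]:
        alpha = (alpha @ A) * B[:, y]         # predict the next state, then weight by the emission
    return alpha.sum()                        # marginalise over the final hidden state

print(forward([0, 1, 2, 2]))   # likelihood of one toy observation sequence
```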

Dynamic Bayesian Networks
DBNs have more than one (here up to N) underlying model X of behaviour and can switch from model to model over time
[Diagram: observations Y1 … YT emitted by hidden nodes that may be drawn from any of the N alternative models X]

Switching Kalman Filter Models
[Diagram: switch variables U1, U2, U3 select the transition function between continuous hidden states X1, X2, X3; an observation function generates Y1, Y2, Y3]
(a simplified filtering sketch follows below)
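The slide above shows the switching version. As a simpler illustration of the underlying filtering step, here is a minimal one-dimensional, non-switching Kalman filter predict/update with made-up parameters; in the switching model a discrete variable U_t would additionally select which transition/observation function applies at each step.

```python
import numpy as np

# A minimal (non-switching) 1-D Kalman filter sketch with illustrative parameters.
a, q = 1.0, 0.1      # transition:  x_t = a * x_{t-1} + noise, noise variance q
h, r = 1.0, 0.5      # observation: y_t = h * x_t     + noise, noise variance r

def kalman_step(mean, var, y):
    # Predict the next hidden state from the transition function.
    mean_pred, var_pred = a * mean, a * var * a + q
    # Update the prediction with the new observation via the Kalman gain.
    k = var_pred * h / (h * var_pred * h + r)
    mean_new = mean_pred + k * (y - h * mean_pred)
    var_new  = (1 - k * h) * var_pred
    return mean_new, var_new

mean, var = 0.0, 1.0                    # prior belief over the hidden state
for y in [0.9, 1.2, 1.0, 1.3]:          # a toy observation sequence
    mean, var = kalman_step(mean, var, y)
print(mean, var)
```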

Coupled Hidden Markov Models
[Diagram: two coupled hidden chains, one per observation variable (observation variable 1 and observation variable 2)]
(Oliver, Rosario & Pentland, 1999)

Coupled Hidden Markov Models
– For a coupled model, the timescales associated with each underlying hidden model do not need to be the same
– Useful for coupling multi-modal signals such as audio and video
[Diagram as before: two coupled chains with observation variable 1 and observation variable 2]

Inference and Learning
Inference:
– Calculating a probability over a set of nodes given the values of other nodes
– Estimating the value of hidden nodes given the values of the observed nodes
– Computing statistical and information-theoretic quantities such as the fit between model and observed data (likelihood), mutual information, …
Exact and approximate methods exist

Inference and Learning
Learning:
– Learn parameters and/or structure from data
– Maximise the fit between model and observed data
– Fixed model structure: discover the best parameter set
– Learn structure using information criteria ("scoring")

Structure \ Observability | Full observability | Partial observability
Known structure           | Closed form        | Expectation Maximisation (EM)
Unknown structure         | Local search       | Structural EM

(an EM sketch for the known-structure, partial-observability case follows below)
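To make the known-structure, partial-observability cell concrete, here is a minimal EM sketch for a one-dimensional two-class Gaussian mixture, in the spirit of the earlier pixel grey-level example; the data and starting parameters are synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic grey-level data drawn from two hidden classes (labels discarded).
data = np.concatenate([rng.normal(80, 10, 300), rng.normal(180, 15, 200)])

# Initial guesses for the mixture parameters P(X) and P(Y|X).
w  = np.array([0.5, 0.5])          # mixing weights P(X=x)
mu = np.array([60.0, 200.0])       # class means
sd = np.array([20.0, 20.0])        # class standard deviations

def normal_pdf(y, mu, sd):
    return np.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

for _ in range(50):
    # E-step: responsibilities r[n, k] = P(X=k | Y=y_n) under the current parameters.
    r = w * normal_pdf(data[:, None], mu, sd)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters to maximise the expected log-likelihood.
    nk = r.sum(axis=0)
    w  = nk / len(data)
    mu = (r * data[:, None]).sum(axis=0) / nk
    sd = np.sqrt((r * (data[:, None] - mu) ** 2).sum(axis=0) / nk)

print(w.round(2), mu.round(1), sd.round(1))   # should recover the two clusters
```

Each E-step is an inference pass (computing P(X|Y)); each M-step re-estimates P(X) and P(Y|X), so learning and inference are interleaved.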

Summary
– Graphical models can be seen as a family of models that allow different vision problems to be solved
– For example, a mixture model is a directed Bayes net which could be used to classify pixels
– State space models, for example the HMM, show how variable sets determine model parameters over time

Next time …
– Bayes' Rule and Bayesian Networks
– A lot of excellent reference material on graphical models can be found at: