Cognitive Computer Vision 3R400
Kingsley Sage
Room 5C16, Pevensey III
khs20@sussex.ac.uk
Markov Models - Seminar
- Practical issues in computing the Baum-Welch re-estimation formulae
- Choosing the number of hidden nodes
- Generative modelling and stochastic sampling
- Coursework
Computing the BW parameters (1)
Choose λ = (π, A, B) at random (subject to probability constraints):

A (transitions between the N hidden states; each row sums to 1):
         Sunny  Rain  Wet
  Sunny   0.6   0.3   0.1
  Rain    0.1   0.6   0.3
  Wet     0.2   0.2   0.6

B (emissions from the N hidden states to the M observable states; each row sums to 1):
         Red   Green  Blue
  Sunny   0.8   0.1   0.1
  Rain    0.1   0.8   0.1
  Wet     0.2   0.2   0.6

π (initial distribution over the N hidden states; sums to 1):
  Sunny  Rain  Wet
   0.6   0.3   0.1
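A minimal sketch of this random initialisation, assuming Python/NumPy; the function name random_hmm and its arguments are illustrative, not from the slides:

```python
# Sketch: draw lambda = (pi, A, B) at random, subject to the probability
# constraints (each distribution must sum to 1).
import numpy as np

def random_hmm(n_hidden, m_observable, rng=None):
    rng = rng or np.random.default_rng()
    pi = rng.random(n_hidden)
    pi /= pi.sum()                       # initial-state distribution sums to 1
    A = rng.random((n_hidden, n_hidden))
    A /= A.sum(axis=1, keepdims=True)    # each row of A sums to 1
    B = rng.random((n_hidden, m_observable))
    B /= B.sum(axis=1, keepdims=True)    # each row of B sums to 1
    return pi, A, B

# e.g. Sunny/Rain/Wet hidden states, Red/Green/Blue observables
pi, A, B = random_hmm(n_hidden=3, m_observable=3)
```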
Computing the BW parameters (2)
We want to be able to calculate:

  ξ_t(i,j) = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / P(O|λ)

- α_t(i) comes from forwards evaluation
- β_{t+1}(j) comes from backwards evaluation
- Given O, we have initial values for a_ij and b_j(O_{t+1})
- Can calculate P(O|λ), but do we need to?
Computing the BW parameters (2)
Can calculate P(O|λ), but do we need to?
- P(O|λ) is a normalising constant and takes the same value for every ξ_t(i,j) within a single iteration
- We can ignore P(O|λ) if we re-normalise λ = (π, A, B) at the end of the re-estimation
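A sketch of that shortcut in NumPy, assuming the α and β trellises have already been computed as T×N arrays; xi_unnormalised and the variable names are illustrative, not from the slides:

```python
# Sketch: compute xi_t(i,j) up to the constant P(O|lambda), then normalise
# per time step instead of dividing by P(O|lambda).
import numpy as np

def xi_unnormalised(alpha, beta, A, B, obs):
    """xi[t, i, j] proportional to alpha_t(i) * a_ij * b_j(O_{t+1}) * beta_{t+1}(j)."""
    T, N = alpha.shape
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]][None, :] * beta[t + 1][None, :]
        xi[t] /= xi[t].sum()   # renormalise: P(O|lambda) cancels out
    return xi
```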
Computing the BW parameters (3)
Recall the scaling factor SF_t from the previous seminar …
- Intended to prevent arithmetic underflow when calculating the α and β trellises
- Calculate SF_t using the α trellis and apply the same factors to the β trellis
- SF_t for α at t = SF_t for β at t+1 (think why …)
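A sketch of scaled forward/backward evaluation, assuming Python/NumPy. Note that texts differ slightly on which time index of the β trellis each factor lands on; this version divides β_t by the factor computed at t, which keeps α̂_t·β̂_t consistent up to a per-t constant that the per-step renormalisation in the previous sketch absorbs. Names are illustrative:

```python
# Sketch: compute SF_t on the alpha trellis and reuse the same factors on
# the beta trellis, preventing arithmetic underflow for long sequences.
import numpy as np

def scaled_forward_backward(pi, A, B, obs):
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    scale = np.zeros(T)

    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()          # SF_t from the alpha trellis
        alpha[t] /= scale[t]               # rescale to avoid underflow

    beta[T - 1] = 1.0 / scale[T - 1]       # same factors applied to beta
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t]

    # As a bonus, log P(O|lambda) = sum(log(scale)).
    return alpha, beta, scale
```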
Computing the BW parameters (4)
Everything else should now be straightforward …
Except … how to choose the number of hidden nodes N
Choosing N (1)
What is the actual complexity of the underlying task?
- Too many nodes: over-learning and lack of generalisation capability (the model learns precisely only those patterns that occur in O)
- Too few nodes: over-generalisation (the model has not adequately captured the dynamics of the underlying task)
- Same problem as deciding how many hidden nodes there should be in a neural network
Choosing N (2)
[Figure: plot of log likelihood per symbol against N. The curve rises steeply, then flattens past an optimal point; there is little additional performance with increasing N beyond it.]
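A sketch of how such a curve might be produced, sweeping N and comparing per-symbol log likelihood on held-out data; baum_welch, train_obs and held_out_obs are hypothetical stand-ins for a trainer and data not given on the slides:

```python
# Sketch: choose N by training models of increasing size and looking for
# the point where per-symbol log likelihood stops improving.
import numpy as np

def log_likelihood_per_symbol(pi, A, B, obs):
    """log P(O|lambda) / T via a scaled forward pass."""
    T, log_l = len(obs), 0.0
    alpha = pi * B[:, obs[0]]
    for t in range(T):
        if t > 0:
            alpha = (alpha @ A) * B[:, obs[t]]
        c = alpha.sum()        # scale factor SF_t
        log_l += np.log(c)
        alpha /= c
    return log_l / T

for n in range(1, 10):
    pi, A, B = baum_welch(train_obs, n_hidden=n)   # hypothetical trainer
    print(n, log_likelihood_per_symbol(pi, A, B, held_out_obs))
# Pick the N where the curve flattens: larger N risks over-learning.
```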
Generative modelling (1)
OK, so now we know what a (Hidden) Markov Model is, and how to learn its parameters, but how is this all relevant to Cognitive/Computer Vision?
- (H)MMs are generative models
- Perception guided by expectation
- Visual control
- An example visual task …
Generative modelling (2)
Simple case study: a visual task over five regions labelled 1 to 5. [Figure not preserved.]
Training sequence: {3,3,2,2,2,2,5,5,4,4,3,3,1,1,1}
Generative modelling (3)
[Figure: example sequence 1 generated by the HMM; 5 observed states & 14 hidden states.]
Generative modelling (4)
[Figure: example sequence 2 generated by the HMM; 5 observed states & 14 hidden states.]
Stochastic sampling
To generate a sequence from λ = (π, A, B):
- Select the starting state according to the π distribution
- FOR t = 1:T
  - Generate h_t (a 1×N distribution) using A (part of the α trellis computation)
  - Select a state q according to the h_t distribution
  - Generate o_t (a 1×M distribution over output symbols) using q and B
  - Select an output symbol O_t according to the o_t distribution
- END_FOR
One way to realise this recipe in code is sketched below.
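A minimal sketch in NumPy, assuming integer-coded states and symbols; sample_sequence and its arguments are illustrative names:

```python
# Sketch: stochastic sampling from lambda = (pi, A, B), following the
# recipe above. T is the desired sequence length.
import numpy as np

def sample_sequence(pi, A, B, T, rng=None):
    rng = rng or np.random.default_rng()
    N, M = B.shape
    q = rng.choice(N, p=pi)           # starting state drawn from pi
    outputs = []
    for _ in range(T):
        o = rng.choice(M, p=B[q])     # output symbol drawn from row q of B
        outputs.append(o)
        q = rng.choice(N, p=A[q])     # next hidden state drawn from row q of A
    return outputs
```

Here each "generate a distribution, then select from it" step on the slide collapses into a single draw from the relevant row of A or B.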