An Introduction to Hidden Markov Models and Gesture Recognition Troy L. McDaniel Research Assistant Center for Cognitive Ubiquitous Computing Arizona State University Notation and Algorithms From (Dugad and Desai 1996) Please send your questions, comments, and errata to Troy.McDaniel@asu.edu
The Big Picture We learned about this part. Now let’s take a closer look at how this part works…
Introduction A hidden Markov model can be used to model and recognize any temporal sequence. How? We can train a finite state machine using training data consisting of sequences of symbols. States will represent, e.g., poses for gestures, and transitions between states will have probabilities. (Figure: an HMM for the gesture “goodbye”.)
Applications Speech Recognition, Computational Biology, Computer Vision, Biometrics, Gesture Recognition, and many others… Let’s take a look at gesture recognition in detail…
Gesture Recognition-Training Interact with a computer through gestures Training Create a database of gestures We store the feature vectors of poses that make up each gesture Create a database of poses to increase accuracy Train HMMs for each class of gestures Goodbye Gesture Single Pose
Gesture Recognition-Testing Testing Segmentation – Obtain the user’s hand by identifying skin color pixels. This performs background subtraction. Feature Extraction – Extract features. For example, we can fit ellipsoids around fingers and palm, and use their major axes and angles between them. Pose Recognition – Match feature vectors with those in the pose database to improve recognition. Gesture Recognition – Run gestures through all of the HMMs. The HMM with the highest probability is the recognized gesture.
Gesture Recognition System Overview of system Next, we will learn how HMMs work…
Urns and Marbles Example There are 3 urns, each filled with any number of marbles, and each marble has a certain color, say red, green, or blue. A friend of ours is in a room choosing urns, each time taking out a marble, shouting the color, and putting it back. We’re outside the room and cannot see in! We know the number of urns and the observations (R, R, G, R, B, …). But what is it that we don’t know? RED! He just saw a red!
Urns and Marbles Example-ll The urns are states, each with an initial probability Transition probabilities exist between states The Markovian property Each state represents a distribution of symbols (E.g., red = 25%, green = 25% and blue = 50% for urn 1)
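The urn story can be sketched as a small generative program. This is only an illustration: urn 1 uses the example distribution quoted above (red 25%, green 25%, blue 50%), while the initial probabilities, the transition table, and the other two urns are made-up values, and the names `PI`, `A`, `B`, and `sample_sequence` are hypothetical.

```python
import random

SYMBOLS = ["R", "G", "B"]

# Assumed, illustrative parameters (not taken from the slides, except urn 1's colors).
PI = [0.5, 0.25, 0.25]            # initial probability of starting at each urn
A = [[0.5, 0.3, 0.2],             # A[i][j]: probability of moving from urn i to urn j
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
B = [[0.25, 0.25, 0.50],          # B[i][k]: probability that urn i yields color k
     [0.20, 0.30, 0.50],
     [0.45, 0.05, 0.50]]


def sample_sequence(length, rng):
    """Return (urns, colors): the hidden urn indices and the colors shouted."""
    urns, colors = [], []
    urn = rng.choices(range(3), weights=PI)[0]          # pick the first urn
    for _ in range(length):
        urns.append(urn)
        colors.append(rng.choices(SYMBOLS, weights=B[urn])[0])  # draw a marble
        # Markov property: the next urn depends only on the current urn.
        urn = rng.choices(range(3), weights=A[urn])[0]
    return urns, colors


urns, colors = sample_sequence(5, random.Random(0))
print("what we hear:   ", colors)
print("what stays hidden:", urns)
```

Someone outside the room only ever sees `colors`; the `urns` list is exactly the part that is hidden.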
So What’s An HMM? As we’ve already seen, it is a finite number of states connected by transitions, which can generate an observation sequence depending on its transition, bias, and initial probabilities. It is represented as a set of three sets of probabilities, λ = (A, B, π): the transition probabilities A, the bias (observation symbol) probabilities B, and the initial probabilities π. The Markov model is hidden because we don’t know which state led to each observation. Going from the urn example to more familiar models…
So What’s An HMM?-ll For gesture recognition, a state will represent a pose The distribution for each state will be symbols represented by feature vectors—e.g., the major axes of fingers and palm, and the angles between them. Remember that during training, each gesture, even though it may belong to the same class (goodbye, etc.), will have variations. An HMM can either represent a single object such as a word or gesture, or a collection of objects.
The Algorithms Next, we’re going to cover algorithms for training and testing hidden Markov models. Algorithms include Forward-Backward, Viterbi, K-means, Baum-Welch, and the Kullback-Leibler based distance measure. Each algorithm, once explained, will be mapped to pseudocode.
HMM Structure Pseudocode For the pseudocode, assume that HMMs are objects, containing the constants and data structures below.
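The slide's own listing of constants and data structures did not survive in this transcript. As a stand-in, here is one possible way to hold an HMM as an object. The class name `HMM`, its field names, and the `is_valid` helper are assumptions made for illustration, not the original pseudocode, and the numbers are the same made-up urn parameters used for illustration elsewhere.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HMM:
    n_states: int           # N: number of states (urns, poses, ...)
    n_symbols: int          # M: number of distinct observation symbols
    pi: List[float]         # pi[i]: probability of starting in state i
    a: List[List[float]]    # a[i][j]: probability of moving from state i to j
    b: List[List[float]]    # b[i][k]: probability of emitting symbol k in state i

    def is_valid(self, tol=1e-9):
        """Check the table shapes and that every distribution sums to 1."""
        shapes_ok = (
            len(self.pi) == self.n_states
            and len(self.a) == self.n_states
            and len(self.b) == self.n_states
            and all(len(row) == self.n_states for row in self.a)
            and all(len(row) == self.n_symbols for row in self.b)
        )
        rows = [self.pi] + self.a + self.b
        return shapes_ok and all(abs(sum(row) - 1.0) <= tol for row in rows)


# Illustrative three-urn model; all values are assumed.
urn_model = HMM(
    n_states=3,
    n_symbols=3,
    pi=[0.5, 0.25, 0.25],
    a=[[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]],
    b=[[0.25, 0.25, 0.50], [0.20, 0.30, 0.50], [0.45, 0.05, 0.50]],
)
print(urn_model.is_valid())
```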
Problem #1 HMM applications are reduced to solving 3 problems. Let’s look at the first one… Problem 1: Given the observation sequence O = O_1, O_2, …, O_T and the model λ = (A, B, π), how do we compute P(O | λ)? Solution: Forward-Backward Algorithm. Why do we care? And when do we use it? What’s the probability of getting B, G, R, B?
Why Do We Care? (Figure: the observation sequence Red, Green, Blue is scored by HMM 1, HMM 2, and HMM 3; the probabilities shown are 98%, 5%, and 50%.)
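Problem 1 matters for recognition because the same observation sequence can be scored by every candidate HMM, and the best-scoring model names the result. The sketch below assumes two tiny, made-up models (`mostly_red` and `mostly_blue`) and computes each likelihood with the forward recursion covered later in the talk; the function names and all numbers are hypothetical.

```python
def likelihood(model, obs):
    """P(O | model) via the forward recursion; model is a (pi, a, b) tuple."""
    pi, a, b = model
    n = len(pi)
    alpha = [pi[i] * b[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [sum(alpha[i] * a[i][j] for i in range(n)) * b[j][symbol]
                 for j in range(n)]
    return sum(alpha)


def recognize(models, obs):
    """Score obs under every model and return (best_name, all_scores)."""
    scores = {name: likelihood(model, obs) for name, model in models.items()}
    return max(scores, key=scores.get), scores


# Two assumed 2-state models over the symbols 0 = R, 1 = G, 2 = B.
uniform = [[0.5, 0.5], [0.5, 0.5]]
models = {
    "mostly_red": ([0.5, 0.5], uniform, [[0.8, 0.1, 0.1], [0.6, 0.2, 0.2]]),
    "mostly_blue": ([0.5, 0.5], uniform, [[0.1, 0.1, 0.8], [0.2, 0.2, 0.6]]),
}

winner, scores = recognize(models, [0, 0, 2, 0])   # R, R, B, R
print(winner, scores)
```

In a gesture system each dictionary entry would be the HMM trained for one gesture class, and `obs` would be the sequence of recognized pose symbols.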
But First, the Brute Force Approach Let’s look at the brute force approach first. We can find this probability by finding the probability of O for a fixed state sequence times the probability of getting that state sequence. But we do this for every possible state sequence and sum the results… With N^T possible state sequences, it’s not practical. (Figure: for Blue, Green, Red, Blue, each of the four time steps can come from Urn 1, Urn 2, or Urn 3, giving N^T paths.)
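To make the cost concrete, here is a minimal brute-force sketch that enumerates all N^T state sequences. The model values are assumed for illustration (symbols are indexed 0 = R, 1 = G, 2 = B), and `brute_force_likelihood` is a hypothetical name.

```python
from itertools import product

# Assumed, illustrative three-urn parameters.
PI = [0.5, 0.25, 0.25]
A = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
B = [[0.25, 0.25, 0.50], [0.20, 0.30, 0.50], [0.45, 0.05, 0.50]]


def brute_force_likelihood(pi, a, b, obs):
    """Sum P(O, I | model) over every possible state sequence I."""
    n = len(pi)
    total = 0.0
    for states in product(range(n), repeat=len(obs)):   # N^T sequences
        # Probability of this state sequence and of O along it.
        p = pi[states[0]] * b[states[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= a[states[t - 1]][states[t]] * b[states[t]][obs[t]]
        total += p
    return total


p_bgrb = brute_force_likelihood(PI, A, B, [2, 1, 0, 2])   # B, G, R, B
print(p_bgrb)
```

With 3 urns and 4 observations the loop already visits 3^4 = 81 sequences, and the count grows exponentially with the length of the sequence.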
Forward Algorithm A more practical approach: the Forward Algorithm. The forward variable α_t(i) = P(O_1, O_2, …, O_t, i_t = i | λ): the probability of the partial observation sequence up to time t and state i at time t. It is an inductive algorithm, shown next… What’s the probability of getting B, G, R, B, and ending at urn 2?
Forward Algorithm-ll On the order of N^2·T multiplications!
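The slide's listing is not preserved in this transcript, so the following is a hedged sketch of the standard forward recursion rather than the original pseudocode. The parameters are assumed values (symbols indexed 0 = R, 1 = G, 2 = B) and are not claimed to be the ones behind the worked example in the talk.

```python
# Assumed, illustrative three-urn parameters.
PI = [0.5, 0.25, 0.25]
A = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
B = [[0.25, 0.25, 0.50], [0.20, 0.30, 0.50], [0.45, 0.05, 0.50]]


def forward(pi, a, b, obs):
    """Return (alpha, likelihood); alpha[t][i] covers O_1..O_(t+1) ending in state i."""
    n = len(pi)
    # Initialization: start in state i and emit the first symbol.
    alpha = [[pi[i] * b[i][obs[0]] for i in range(n)]]
    # Induction: reach state j from any state i, then emit the next symbol.
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([sum(prev[i] * a[i][j] for i in range(n)) * b[j][obs[t]]
                      for j in range(n)])
    # Termination: add up the last column.
    return alpha, sum(alpha[-1])


alpha, p = forward(PI, A, B, [0, 1, 2])   # R, G, B
print(alpha[0], p)
```

Each time step does N sums of N products, which is where the N^2·T count comes from.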
Forward Algorithm Example What’s the probability of R, G, B?
State   Time 1   Time 2   Time 3
1       0.125    0.0192   0.0071
2       0.05     0.0351   0.0149
3       0.1125   0.0047   0.0075
Just add up the circled values (the Time 3 column)… It’s 2.95%!
Backward Algorithm Next is the Backward Algorithm. The backward variable β_t(i): the probability of the partial observation sequence O_(t+1), O_(t+2), …, O_T, given an HMM and state i at time t. It is similar to the forward variable, but with important distinctions. These differences allow us to break a sequence in half and attack it from both ends: reduced run time, and it allows for novel algorithms.
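A minimal sketch of the standard backward recursion follows, under the same kind of assumptions as before: the parameters are made-up values (symbols indexed 0 = R, 1 = G, 2 = B), and the function names are hypothetical.

```python
# Assumed, illustrative three-urn parameters.
PI = [0.5, 0.25, 0.25]
A = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
B = [[0.25, 0.25, 0.50], [0.20, 0.30, 0.50], [0.45, 0.05, 0.50]]


def backward(a, b, obs):
    """Return beta, where beta[t][i] is the probability of the symbols after step t given state i."""
    n = len(a)
    beta = [[1.0] * n]                      # initialization at the final time step
    for t in range(len(obs) - 2, -1, -1):   # walk from the end toward the start
        nxt = beta[0]
        beta.insert(0, [sum(a[i][j] * b[j][obs[t + 1]] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def likelihood_from_backward(pi, b, obs, beta):
    """P(O | model): start in state i, emit the first symbol, then use beta."""
    return sum(pi[i] * b[i][obs[0]] * beta[0][i] for i in range(len(pi)))


beta = backward(A, B, [0, 1, 2])            # R, G, B
p = likelihood_from_backward(PI, B, [0, 1, 2], beta)
print(beta, p)
```

The final sum has the same shape as the arithmetic in the backward example: initial probability times the probability of the first symbol times the first-column backward value, added over the states.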
Backward Algorithm Example What’s the probability of R, G, B?
State   Time 1   Time 2   Time 3
1       0.12     0.5      1
2       0.1125   0.5      1
3       0.0788   0.5      1
0.5*0.25*0.12 + 0.25*0.2*0.1125 + 0.25*0.45*0.0788 = 2.9%
Problem #2 Problem 2: Given the observation sequence O and the model λ, find a state sequence I such that the probability of the observation sequence O occurring is greater than for any other state sequence. I.e., find a state sequence such that P(O, I | λ) is maximized. Solution: Viterbi Algorithm. Why do we care? And when do we use it? What sequence of urns will give us the best chance of getting B, G, B? (Figure: urns 1, 3, 2.)
Why Do We Care? A particular state sequence within a hidden Markov model can correspond to a certain object, such as the word ‘hello’, which is made up of phonemes represented as states. … This highest-probability sequence may correspond to a particular word or gesture, for example.
Viterbi Algorithm (Figure: a transition from state 1 to state 2 with a_12 = 0.6 and b_2(O_t) = 0.3.) Cost is -ln(a_ij b_j(O_t)) = -ln(0.6*0.3) = 1.71. As these probabilities increase… …the cost decreases!
Viterbi Algorithm-ll So, it all comes down to finding the path with the minimum cost! A low probability = a large cost. A high probability = a small cost. (Figure: High Probability -> Low Cost; Low Probability -> High Cost.)
Viterbi Algorithm-lll (and on the order of N^2·T multiplications!)
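The slide's listing is missing from this transcript, so here is a hedged sketch of the minimum-cost formulation described above, using -ln of each probability as the cost. The parameters are assumed values (symbols indexed 0 = R, 1 = G, 2 = B, states indexed from 0), the function name is hypothetical, and the sketch assumes every probability is strictly positive so that the logarithm is defined.

```python
import math

# Assumed, illustrative three-urn parameters (all strictly positive).
PI = [0.5, 0.25, 0.25]
A = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
B = [[0.25, 0.25, 0.50], [0.20, 0.30, 0.50], [0.45, 0.05, 0.50]]


def viterbi(pi, a, b, obs):
    """Return (path, cost): the minimum-cost state sequence and its total cost."""
    n = len(pi)
    # Accumulated cost of the cheapest path ending in each state at time 1.
    cost = [[-math.log(pi[i] * b[i][obs[0]]) for i in range(n)]]
    back = [[None] * n]                     # best predecessor of each state
    for t in range(1, len(obs)):
        row_cost, row_back = [], []
        for j in range(n):
            # Cheapest way to arrive at state j from any previous state i.
            best_i = min(range(n), key=lambda i: cost[-1][i] - math.log(a[i][j]))
            row_cost.append(cost[-1][best_i] - math.log(a[best_i][j] * b[j][obs[t]]))
            row_back.append(best_i)
        cost.append(row_cost)
        back.append(row_back)
    # Pick the cheapest final state and trace the predecessors backward.
    last = min(range(n), key=lambda i: cost[-1][i])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, cost[-1][last]


path, total_cost = viterbi(PI, A, B, [0, 2])   # R, B
print(path, total_cost, math.exp(-total_cost))
```

The `cost` and `back` lists play the roles of an accumulated-cost table and a predecessor table; `math.exp(-total_cost)` converts the winning cost back into the probability P(O, I | λ).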
Viterbi Algorithm Example What’s the best path for R, B?
aTable
State   Time 1   Time 2
1       2.0794   3.9278
2       2.9957   3.2833
3       2.1848   3.5711
sTable
State   Time 1   Time 2
1       0        3
2       0        1
3       0        3
Take the minimum value here (3.2833 in aTable), match it with this entry (1 in sTable), trace backward, and we get a path of 1,2.
Problem #3 How can we maximize the probability of getting B, G, B? Or maximize the probability of R, B, G and its best state sequence 1, 3, 2?
K-means Algorithm Training Data -> K-means Trainer -> Trained HMM
K-means Algorithm Example Let the colors of marbles in our urns take on decimal values, and be a function of R, G, B. (Figure panels: Points in R, G, B space; Initial means; Classify; Calculate new means; Re-classify; HMM Generation.)
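The classify / calculate new means / re-classify loop can be sketched as plain k-means clustering. This covers only the clustering step: turning the clusters into HMM probabilities (the "HMM Generation" step) is not shown, and the points, starting means, and function names are all made up for illustration.

```python
def nearest(point, means):
    """Index of the mean closest to point (squared Euclidean distance)."""
    return min(range(len(means)),
               key=lambda k: sum((x - m) ** 2 for x, m in zip(point, means[k])))


def kmeans(points, means, max_iter=20):
    """Alternate classification and mean updates until the means stop moving."""
    means = [tuple(m) for m in means]
    for _ in range(max_iter):
        labels = [nearest(p, means) for p in points]            # classify
        new_means = []
        for k in range(len(means)):                             # calculate new means
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_means.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_means.append(means[k])                      # keep an empty cluster's mean
        if new_means == means:
            break
        means = new_means
    return means, [nearest(p, means) for p in points]           # re-classify


# Assumed marble colors as points in (R, G, B) space.
points = [(0.9, 0.1, 0.1), (0.8, 0.2, 0.1),
          (0.1, 0.9, 0.1), (0.2, 0.8, 0.2),
          (0.1, 0.1, 0.9), (0.2, 0.1, 0.8)]
means, labels = kmeans(points, [points[0], points[2], points[4]])
print(means, labels)
```

Each cluster would then stand for one state, with the assigned points supplying the counts from which that state's probabilities could be estimated.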
References
R. Dugad and U. B. Desai, “A Tutorial on Hidden Markov Models,” Published Online, May 1996. See http://vision.ai.uiuc.edu/dugad/.
L. R. Rabiner and B. H. Juang, “An introduction to hidden Markov models,” IEEE ASSP Mag., pp. 4-16, Jun. 1986.