Published by Nelson Eppes. Modified about 1 year ago.

1
Large Vocabulary Unconstrained Handwriting Recognition
J Subrahmonia
Pen Technologies, IBM T J Watson Research Center

2
Pen Technologies
Pen-based interfaces in mobile computing

3
Mathematical Formulation
– H : Handwriting evidence on the basis of which a recognizer will make its decision: H = {h1, h2, h3, h4, …, hm}
– W : Word string from a large vocabulary: W = {w1, w2, w3, w4, …, wn}
– Recognizer : choose the word string that is most probable given the evidence, $\hat{W} = \arg\max_W p(W \mid H)$

4
Mathematical Formulation SOURCE CHANNEL
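The equation on this slide did not survive extraction. A sketch of the standard source-channel (Bayes) decomposition it most likely showed, assuming the usual reading in which the writer's choice of words is the source and the path from word string to feature sequence is the channel:

```latex
\begin{align*}
\hat{W} &= \arg\max_{W} \; p(W \mid H) \\
        &= \arg\max_{W} \; \frac{p(H \mid W)\, p(W)}{p(H)} \\
        &= \arg\max_{W} \; p(H \mid W)\, p(W)
\end{align*}
```

Here $p(W)$ plays the role of the language model, $p(H \mid W)$ the handwriting model, and the $\arg\max$ the search strategy. The denominator $p(H)$ can be dropped because it does not depend on $W$.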

5
Source Channel Model
[Diagram: WRITER -> DIGITIZER -> FEATURE EXTRACTOR -> H -> DECODER, with a CHANNEL label]

6
Source Channel Model
– Handwriting Modeling : HMMs
– Language Modeling
– SEARCH STRATEGY

7
Hidden Markov Models
– Memoryless Model + Add Memory => Markov Model
– Memoryless Model + Hide Something => Mixture Model
– Markov Model + Hide Something => Hidden Markov Model
– Mixture Model + Add Memory => Hidden Markov Model
Alan B Poritz : Hidden Markov Models : A Guided Tour, ICASSP 1988

8
Memoryless Model
– COIN : Heads (1) : probability p; Tails (0) : probability 1-p
– Flip the coin 10 times (IID random sequence)
– Sequence 1010001111 : Probability = p*(1-p)*p*(1-p)*(1-p)*(1-p)*p*p*p*p = p^6*(1-p)^4
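A minimal Python sketch of the memoryless computation. Reading the ten factors of the product on this slide in order gives the outcome string 1010001111 (six heads, four tails). The function name and the value of p used below are illustrative assumptions.

```python
def iid_sequence_probability(bits, p):
    """Probability of a 0/1 string under independent flips with P(1) = p."""
    prob = 1.0
    for bit in bits:
        # Each flip contributes p for a head (1) and 1 - p for a tail (0),
        # regardless of what came before: the model has no memory.
        prob *= p if bit == "1" else (1.0 - p)
    return prob


# Outcome string read off the product on the slide; p = 0.3 is an arbitrary choice.
sequence = "1010001111"
p = 0.3
print(iid_sequence_probability(sequence, p))
print(p ** 6 * (1 - p) ** 4)  # same value: only the counts of 1s and 0s matter
```

Because the flips are independent, any reordering of the same six heads and four tails has the same probability.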

9
Add Memory – Markov Model
– 2 Coins : COIN 1 => p(1) = 0.9, p(0) = 0.1; COIN 2 => p(1) = 0.1, p(0) = 0.9
– Experiment : Flip COIN 1 and note the outcome. After each flip: if the outcome is Head, flip COIN 1 next; else flip COIN 2 next.
– Sequence 1100 : Probability = 0.9*0.9*0.1*0.9
– Sequence 1010 : Probability = 0.9*0.1*0.1*0.1
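The two-coin experiment can be sketched in Python as follows. The coin probabilities are the ones on the slide; the function name is an illustrative assumption. Working backwards from the first product on the slide (0.9*0.9*0.1*0.9), the only outcome string that produces it under these rules is 1100.

```python
# Probability of a head (1) for each coin, as given on the slide.
P_HEAD = {1: 0.9, 2: 0.1}


def markov_sequence_probability(bits):
    """Probability of a 0/1 string when the previous outcome selects the next coin."""
    coin = 1  # the experiment starts by flipping COIN 1
    prob = 1.0
    for bit in bits:
        p_head = P_HEAD[coin]
        prob *= p_head if bit == "1" else (1.0 - p_head)
        # The outcome is the state: a head selects COIN 1, a tail selects COIN 2.
        coin = 1 if bit == "1" else 2
    return prob


print(markov_sequence_probability("1100"))  # 0.9 * 0.9 * 0.1 * 0.9
print(markov_sequence_probability("1010"))  # 0.9 * 0.1 * 0.1 * 0.1
```

Unlike the memoryless case, the order of outcomes now matters: both strings contain two heads and two tails but have different probabilities.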

10
State Sequence Representation
[Diagram: state graph with arc probabilities (only the value 0.9 survives). Observed Output Sequence => Unique State Sequence]

11
Hide the states => Hidden Markov Model
[Diagram: state graph; only the state labels survive]

12
Why use Hidden Markov Models Instead of Non-hidden?
– Hidden Markov Models can be smaller: fewer parameters to estimate
– States may be truly hidden
  – Position of the hand
  – Positions of articulators

13
Summary of HMM Basics
– We are interested in assigning probabilities p(H) to feature sequences
– Memoryless model: this model has no memory of the past
– Markov noticed that in some sequences the future depends on the past. He introduced the concept of a STATE: an equivalence class of the past that influences the future
– Hide the states : HMM

14
Hidden Markov Models
Given an observed sequence H:
– Compute p(H) for decoding
– Find the most likely state sequence for a given Markov model (Viterbi algorithm)
– Estimate the parameters of the Markov source (training)

15
Compute p(H)
[Diagram: example HMM; arcs carry a transition probability and output probabilities p(a), p(b)]

16
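Compute p(H) – contd.
Compute p(H) where H = a a b b
Enumerate all ways of producing h1 = a
[Diagram: paths over states s1, s2, s3; each arc is labeled transition probability x output probability, e.g. 0.5 x …]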

17
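Compute p(H) – contd.
Enumerate all ways of producing h1 = a, h2 = a
[Diagram: the paths over states s1, s2, s3 extended by a second a; arcs are labeled transition probability x output probability, e.g. 0.5 x …, 0.4 x …]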

18
Compute p(H)
Can save computation by combining paths
[Diagram: paths over s1, s2, s3 that reach the same state are merged]

19
Compute p(H)
Trellis Diagram
[Diagram: trellis with rows s1, s2, s3 and columns 0, a, aa, aab, aabb. Arc labels: .5x.8, .5x.2, .4x.5, .3x.7, .3x.3, .5x.3, .5x.7, .2, .1]

20
Basic Recursion
– Prob(Node) = sum over predecessors of Prob(predecessor) x Prob(predecessor -> node)
– Boundary condition : Prob(s1, 0) = 1
[Diagram: trellis for aabb with rows s1, s2, s3 and columns 0, a, aa, aab, aabb. The start node has probability 1.0. Each arc is annotated "predecessor, symbol : contribution", where symbol 0 marks a transition that produces no output, e.g. s1, a : 0.4; s1, 0 : 0.2; s2, 0 : 0.02]
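The recursion can be sketched in Python. The model below is a reconstruction, not a surviving figure: it assumes the arc labels on the trellis slide (.5x.8, .5x.2, .4x.5, .3x.7, .3x.3, .5x.3, .5x.7, .2, .1) attach to the arcs as listed in the comments. That assignment reproduces the contributions that survive on this slide, such as 0.4, .21, .033, .0495, .0637 and .0189. All identifiers are illustrative.

```python
STATES = ["s1", "s2", "s3"]

# Emitting arcs: (source, destination, transition probability, output probabilities).
# Assumed reconstruction of the example model from the surviving trellis labels.
EMITTING = [
    ("s1", "s1", 0.5, {"a": 0.8, "b": 0.2}),
    ("s1", "s2", 0.3, {"a": 0.7, "b": 0.3}),
    ("s2", "s2", 0.4, {"a": 0.5, "b": 0.5}),
    ("s2", "s3", 0.5, {"a": 0.3, "b": 0.7}),
]

# Null arcs consume no output symbol: (source, destination, probability).
# Listed in topological order so s1 -> s2 is applied before s2 -> s3.
NULL = [("s1", "s2", 0.2), ("s2", "s3", 0.1)]


def apply_null(column):
    """Propagate probability along null arcs within one trellis column."""
    for src, dst, p in NULL:
        column[dst] += column[src] * p


def forward(observations):
    """Return the trellis: one {state: probability} column per output prefix."""
    column = {s: 0.0 for s in STATES}
    column["s1"] = 1.0  # boundary condition: all probability starts in s1
    apply_null(column)
    trellis = [column]
    for symbol in observations:
        nxt = {s: 0.0 for s in STATES}
        # Prob(node) = sum of Prob(predecessor) x Prob(predecessor -> node)
        for src, dst, p, out in EMITTING:
            nxt[dst] += column[src] * p * out[symbol]
        apply_null(nxt)
        trellis.append(nxt)
        column = nxt
    return trellis


for prefix_len, col in enumerate(forward("aabb")):
    print(prefix_len, {s: round(v, 6) for s, v in col.items()})
```

The last column holds the probability of producing the whole string and ending in each state. Whether p(H) is read from s3 alone or summed over all states depends on whether the model designates a final state, which the surviving text does not say.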

21
More Formally – Forward Algorithm
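The formal statement on this slide was lost in extraction. One way to write the forward recursion for a model whose transitions carry the outputs, offered as a sketch: the symbols $\alpha_j(s)$, $L(t)$, $E(s)$ and $N(s)$ are notation assumed here, not recovered from the slide.

```latex
\begin{align*}
\alpha_0(s) &= \mathbf{1}[s = s_1] + \sum_{t \in N(s)} \alpha_0\bigl(L(t)\bigr)\, p(t) \\
\alpha_j(s) &= \sum_{t \in E(s)} \alpha_{j-1}\bigl(L(t)\bigr)\, p(t)\, p(h_j \mid t)
             + \sum_{t \in N(s)} \alpha_j\bigl(L(t)\bigr)\, p(t),
             \qquad j = 1, \dots, m
\end{align*}
```

Here $\alpha_j(s)$ is the probability of producing $h_1, \dots, h_j$ and being in state $s$, $E(s)$ is the set of output-producing transitions into $s$, $N(s)$ is the set of null transitions into $s$, $L(t)$ is the source state of transition $t$, $p(t)$ is its transition probability, and $p(h \mid t)$ is its output probability. The recursion is well defined when the null transitions form no cycles.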

22
Find Most Likely Path for aabb - Dynamic Programming or Viterbi
– MaxProb(Node) = max over predecessors of MaxProb(predecessor) x Prob(predecessor -> node)
[Diagram: the same trellis for aabb with rows s1, s2, s3 and columns 0, a, aa, aab, aabb. Each node keeps only its largest incoming contribution, e.g. along the s1 row: s1, a : 0.4; s1, a : .16; s1, b : .016; s1, b : .0016]
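A Python sketch of the Viterbi recursion with back-pointers. As with any reading of this example, the model is an assumed reconstruction from the arc labels on the trellis slide; it agrees with contributions that survive here, such as .03, .0315, .0294 and .00588. All identifiers are illustrative.

```python
STATES = ["s1", "s2", "s3"]

# Assumed reconstruction of the example model (see the trellis arc labels).
# Emitting arcs: (source, destination, transition probability, output probabilities).
EMITTING = [
    ("s1", "s1", 0.5, {"a": 0.8, "b": 0.2}),
    ("s1", "s2", 0.3, {"a": 0.7, "b": 0.3}),
    ("s2", "s2", 0.4, {"a": 0.5, "b": 0.5}),
    ("s2", "s3", 0.5, {"a": 0.3, "b": 0.7}),
]
# Null arcs in topological order: (source, destination, probability).
NULL = [("s1", "s2", 0.2), ("s2", "s3", 0.1)]


def viterbi(observations):
    """Map each trellis node (t, state) to (best path probability, predecessor node)."""
    best = {(0, s): (0.0, None) for s in STATES}
    best[(0, "s1")] = (1.0, None)

    def relax(node, prob, pred):
        # Keep the single best way of reaching this node (max instead of sum).
        if prob > best[node][0]:
            best[node] = (prob, pred)

    def apply_null(t):
        for src, dst, p in NULL:
            relax((t, dst), best[(t, src)][0] * p, (t, src))

    apply_null(0)
    for t, symbol in enumerate(observations, start=1):
        for s in STATES:
            best[(t, s)] = (0.0, None)
        for src, dst, p, out in EMITTING:
            relax((t, dst), best[(t - 1, src)][0] * p * out[symbol], (t - 1, src))
        apply_null(t)
    return best


def backtrace(best, node):
    """Follow predecessor links from a node back to the start of the trellis."""
    path = []
    while node is not None:
        path.append(node)
        node = best[node][1]
    return path[::-1]


best = viterbi("aabb")
end = (4, "s3")
print(best[end][0])
print(backtrace(best, end))
```

Under these assumptions the best path ending in s3 stays in s1 for the first a, moves to s2 on the second a, stays in s2 for the first b, and moves to s3 on the last b.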

23
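Training HMM parameters
[Diagram: example HMM with arc probabilities 1/3 and 1/2 and output probabilities p(a), p(b); the remaining labels did not survive]
H = abaa
p(H) = (value not preserved)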

24
Training HMM parameters
A posteriori probability of path i : $p(\text{path}_i \mid H) = p(H, \text{path}_i) / p(H)$

25
Training HMM parameters

26
Keep on repeating
– 600 iterations : p(H) = (value not preserved)
– Another initial parameter set : p(H) = (value not preserved)

27
Training HMM parameters
– Converges to a local maximum
– There are at least 7 local maxima
– Final solution depends on starting point
– Speed of convergence depends on starting point

28
Training HMM parameters : Forward Backward algorithm
– Improves on the enumerating algorithm by using the trellis
– Reduces the computation from exponential in the length of the output sequence to linear in it
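The saving can be illustrated with a small Python comparison. The two-state model below is hypothetical (its numbers are not from the slides) and has no null transitions, to keep the sketch short. Enumeration visits every one of the n**m state sequences, while the trellis does n*n work per output symbol.

```python
from itertools import product

# Hypothetical two-state model; every transition produces an output symbol.
TRANS = [[0.6, 0.4], [0.3, 0.7]]  # TRANS[s][d] = probability of moving s -> d
OUT = [  # OUT[s][d][symbol] = output probability on the transition s -> d
    [{"a": 0.5, "b": 0.5}, {"a": 0.7, "b": 0.3}],
    [{"a": 0.2, "b": 0.8}, {"a": 0.6, "b": 0.4}],
]


def p_by_enumeration(obs, start=0):
    """Sum the probability of every state sequence: n**m terms."""
    n = len(TRANS)
    total = 0.0
    for path in product(range(n), repeat=len(obs)):
        prob, state = 1.0, start
        for symbol, nxt in zip(obs, path):
            prob *= TRANS[state][nxt] * OUT[state][nxt][symbol]
            state = nxt
        total += prob
    return total


def p_by_trellis(obs, start=0):
    """Combine paths that meet in the same state: about m * n * n multiplications."""
    n = len(TRANS)
    column = [0.0] * n
    column[start] = 1.0
    for symbol in obs:
        column = [
            sum(column[s] * TRANS[s][d] * OUT[s][d][symbol] for s in range(n))
            for d in range(n)
        ]
    return sum(column)


print(p_by_enumeration("abaa"), p_by_trellis("abaa"))
```

Both functions return the same probability; only the amount of work differs.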

29
Forward Backward Algorithm
[Diagram: trellis around output position j]

30
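Forward Backward Algorithm
– Probability that hj is produced by a given transition and the complete output is H = (forward probability at the transition's source state) x (transition probability) x (output probability of hj) x (backward probability at the transition's destination state)
– Forward probability = probability of being in a state and producing the output h1, …, hj-1
– Backward probability = probability of being in a state and producing the output hj+1, …, hm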

31
Forward Backward Algorithm Transition count
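The count formula on this slide was lost. A sketch of how the counts are usually written for transitions that produce an output; the symbols $P_j(t)$, $c(t)$, $L(t)$ and $R(t)$ are notation assumed here, not recovered from the slide.

```latex
\begin{align*}
P_j(t) &= \alpha_{j-1}\bigl(L(t)\bigr)\, p(t)\, p(h_j \mid t)\, \beta_j\bigl(R(t)\bigr) \\
c(t) &= \frac{1}{p(H)} \sum_{j=1}^{m} P_j(t) \\
\hat{p}(t) &= \frac{c(t)}{\sum_{t' :\, L(t') = L(t)} c(t')} \\
\hat{p}(h \mid t) &= \frac{\sum_{j :\, h_j = h} P_j(t)}{\sum_{j=1}^{m} P_j(t)}
\end{align*}
```

Here $\alpha_{j-1}(s)$ is the probability of producing $h_1, \dots, h_{j-1}$ and being in state $s$, $\beta_j(s)$ is the probability of producing $h_{j+1}, \dots, h_m$ starting from state $s$, and $L(t)$, $R(t)$ are the source and destination states of transition $t$. The re-estimate $\hat{p}(t)$ as written assumes every transition leaving $L(t)$ produces an output; null transitions need their own count term.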

32
Training HMM parameters
– Guess initial values for all parameters
– Compute forward and backward pass probabilities
– Compute counts
– Re-estimate probabilities
Also known as : BAUM-WELCH, BAUM-EAGON, FORWARD-BACKWARD, E-M
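The four steps can be sketched in Python for a small model. This is an illustration under stated assumptions, not the deck's example: the two-state starting parameters are invented, every transition produces an output (no null transitions), and any state may end the sequence. Only the training string abaa is taken from the slides.

```python
def forward_backward(obs, trans, out, start=0):
    """Return forward table, backward table and p(H) for a transition-output HMM."""
    n, m = len(trans), len(obs)
    alpha = [[0.0] * n for _ in range(m + 1)]
    beta = [[0.0] * n for _ in range(m + 1)]
    alpha[0][start] = 1.0
    for j in range(1, m + 1):
        for s in range(n):
            for d in range(n):
                alpha[j][d] += alpha[j - 1][s] * trans[s][d] * out[s][d][obs[j - 1]]
    beta[m] = [1.0] * n  # assumption: the sequence may end in any state
    for j in range(m - 1, -1, -1):
        for s in range(n):
            for d in range(n):
                beta[j][s] += trans[s][d] * out[s][d][obs[j]] * beta[j + 1][d]
    return alpha, beta, sum(alpha[m])


def reestimate(obs, trans, out, start=0):
    """One training iteration: compute expected counts, then normalize them."""
    n = len(trans)
    alpha, beta, p_h = forward_backward(obs, trans, out, start)
    # counts[s][d][symbol] = expected number of times s -> d produced symbol
    counts = [[{sym: 0.0 for sym in out[s][d]} for d in range(n)] for s in range(n)]
    for j, sym in enumerate(obs, start=1):
        for s in range(n):
            for d in range(n):
                counts[s][d][sym] += (
                    alpha[j - 1][s] * trans[s][d] * out[s][d][sym] * beta[j][d] / p_h
                )
    new_trans = [row[:] for row in trans]
    new_out = [[dict(out[s][d]) for d in range(n)] for s in range(n)]
    for s in range(n):
        state_total = sum(sum(counts[s][d].values()) for d in range(n))
        if state_total == 0.0:
            continue  # state never used: keep its old parameters
        for d in range(n):
            arc_total = sum(counts[s][d].values())
            new_trans[s][d] = arc_total / state_total
            if arc_total > 0.0:
                new_out[s][d] = {sym: c / arc_total for sym, c in counts[s][d].items()}
    return new_trans, new_out, p_h


# Invented initial guess for a two-state model.
trans = [[0.6, 0.4], [0.3, 0.7]]
out = [
    [{"a": 0.5, "b": 0.5}, {"a": 0.7, "b": 0.3}],
    [{"a": 0.2, "b": 0.8}, {"a": 0.6, "b": 0.4}],
]
for iteration in range(10):
    trans, out, p_h = reestimate("abaa", trans, out)
    print(iteration, p_h)  # p(H) under the parameters used in this iteration
```

Each iteration should leave p(H) no smaller than before, which is the property that makes the procedure converge to a local maximum; a different initial guess can end at a different one.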
