
Slide 1: Christian VIARD-GAUDIN, Séminaire IRCCyN, 6 January 2000, Institut de Recherche en Communications et Cybernétique de Nantes.

Slide 2: The Image and Video Communications team (Équipe Image et Vidéo Communications). Research themes: Multimedia and Networks, Video and Multimedia, Handwriting and Documents, Psychovisual modelling.

Slide 3: OUTLINE
- A tutorial on Hidden Markov Models
- Application to handwriting recognition

Slide 4: Part One: Hidden Markov Models (HMMs)
Origins and scope: modelling a real-world signal as a parametric random process.
- Late 1960s (Baum, 1967): basic theory.
- Late 1980s: widespread understanding; application to speech recognition.
- Used to model the behaviour of speech, characters, temperature, the stock market, ... by means of a statistical approach.

Slide 5: Hidden Markov Model (1)
An HMM is a doubly stochastic process.
1) An underlying stochastic process generates a sequence of states q_1, q_2, ..., q_t, ..., q_T, where
- t: discrete time, regularly spaced,
- T: length of the sequence,
- q_t ∈ Q = {q_1, q_2, ..., q_N},
- N: the number of possible states.

Slide 6: Markov Chain Hypotheses
1) First order: the probabilistic description is truncated to the current state and its predecessor:
   P[q_t = q_j | q_t-1 = q_i, q_t-2 = q_k, ...] = P[q_t = q_j | q_t-1 = q_i]
2) Stationarity: the probabilities are time invariant:
   P[q_t = q_j | q_t-1 = q_i] = a_ij, for 1 ≤ i, j ≤ N.
This defines a square N x N state transition probability matrix A = {a_ij}, where a_ij ≥ 0 and Σ_j a_ij = 1.
The value of each state is unobservable, but...
3) An initial state distribution π = {π_i} must also be defined: π_i = P[q_1 = q_i] for 1 ≤ i ≤ N, where π_i ≥ 0 and Σ_i π_i = 1.

Slide 7: Hidden Markov Model (2)
An HMM is a doubly stochastic process.
1) An underlying stochastic process generates a sequence of states q_1, q_2, ..., q_t, ..., q_T.
2) Each state q_t emits an observation x_t according to a second stochastic process:
- x_t ∈ X = {x_1, x_2, ..., x_M},
- x_i: a discrete symbol,
- M: the number of symbols.
But...

Slide 8: Observation Hypothesis
1) The observation x_t depends only on the present state q_t:
   P[x_t = x_j | q_t = q_i] = b_ij
This defines an N x M observation probability matrix B = {b_ij}, where b_ij ≥ 0 and Σ_j b_ij = 1.
A complete specification of an HMM λ requires:
- two model parameters, N and M,
- a specification of the symbols to observe,
- three probability measures: π, A, B.
λ = (π, A, B)
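To make the notation concrete, here is a minimal sketch in Python (not part of the original slides) of the triple λ = (π, A, B), with the stochasticity constraints checked explicitly; the matrix values are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

# A discrete HMM is fully specified by lambda = (pi, A, B).
# N = 2 states, M = 2 observable symbols -- illustrative values only.
pi = np.array([0.6, 0.4])              # pi[i]   = P(q_1 = q_i)
A  = np.array([[0.7, 0.3],
               [0.4, 0.6]])            # A[i, j] = P(q_t = q_j | q_t-1 = q_i)
B  = np.array([[0.9, 0.1],
               [0.2, 0.8]])            # B[i, k] = P(x_t = x_k | q_t = q_i)

# Check that the model is well formed: non-negative entries,
# pi sums to 1, and every row of A and B sums to 1.
assert np.all(pi >= 0) and np.isclose(pi.sum(), 1.0)
assert np.all(A >= 0) and np.allclose(A.sum(axis=1), 1.0)
assert np.all(B >= 0) and np.allclose(B.sum(axis=1), 1.0)

N, M = B.shape
print(f"N = {N} states, M = {M} symbols")
```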

Slide 9: Let us play with this model now!
Example 1 (not hidden, just a discrete Markov model): a model of the weather in Nantes.
- N = M = 3
- State = observation: Q = {q_1 = rain, q_2 = cloudy, q_3 = sunny}
- t is sampled every day, at noon for instance
- A_Nantes = {a_ij} is the 3 x 3 transition matrix.

Slide 10: NANTES Weather Model
Given that the weather today (t = 1) is sunny (q_1 = q_3), answer these questions:
1. What will the weather be tomorrow (t = 2)?
2. What is the probability of rain for the day after tomorrow (t = 3)?
3. And for the day after that (t = 4)?
4. What is the probability that the coming week will be "sun-sun-rain-rain-sun-cloudy-sun"?

Slide 11: NANTES Weather Model
Two more questions:
5. What is the probability of rain for d consecutive days (e.g. d = 3)?
6. What is the average number of consecutive sunny days? Cloudy days? Rainy days?

Slide 12: NANTES Weather Model: Answers
1. What will the weather be tomorrow? Read the sunny row of A: it gives the distribution over rain (R), cloudy (C) and sunny (S) for t = 2.
2. What is the probability of rain for the day after tomorrow? Use a trellis over the three states and sum, over the intermediate state at t = 2, the products of the two transition probabilities leading to rain at t = 3.

Slide 13: NANTES Weather Model
3. And for the day after that? Just extend the trellis from the previous values (q_1: rain, q_2: cloudy, q_3: sunny) to obtain P(q_4 = q_1).
4. What is the probability that the coming week will be "sun-sun-rain-rain-sun-cloudy-sun"?
   P(q_3, q_3, q_1, q_1, q_3, q_2, q_3) = π_3 · a_33 · a_31 · a_11 · a_13 · a_32 · a_23, with π_3 = 1 since today is known to be sunny.
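The trellis computations of questions 2 to 4 can be reproduced in a few lines of Python. The transition matrix below is an assumed, illustrative A_Nantes (the actual values shown on the slide are not in this transcript); the logic, propagating the state distribution one day at a time and multiplying transition probabilities along a fixed path, is exactly the one described above.

```python
import numpy as np

# Assumed illustrative transition matrix (rows/cols: rain, cloudy, sunny).
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])
RAIN, CLOUDY, SUNNY = 0, 1, 2

# Today (t = 1) is sunny, so the state distribution is a one-hot vector.
p = np.zeros(3)
p[SUNNY] = 1.0

# Questions 2 and 3: propagate through the trellis one day at a time.
p_t2 = p @ A          # distribution for tomorrow (t = 2)
p_t3 = p_t2 @ A       # day after tomorrow (t = 3)
p_t4 = p_t3 @ A       # and the day after that (t = 4)
print("P(rain at t=3) =", p_t3[RAIN])
print("P(rain at t=4) =", p_t4[RAIN])

# Question 4: probability of the fixed week sun-sun-rain-rain-sun-cloudy-sun,
# given that day 1 is sunny (pi_sunny = 1).
week = [SUNNY, SUNNY, RAIN, RAIN, SUNNY, CLOUDY, SUNNY]
prob = 1.0
for prev, nxt in zip(week[:-1], week[1:]):
    prob *= A[prev, nxt]
print("P(week) =", prob)
```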

Slide 14: NANTES Weather Model
5. What is the probability that rain lasts for exactly d consecutive days (e.g. d = 3)?
   More generally, for any state q_i: P(q_i, q_i, ..., q_i, q_j with j ≠ i) = (a_ii)^(d-1) · (1 - a_ii),
   hence for rain: P(q_1, q_1, q_1, q_j with j ≠ 1) = (a_11)^2 · (1 - a_11).

Slide 15: NANTES Weather Model
6. What is the average number of consecutive sunny days? Cloudy days? Rainy days?
   The expected duration in state q_i is E[d] = 1 / (1 - a_ii).
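The 1/(1 - a_ii) answer follows from the duration distribution of slide 14; the short derivation below is the standard one, added here for completeness rather than reproduced from the slides.

```latex
p_i(d) = (a_{ii})^{d-1}\,(1 - a_{ii})
\qquad\Longrightarrow\qquad
\bar{d}_i = \sum_{d=1}^{\infty} d\, p_i(d)
          = (1 - a_{ii}) \sum_{d=1}^{\infty} d\,(a_{ii})^{d-1}
          = \frac{1 - a_{ii}}{(1 - a_{ii})^{2}}
          = \frac{1}{1 - a_{ii}}
```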

Slide 16: NANTES Weather Model
Topological graphic representation: a graph with the three states q_1 = rain, q_2 = cloudy, q_3 = sunny, whose arcs carry the transition probabilities of A.
This is an ergodic model: from any state, all the other states are reachable.

Slide 17: Example 2: Extension to an HMM
Coin tossing experiment: we do not see how the tossing is performed (one coin, multiple coins, biased coins?). We only have access to the result, a sequence of tosses.
In that case, M = 2 and X = {x_1 = Head, x_2 = Tail}.
Observation sequence: (x_1, x_2, ..., x_T) = (H, H, T, T, T, H, ..., H)
The problems are:
- How do we build a model to explain the observed sequence?
- What are the states?
- How many states?

Slide 18: Coin tossing experiment
Model 1: assume a single biased coin.
- 2 states (N = 2): Q = {q_1 = H, q_2 = T}
- Model topology: the two states H and T, each with a self-loop; the arcs carry P(H) and 1 - P(H).
- Only one parameter is needed, P(H); it defines the matrix A.
Model 2: assume two biased coins.
- 2 states (N = 2): Q = {q_1 = Coin1, q_2 = Coin2}
- 2 different observations (M = 2): X = {x_1 = H, x_2 = T}

Slide 19: Coin tossing experiment (model 2, continued)
Model topology: states Coin1 and Coin2 with self-loop probabilities a_11 and a_22 and cross-transition probabilities 1 - a_11 and 1 - a_22.
State transition probabilities: matrix A. Observation symbol probabilities: matrix B.
4 parameters are required to define this model (A: 2, B: 2).

Slide 20: Coin tossing experiment
Model 3: assume three biased coins.
- 3 states (N = 3): Q = {q_1 = Coin1, q_2 = Coin2, q_3 = Coin3}
- 2 different observations (M = 2): X = {x_1 = H, x_2 = T}
Model topology: a fully connected graph over the three states, with transitions a_11, a_12, a_13, a_21, a_22, a_23, a_31, a_32, a_33.
Observation symbol probabilities: matrix B.
9 independent parameters are required to define this model (A: 6, B: 3).

Slide 21: Coin tossing experiment
Model 3: three biased coins, 3 states (N = 3), Q = {q_1 = Coin1, q_2 = Coin2, q_3 = Coin3}.
Consider the following example:
- State transition probabilities (matrix A): all equal to 1/3,
- Initial state probabilities (vector π): all equal to 1/3,
- Observation probabilities: matrix B.

Slide 22: Coin tossing experiment
Model 3: three biased coins, 3 states (N = 3), Q = {q_1 = Coin1, q_2 = Coin2, q_3 = Coin3}.
1. You observe X = (H, H, H, H, T, H, T, T, T, T). Which state sequence σ most likely generates X? What is the joint probability P(X, σ | λ) of the observation sequence and that state sequence?
2. What is the probability that the observation sequence came entirely from state q_1?

Slide 23: 1. You observe X = (H, H, H, H, T, H, T, T, T, T).
2. What is the probability that the observation sequence came entirely from state q_1?
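A hedged sketch of question 2: the transition and initial probabilities are the 1/3 values of slide 21, while the per-coin head probabilities in B are made-up placeholders (the slide's actual values are not in this transcript). The joint probability of the observation together with the single-state path σ = (q_1, ..., q_1) is the product π_1 · (a_11)^(T-1) · Π_t b_1(x_t).

```python
import numpy as np

H, T_SYM = 0, 1                       # symbol indices: Head, Tail
X = [H, H, H, H, T_SYM, H, T_SYM, T_SYM, T_SYM, T_SYM]

pi = np.full(3, 1.0 / 3.0)            # initial state probabilities (slide 21)
A  = np.full((3, 3), 1.0 / 3.0)       # uniform transitions (slide 21)
# Placeholder head/tail probabilities for the three coins -- assumption only.
B  = np.array([[0.5, 0.5],
               [0.75, 0.25],
               [0.25, 0.75]])

def joint_prob(X, path, pi, A, B):
    """P(X, path | lambda) for a fixed state sequence `path`."""
    p = pi[path[0]] * B[path[0], X[0]]
    for t in range(1, len(X)):
        p *= A[path[t - 1], path[t]] * B[path[t], X[t]]
    return p

all_q1 = [0] * len(X)                 # the path that stays in state q_1
print("P(X, all-q1 | lambda) =", joint_prob(X, all_q1, pi, A, B))
```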

Slide 24: Example of a non-ergodic model (left-right model)
Three emitting states q_1, q_2, q_3, plus one starting state q_s and one final state q_f; q_s and q_f are non-emitting states.
Assume there are 2 symbols to observe: X = {x_1 = a, x_2 = b}.
The model is specified by its initial state probabilities, its state transition probabilities and its observation symbol probabilities, such as P(a | q_1) and P(b | q_3).

Slide 25: The most probable state sequence with this model is q_2, q_3, resulting in the symbol sequence "bb". But this symbol sequence can also be generated by other state sequences, such as q_1, q_2.
Computation of the likelihood of an observation sequence: given X = "aaa", compute the likelihood for this model, P(aaa | λ).
The likelihood P(X | λ) is given by the sum over all possible ways of generating X.

Slide 26: Using HMMs for pattern recognition consists in finding, among a set of K models, the model λ_i that maximizes the likelihood of the observation having been generated by it:
   λ_max = arg max_i P(X | λ_i), for i = 1, ..., K
Character recognition:
- Small lexicon: as many HMM models as words.
- Otherwise, letters are individually modelled by an HMM, and the letter models are concatenated to form word models.
Example: the word model for « PARIS » is the chain of the letter models P, A, R, I, S.

Slide 27: The three basic problems for HMMs
- Problem 1: Recognition. Given X = (x_1, x_2, ..., x_T) and the various models λ_i, how do we efficiently compute P(X | λ)? Solved by the Forward-Backward algorithm.
- Problem 2: Analysis. Given X = (x_1, x_2, ..., x_T) and a model λ, find the optimal state sequence σ: how can we uncover the sequence of states corresponding to a given observation? Solved by the Viterbi algorithm.
- Problem 3: Learning. Given X = (x_1, x_2, ..., x_T), how do we adjust the model parameters λ = (π, A, B) so as to maximize P(X | λ)? Solved by the Baum-Welch algorithm.

Slide 28: Problem 1: Recognition. How to efficiently compute P(X | λ)?
X = x_1 x_2 ... x_t ... x_T is the observation sequence. There exist several paths σ that can produce X, so the joint probability is summed over all of them:
   P(X | λ) = Σ_σ P(X, σ | λ) = Σ_σ P(X | σ, λ) · P(σ | λ)
where P(X | σ, λ) depends only on the observation probabilities (matrix B) and P(σ | λ) depends only on the state transition probabilities (matrix A).
A path σ is defined by a sequence of states q_1 q_2 ... q_t ... q_T:
   P(X | σ, λ) = P(x_1 x_2 ... x_T | q_1 q_2 ... q_T, λ)
             = P(x_1 | q_1 ... q_T, λ) · P(x_2 | x_1, q_1 ... q_T, λ) ... P(x_T | x_T-1 ... x_1, q_1 ... q_T, λ)
             = P(x_1 | q_1, λ) · P(x_2 | q_2, λ) ... P(x_T | q_T, λ), since x_t depends only on q_t.

Slide 29: Problem 1: Recognition (2). How to efficiently compute P(X | λ)?
   P(X | λ) = Σ_σ P(X, σ | λ) = Σ_σ P(X | σ, λ) · P(σ | λ)
The path σ is defined by a sequence of states q_1 q_2 ... q_t ... q_T:
   P(σ | λ) = P(q_1 q_2 ... q_T | λ)
           = P(q_1 | λ) · P(q_2 | q_1, λ) ... P(q_T | q_T-1 ... q_1, λ)
           = P(q_1 | λ) · P(q_2 | q_1, λ) ... P(q_T | q_T-1, λ), as we assume a first-order HMM.
Finally, substituting both factors:
   P(X | λ) = Σ_σ π_q1 · b_q1(x_1) · a_q1q2 · b_q2(x_2) ... a_q(T-1)qT · b_qT(x_T)
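The double sum above can be coded directly by enumerating the N^T paths. This brute-force sketch (with assumed toy parameters, not the lecture's) is only meant to make the next slide's complexity argument tangible; it reuses the joint-probability product π_q1 · b_q1(x_1) · a_q1q2 · b_q2(x_2) ...

```python
import itertools
import numpy as np

# Toy model -- illustrative values only.
pi = np.array([0.5, 0.5])
A  = np.array([[0.8, 0.2],
               [0.3, 0.7]])
B  = np.array([[0.9, 0.1],
               [0.2, 0.8]])
X  = [0, 1, 1, 0]                      # a short observation sequence

def brute_force_likelihood(X, pi, A, B):
    """P(X | lambda) = sum over all N**T state paths of P(X, path | lambda)."""
    total = 0.0
    for path in itertools.product(range(len(pi)), repeat=len(X)):
        p = pi[path[0]] * B[path[0], X[0]]
        for t in range(1, len(X)):
            p *= A[path[t - 1], path[t]] * B[path[t], X[t]]
        total += p
    return total

print("P(X | lambda) =", brute_force_likelihood(X, pi, A, B))
```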

Slide 30: What about the computational complexity?
- Number of multiplications for one path σ: (T-1) + 1 + 1 + (T-2) = 2T - 1
- Number of paths: N^T
- Total number of multiplications: (2T - 1) · N^T
- Total number of additions: N^T - 1
How long does it take? Assume N = 23, T = 15 (a word-check application):
- Number of operations ≈ 2T · N^T ≈ 8 · 10^21
- Assume 1 Gops: number of seconds ≈ 8 · 10^21 / 10^9 ≈ 8 · 10^12
- Number of days ≈ 8 · 10^12 / (3600 × 24) ≈ 10^8
- Number of years ≈ 10^8 / 365, on the order of 10^5
FIND SOMETHING ELSE!!!
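The orders of magnitude quoted above are straightforward arithmetic and can be checked in a couple of lines (added for the reader, not part of the slides):

```python
N, T = 23, 15
ops = (2 * T - 1) * N**T + (N**T - 1)   # multiplications + additions
seconds = ops / 1e9                      # at 1 Gops
days = seconds / (3600 * 24)
years = days / 365
print(f"ops ~ {ops:.1e}, seconds ~ {seconds:.1e}, "
      f"days ~ {days:.1e}, years ~ {years:.1e}")
# roughly: ops ~ 8.0e21, seconds ~ 8.0e12, days ~ 9.3e7, years ~ 2.5e5
```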

Slide 31: Forward-Backward algorithm
Use a trellis structure to carry out the computations: at each node of the trellis, store the forward variable α_t(i),
   α_t(i) = P(x_1 x_2 ... x_t, q_t = q_i | λ),
which is the probability of the partial observation sequence up to time t and of being in state q_i at that same time.
Algorithm in 3 steps:
1. Initialization: α_1(i) = P(x_1, q_1 = q_i | λ) = P(x_1 | q_1 = q_i, λ) · P(q_1 = q_i | λ) = b_i(x_1) · π_i
2. Recursion: α_t+1(j) = [ Σ_{i=1..N} α_t(i) · a_ij ] · b_j(x_t+1), with 1 ≤ j ≤ N and 1 ≤ t ≤ T-1
3. Termination: P(X | λ) = Σ_{i=1..N} α_T(i)

Slide 32: Forward-Backward algorithm
On the trellis, the recursion α_t+1(j) = [ Σ_i α_t(i) · a_ij ] · b_j(x_t+1), with 1 ≤ j ≤ N and 1 ≤ t ≤ T-1, means that node q_j in column t+1 collects the contributions α_t(1) · a_1j, ..., α_t(i) · a_ij, ..., α_t(N) · a_Nj from all the nodes of column t.
- Total number of multiplications: N(N+1)(T-1) + N, about N²T
- Total number of additions: N(N-1)(T-1)
Assume N = 23, T = 15: the number of operations is on the order of 10^4, instead of about 10^21 for the direct sum over all paths.
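A compact implementation of the three steps (initialization, recursion, termination); this is a sketch assuming the discrete-symbol HMM notation defined earlier, and the toy parameters are placeholders rather than values from the lecture.

```python
import numpy as np

def forward_likelihood(X, pi, A, B):
    """P(X | lambda) via the forward variables alpha_t(i), in O(N^2 T)."""
    N, T = len(pi), len(X)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, X[0]]                       # 1. initialization
    for t in range(T - 1):                           # 2. recursion
        alpha[t + 1] = (alpha[t] @ A) * B[:, X[t + 1]]
    return alpha[-1].sum()                           # 3. termination

# Toy model -- illustrative values only.
pi = np.array([0.5, 0.5])
A  = np.array([[0.8, 0.2],
               [0.3, 0.7]])
B  = np.array([[0.9, 0.1],
               [0.2, 0.8]])
print("P(X | lambda) =", forward_likelihood([0, 1, 1, 0], pi, A, B))
```

On this short toy sequence the result matches the brute-force enumeration of slide 29 exactly, but at a cost of roughly N²T operations instead of 2T · N^T.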

Slide 33: Problem 2: Analysis
- How can we uncover the sequence of states corresponding to a given observation X?
- Choose the most likely path: find the path (q_1, q_2, ..., q_T) that maximizes the probability P(q_1, q_2, ..., q_T | X, λ).
- Solution by dynamic programming, the Viterbi algorithm: an inductive algorithm that keeps, at each time t, the best possible state sequence ending in state q_i.

Slide 34: VITERBI Algorithm
Define δ_t(i) = max over (q_1, q_2, ..., q_t-1) of P(q_1, q_2, ..., q_t = q_i, x_1, x_2, ..., x_t | λ).
δ_t(i) is the probability of the highest-probability path ending in state q_i at time t.
By induction: δ_t+1(k) = max_{1 ≤ i ≤ N} [ δ_t(i) · a_ik ] · b_k(x_t+1), with 1 ≤ k ≤ N.
Also memorize the argument of the maximum: ψ_t+1(k) = arg max_{1 ≤ i ≤ N} [ δ_t(i) · a_ik ].
On the trellis (states 1 to N vertically, times 1 to T horizontally, observations x_1 ... x_T), node k at time t+1 receives the candidate scores δ_t(i) · a_ik from every state i. The best final score is max_{1 ≤ i ≤ N} [ δ_T(i) ], and the optimal state sequence is obtained by tracing back through ψ.

Slide 35: VITERBI Algorithm
1. Initialization: for 1 ≤ i ≤ N
   δ_1(i) = π_i · b_i(x_1);              // or, in the negative-log domain: -ln(π_i) - ln(b_i(x_1))
   ψ_1(i) = 0;
2. Recursive computation: for 2 ≤ t ≤ T, for 1 ≤ j ≤ N
   δ_t(j) = max_{1 ≤ i ≤ N} [ δ_t-1(i) · a_ij ] · b_j(x_t);   // or δ_t(j) = min_i [ δ_t-1(i) - ln(a_ij) ] - ln(b_j(x_t))
   ψ_t(j) = arg max_{1 ≤ i ≤ N} [ δ_t-1(i) · a_ij ];          // or ψ_t(j) = arg min_i [ δ_t-1(i) - ln(a_ij) ]
3. Termination:
   P* = max_{1 ≤ i ≤ N} [ δ_T(i) ];       // or P* = min_i [ δ_T(i) ]
   q*_T = arg max_{1 ≤ i ≤ N} [ δ_T(i) ]; // or arg min_i [ δ_T(i) ]
4. Backtracking: for t = T-1 down to 1
   q*_t = ψ_t+1(q*_t+1);
Hence P* (or exp(-P*) in the negative-log domain) gives the required state-optimized probability, and σ* = (q*_1, q*_2, ..., q*_T) is the optimal state sequence.
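The pseudocode above translates almost line for line into Python. This sketch works in the log domain, as suggested by the comments on the slide; the toy parameters are placeholders rather than the lecture's values.

```python
import numpy as np

def viterbi(X, pi, A, B):
    """Most likely state sequence and its log-domain score, following slide 35."""
    N, T = len(pi), len(X)
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    delta = np.zeros((T, N))            # delta_t(i): best log-score ending in state i
    psi   = np.zeros((T, N), dtype=int) # psi_t(i): best predecessor of state i
    delta[0] = log_pi + log_B[:, X[0]]                 # 1. initialization
    for t in range(1, T):                              # 2. recursion
        scores = delta[t - 1][:, None] + log_A         # scores[i, j] = delta_{t-1}(i) + ln a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, X[t]]
    path = np.zeros(T, dtype=int)                      # 3. termination
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):                     # 4. backtracking
        path[t] = psi[t + 1][path[t + 1]]
    return path, delta[-1].max()

# Toy model -- illustrative values only.
pi = np.array([0.5, 0.5])
A  = np.array([[0.8, 0.2],
               [0.3, 0.7]])
B  = np.array([[0.9, 0.1],
               [0.2, 0.8]])
path, log_p = viterbi([0, 1, 1, 0], pi, A, B)
print("optimal state sequence:", path, " log P* =", log_p)
```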

Slide 36: Problem 3: Learning. How do we adjust the model parameters λ = (π, A, B)?
Baum-Welch algorithm:
1. Let the initial model be λ_0.
2. Compute a new model λ based on λ_0 and on the observation X.
3. If log P(X | λ) - log P(X | λ_0) < Delta, stop.
4. Else set λ_0 ← λ and go to step 2.
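The slide only gives the outer re-estimation loop. The sketch below fills it in with the standard single-sequence Baum-Welch updates (forward and backward variables, then the usual gamma/xi re-estimation of π, A and B); these formulas are the textbook ones, not reproduced from the lecture, and the stopping threshold Delta is arbitrary.

```python
import numpy as np

def forward(X, pi, A, B):
    T, N = len(X), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, X[0]]
    for t in range(T - 1):
        alpha[t + 1] = (alpha[t] @ A) * B[:, X[t + 1]]
    return alpha

def backward(X, A, B):
    T, N = len(X), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, X[t + 1]] * beta[t + 1])
    return beta

def baum_welch(X, pi, A, B, delta=1e-4, max_iter=100):
    """Re-estimate lambda = (pi, A, B) until log P(X | lambda) stops improving."""
    X = np.asarray(X)
    M = B.shape[1]
    prev_ll = -np.inf
    for _ in range(max_iter):
        alpha, beta = forward(X, pi, A, B), backward(X, A, B)
        likelihood = alpha[-1].sum()
        ll = np.log(likelihood)
        if ll - prev_ll < delta:                 # step 3: stop when the gain is below Delta
            break
        prev_ll = ll
        gamma = alpha * beta / likelihood        # gamma[t, i] = P(q_t = q_i | X, lambda)
        # xi[t, i, j] = P(q_t = q_i, q_t+1 = q_j | X, lambda)
        xi = (alpha[:-1, :, None] * A[None, :, :] *
              (B[:, X[1:]].T * beta[1:])[:, None, :]) / likelihood
        pi = gamma[0]                            # re-estimated initial distribution
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        B = np.array([[gamma[X == k, i].sum() for k in range(M)]
                      for i in range(len(pi))]) / gamma.sum(axis=0)[:, None]
    return pi, A, B
```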

Slide 37: Joint probability: P(a, b) = P(a | b) · P(b) = P(b | a) · P(a)

Slide 38: PART TWO: HANDWRITING RECOGNITION, ON-LINE and OFF-LINE

Slide 39: OFF-LINE VERSUS ON-LINE: pen-down, pen-up, intersection, retraced stroke.

Slide 40: RECOGNITION
Graph modelling of the 2D word image, then off-line ordering (OrdRec) to turn it into a 1D sequence for the HMM; the observation units can be letters, graphemes or pixel columns.

Slide 41: A MODEL SUITED TO GLOBAL OPTIMIZATION
1. Definition of a node. 2. Original graph.
The ordering problem is cast as a travelling salesman problem: search for a Hamiltonian cycle over the stroke graph, whose model distinguishes inter links and intra links.

Slide 42: A MODEL SUITED TO GLOBAL OPTIMIZATION
1. Definition of a node. 2. Original graph. 3. Complete graph. 4. Final graph. 5. Weighting of the graph. 6. Hamiltonian path.
Link types: inter link, intra link, completion link, start and end links.

Slide 43: Overview of the RECOGNITION system
Feature Extraction → Feature Vectors → Symbol Mapping → Observations (a 1D sequence of symbols) → HMM (state transition probabilities, observation symbol probabilities, initial state probabilities) → Result, e.g. a likelihood for the word "rugby".

Slide 44: RECOGNITION SYSTEM
Processing chain: word image or on-line file → graph → off-line or on-line ordering → normalization (reference lines) → segmentation → set of segments → sequence of oriented segments and pen-up/pen-down moves → feature extraction → vector quantization → sequence of symbols → HMM → likelihoods for each word of the dictionary (e.g. "un", "deux", "frs").

Slide 45: EXAMPLES OF CLUSTERS
Core line shown for reference; number of clusters: 300.
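The symbol-mapping stage replaces each feature vector by the index of its nearest cluster centre (300 clusters on the slide). A minimal nearest-centroid sketch, assuming the codebook has already been trained, could look like this:

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest centroid.

    features : (T, d) array of feature vectors (one per segment / pen move)
    codebook : (K, d) array of cluster centres, e.g. K = 300
    Returns a length-T sequence of discrete symbols to feed to the HMM.
    """
    # Squared Euclidean distances between every feature vector and every centre.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

# Toy usage with random data -- dimensions are illustrative only.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(300, 8))        # 300 clusters of 8-d feature vectors
features = rng.normal(size=(40, 8))         # one word = 40 feature vectors
symbols = quantize(features, codebook)
print(symbols[:10])
```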

Slide 46: IRONOFF RESULTS
A dual on-line and off-line database: isolated characters and cursive words, about 700 different writers (64% men, 34% women), average writer age 26 and a half.
Dictionary: 197 words. Training: ... words. Test: ... words.

Slide 47: IRONOFF Construction
Step 1: online acquisition (online data). Step 2: offline acquisition (gray-level image). Step 3: matching process.

Slide 48: IRONOFF Samples

Slide 49: COMPARISON OF OffLineSeg AND OnLineSeg (« ORDREC » approach, OffLineSeg)

Slide 50: COMPARISON OF OnLineSeg AND OnLinePt

Slide 51: EXAMPLES: Top(1) correct

Slide 52: EXAMPLES: Top(1) correct

Slide 53: EXAMPLES
Errors due to a pre-processing step (word normalization), annotation errors (word unknown in a given letter case), and errors due to the limitations of the model.

Slide 54: EXAMPLES
Errors due to a pre-processing step (word normalization), annotation errors (word unknown in a given letter case), and errors due to the limitations of the model.

Slide 55: END

Slide 56: CONCATENATION OF LETTER MODELS TO DEFINE WORD MODELS (the "LEGO" approach)
Example for the word "cet": an initial state, a final state, and non-emitting states linking the letter models.

Slide 57: WORD MODEL, « LEGO » APPROACH
Example for the word "dégât": the letter models d, é, g, â, t are chained from a start state q_D to a final state q_F, together with diacritic sub-models (dia) and inter-letter linking sub-models (div_0, div_+, div_F, divDia_0, divDia_F).

Slide 58: « ORDREC » APPROACH
Pen-downs and stroke order; pen-downs and pen-ups; pen-ups only. Graph of the segments, graph of the strokes.

59 Copyright © IRCCyN/CVG 59 Dimensions of the segment enclosing rectangle : L x (s) et L y (s), Relative Ordinate to the baseline of the center of the enclosing rectangle : y(s), Segment orientation : D x (s) et D y (s), Left and right extensions of the segment : O g (s) et O d (s). Segment features Pen-up/pen-down features Ordinate of the vector center of the pen-up/pen-down : y(lp), Direction of the vector of the pen-up/pen-down : D x (lp) et D y (lp).

