1 Hidden Markov Chains (Cadeias de Markov Escondidas) February 2007 Magnos Martinello Universidade Federal do Espírito Santo - UFES Departamento de Informática - DI Laboratório de Pesquisas em Redes Multimídia - LPRM

2 History
- Andrey (Andrei) Andreyevich Markov (Russian: Андрей Андреевич Марков) (June 14, 1856 N.S. – July 20, 1922) was a Russian mathematician. He is best known for his work on the theory of stochastic processes. His research later became known as Markov chains.
- His son, another Andrey Andreevich Markov (1903-1979), was also a notable mathematician.

3 Markov chain
- In mathematics, a Markov chain, named after Andrey Markov, is a discrete-time or continuous-time stochastic process with the Markov property.
- A Markov chain is a series of states of a system that has the Markov property.
- In a series with the Markov property, the conditional probability distribution of a future state can be deduced using only the current state.

4 Formal definition
- A Markov chain is a sequence of random variables X1, X2, X3, ... with the Markov property, namely that, given the present state, the future and past states are independent. Formally,

      P(X_{n+1} = x | X_1 = x_1, X_2 = x_2, ..., X_n = x_n) = P(X_{n+1} = x | X_n = x_n)

- The possible values of Xi form a countable set S called the state space of the chain. Markov chains are often described by a directed graph, where the edges are labeled by the probabilities of going from one state to the other states.
- A finite state machine is an example of a Markov chain.
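The Markov property above can be made concrete with a short simulation: the next state is drawn using only the current state and the transition probabilities. This is a minimal sketch; the function name `simulate_markov_chain` and the two-state example chain are illustrative assumptions, not part of the lecture.

```python
import random

def simulate_markov_chain(P, states, start, n_steps, rng=random):
    """Simulate n_steps transitions of a Markov chain.

    P is a dict of dicts: P[i][j] = probability of moving from state i to j.
    The next state depends only on the current one (the Markov property).
    """
    path = [start]
    for _ in range(n_steps):
        current = path[-1]
        r = rng.random()
        cumulative = 0.0
        for nxt in states:
            cumulative += P[current][nxt]
            if r < cumulative:
                path.append(nxt)
                break
    return path

# Hypothetical two-state chain, used only for illustration.
states = ["A", "B"]
P = {"A": {"A": 0.9, "B": 0.1},
     "B": {"A": 0.5, "B": 0.5}}
path = simulate_markov_chain(P, states, "A", 10)
```

Viewed this way, the directed graph mentioned above is exactly the set of transitions with non-zero probability.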

5 Properties
- Reducibility: a Markov chain is said to be irreducible if its state space is a single communicating class; this means that, in an irreducible Markov chain, it is possible to get to any state from any state.
- Periodicity: a state i has period k if any return to state i must occur in some multiple of k time steps and k is the largest number with this property. If k = 1, then the state is said to be aperiodic.
- Recurrence: a state i is said to be transient if, given that we start in state i, there is a non-zero probability that we will never return to i.
  - If a state i is not transient (it has finite hitting time with probability 1), then it is said to be recurrent or persistent.
  - A state i is called absorbing if it is impossible to leave this state.
- Ergodicity: a state i is said to be ergodic if it is aperiodic and positive recurrent.
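Irreducibility has a direct graph-theoretic test: every state must be reachable from every other state along transitions of positive probability. A sketch, with a hypothetical helper `is_irreducible` and two illustrative chains (one with an absorbing state):

```python
from collections import deque

def is_irreducible(P):
    """Check irreducibility of a chain given as a dict of dicts.

    An edge i -> j exists when P[i][j] > 0; the chain is irreducible
    iff this directed graph is strongly connected.
    """
    states = set(P)

    def reachable(start):
        seen = {start}
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j, p in P[i].items():
                if p > 0 and j not in seen:
                    seen.add(j)
                    queue.append(j)
        return seen

    return all(reachable(s) == states for s in states)

# Illustrative examples: the second chain has an absorbing state "A".
connected = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.5, "B": 0.5}}
absorbing = {"A": {"A": 1.0, "B": 0.0}, "B": {"A": 0.5, "B": 0.5}}
```

In the second chain, once the process enters state "A" it can never leave, so "B" is transient and the chain is not irreducible.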

6 Scientific applications
- Markovian systems appear extensively in physics, particularly statistical mechanics.
- Markov chains can also be used to model various processes in queueing theory and statistics. Claude Shannon's famous 1948 paper A Mathematical Theory of Communication, which at a single step created the field of information theory, opens by introducing the concept of entropy (effective data compression through entropy coding techniques) through Markov modeling. They also allow effective state estimation and pattern recognition.

7 Scientific applications
- The PageRank of a webpage as used by Google is defined by a Markov chain: it is the probability of being at page i in the stationary distribution of a Markov chain on all (known) webpages.
- Markov models have also been used to analyze the web navigation behavior of users. A user's link transitions on a particular website can be modeled using first- or second-order Markov models.
- Markov chain methods have also become very important for generating sequences of random numbers that accurately reflect very complicated desired probability distributions, a process called Markov chain Monte Carlo, or MCMC for short. In recent years this has revolutionised the practicability of Bayesian inference methods.
- Markov parody generator (Emacs, M-x dissociated-press).
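The PageRank idea can be sketched as power iteration on the "random surfer" chain: with probability d follow a link, otherwise jump to a uniformly random page, and iterate until the distribution settles. The function name, damping value, and the three-page web below are illustrative assumptions, not Google's actual implementation.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration sketch of PageRank.

    links maps each page to the list of pages it links to; the returned
    ranks approximate the stationary distribution of the surfer chain.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p, outgoing in links.items():
            if outgoing:
                share = damping * rank[p] / len(outgoing)
                for q in outgoing:
                    new_rank[q] += share
            else:  # dangling page: spread its rank uniformly
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank

# Hypothetical three-page web for illustration.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
```

Here page "c" collects links from both "a" and "b", so it ends up with the highest rank.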

8 Weather prediction model
- The probabilities of weather conditions, given the weather on the preceding day, can be represented by a transition matrix:

      P = | 0.9  0.1 |
          | 0.5  0.5 |

- Pij is the probability that, if a given day is of type i, it will be followed by a day of type j. Here state 1 is "sunny" and state 2 is "rainy".
- Note that the rows of P sum to 1: this is because P is a stochastic matrix.
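The row-sum condition mentioned above is easy to verify mechanically. A minimal sketch, with a hypothetical helper `is_stochastic`:

```python
# Transition matrix from the weather example: row/column 0 = sunny, 1 = rainy.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def is_stochastic(matrix, tol=1e-9):
    """A (row-)stochastic matrix has non-negative entries and rows summing to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )
```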

9 Predicting the weather
- The weather on day 0 is known to be sunny. This is represented by a vector in which the "sunny" entry is 100%, and the "rainy" entry is 0%:

      x(0) = [ 1  0 ]

- The weather on day 1 can be predicted by:

      x(1) = x(0) P = [ 0.9  0.1 ]

- Thus, there is a 90% chance that day 1 will also be sunny.
- The weather on day 2 can be predicted in the same way:

      x(2) = x(1) P = x(0) P^2 = [ 0.86  0.14 ]

- General rules for day n are:

      x(n) = x(n-1) P = x(0) P^n
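The day-by-day update x(n+1) = x(n) P is just a vector-matrix product, and can be sketched in a few lines (the helper name `step` is an illustrative choice):

```python
def step(x, P):
    """One day of prediction: returns x P for a row vector x."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.9, 0.1],   # sunny -> sunny, sunny -> rainy
     [0.5, 0.5]]   # rainy -> sunny, rainy -> rainy
x0 = [1.0, 0.0]    # day 0 is known to be sunny
x1 = step(x0, P)   # [0.9, 0.1]
x2 = step(x1, P)   # [0.86, 0.14]
```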

10 Steady state
- In this example, predictions for the weather on more distant days are increasingly inaccurate and tend towards a steady state vector.
- The steady state vector is defined as:

      q = lim_{n→∞} x(n)

- Since q is independent of the initial conditions, it must be unchanged when transformed by P: q P = q.

11 Steady state
- Writing q = [ q1  q2 ], the equation q P = q gives 0.9 q1 + 0.5 q2 = q1, so

      − 0.1 q1 + 0.5 q2 = 0

12 Conclusion
- Since q is a probability vector, we know that q1 + q2 = 1.
- Solving this pair of simultaneous equations gives the steady state distribution:

      q = [ q1  q2 ] = [ 5/6  1/6 ] ≈ [ 0.833  0.167 ]

- In conclusion, in the long term, about 83% of days are sunny.
- For the most prolific example of the use of Markov chains, see Google. A description of the PageRank algorithm, which is basically a Markov chain over the graph of the Internet, can be found in the seminal paper "The PageRank Citation Ranking: Bringing Order to the Web" by Larry Page, Sergey Brin, R. Motwani, and T. Winograd.
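The same steady state can be reached numerically: repeatedly applying P drives any starting distribution toward q, which is exactly the convergence the previous slides describe. A sketch (the function name `steady_state` is an illustrative choice):

```python
def steady_state(P, iterations=200):
    """Approximate the steady state vector by repeatedly applying P.

    For an irreducible, aperiodic chain, x(n) = x(0) P^n converges to the
    stationary distribution q regardless of the starting vector x(0).
    """
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(iterations):
        x = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
    return x

P = [[0.9, 0.1],
     [0.5, 0.5]]
q = steady_state(P)  # approaches [5/6, 1/6]
```

This matches the algebraic solution: about 83.3% sunny days in the long run.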

13 Definition
- A hidden Markov model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov process with unknown parameters, and the challenge is to determine the hidden parameters from the observable parameters. The extracted model parameters can then be used to perform further analysis, for example for pattern recognition applications. An HMM can be considered the simplest dynamic Bayesian network.

14 Hidden Markov chain
State transitions in a hidden Markov model (example):
- x — hidden states
- y — observable outputs
- a — transition probabilities
- b — output probabilities

15 Intuition/application
- In a regular Markov model, the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but variables influenced by the state are visible.
- Hidden Markov models are especially known for their application in temporal pattern recognition such as speech, handwriting and gesture recognition, musical score following, and bioinformatics.

16 HMMs and their usage
- HMMs are very common in computational linguistics:
  - Speech recognition (observed: acoustic signal, hidden: words)
  - Handwriting recognition (observed: image, hidden: words)
  - Machine translation (observed: foreign words, hidden: words in target language)

17 Architecture of a hidden Markov model
- The diagram below shows the general architecture of an HMM. Each oval represents a random variable that can adopt a number of values. The random variable x(t) is the value of the hidden variable at time t; the random variable y(t) is the value of the observed variable at time t. The arrows in the diagram denote conditional dependencies.
- From the diagram, it is clear that the value of the hidden variable x(t) (at time t) only depends on the value of the hidden variable x(t − 1) (at time t − 1). This is called the Markov property. Similarly, the value of the observed variable y(t) only depends on the value of the hidden variable x(t) (both at time t).
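The two conditional-independence statements above fix the joint distribution of the whole model; written out as a factorization over a sequence of length T + 1:

```latex
P(x_0, \dots, x_T,\; y_0, \dots, y_T)
  = P(x_0)\, P(y_0 \mid x_0) \prod_{t=1}^{T} P(x_t \mid x_{t-1})\, P(y_t \mid x_t)
```

Every term is one arrow of the diagram: P(x_t | x_{t-1}) is a transition probability and P(y_t | x_t) an output probability.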

18 A capricious modern lady
- Assume you have a friend who lives far away and to whom you talk daily over the telephone.
- Your friend is only interested in three activities: walking in the park, shopping, and cleaning her apartment.
- The choice of what to do is determined exclusively by the weather on a given day.
- Based on what she tells you she did each day, you try to guess what the weather must have been like.

19 A capricious modern lady
- You believe that the weather operates as a discrete Markov chain. There are two states, "Rainy" and "Sunny", but you cannot observe them directly; that is, they are hidden from you.
- On each day, there is a certain chance that your friend will perform one of the following activities, depending on the weather: "walk", "shop", or "clean". Since your friend tells you about her activities, those are the observations.
- The entire system is that of a hidden Markov model (HMM).
- You know the general weather trends in the area, and what your friend likes to do on average. In other words, the parameters of the HMM are known.
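"The parameters of the HMM are known" can be written down concretely. The transcript does not give the numbers, so the values below are illustrative assumptions only; the structure (start, transition, and emission probabilities over the two hidden states and three observations) is what matters.

```python
# Sketch of the known parameters for the weather/activities HMM.
# All numeric values are assumed for illustration, not taken from the lecture.
states = ("Rainy", "Sunny")
observations = ("walk", "shop", "clean")

start_probability = {"Rainy": 0.6, "Sunny": 0.4}

transition_probability = {      # weather today -> weather tomorrow
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}

emission_probability = {        # weather -> what your friend does
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}
```

Each row is itself a probability distribution, mirroring the stochastic-matrix condition from the plain Markov chain slides.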

20 Probability of an observed sequence
- The probability of observing a sequence Y = y(0), y(1), ..., y(L − 1) of length L is given by:

      P(Y) = Σ_X P(Y | X) P(X)

- where the sum runs over all possible hidden node sequences X = x(0), x(1), ..., x(L − 1). A brute-force calculation of P(Y) is intractable for realistic problems, as the number of possible hidden node sequences is typically extremely high. The calculation can, however, be sped up enormously using an algorithm called the forward-backward procedure.
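The forward pass of that procedure can be sketched directly: instead of enumerating all |S|^L hidden sequences, it keeps one number per state per time step. The parameter values below are illustrative assumptions (the transcript gives none), and `forward_probability` is a hypothetical helper name.

```python
def forward_probability(obs, states, start_p, trans_p, emit_p):
    """Forward algorithm: computes P(Y) in O(L * |S|^2) time.

    alpha[s] holds P(y(0..t), x(t) = s); summing the final alphas
    over all states gives the total probability of the observations.
    """
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for y in obs[1:]:
        alpha = {
            s: emit_p[s][y] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())

# Assumed parameters for the two-state weather example.
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
p = forward_probability(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
```

For a sequence of length 3 over 2 states this saves little, but for realistic L the difference between 2^L terms and L recurrence steps is what makes the computation tractable.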

21 Using hidden Markov models
- There are three canonical problems associated with HMMs:
  1. Given the parameters of the model, compute the probability of a particular output sequence. This problem is solved by the forward-backward algorithm.
  2. Given the parameters of the model, find the most likely sequence of hidden states that could have generated a given output sequence. This problem is solved by the Viterbi algorithm.
  3. Given an output sequence or a set of such sequences, find the most likely set of state transition and output probabilities; in other words, train the parameters of the HMM given a dataset of sequences. This problem is solved by the Baum-Welch algorithm.
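The second problem, decoding, can be sketched as dynamic programming over partial paths: for each state, keep only the single best path ending there. As before, the parameter values are illustrative assumptions and `viterbi` is a hypothetical helper name.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Viterbi algorithm: most likely hidden state sequence for obs.

    best[s] is the probability of the best path ending in state s after
    the observations seen so far; paths[s] is that path itself.
    """
    best = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    paths = {s: [s] for s in states}
    for y in obs[1:]:
        new_best, new_paths = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[r] * trans_p[r][s])
            new_best[s] = best[prev] * trans_p[prev][s] * emit_p[s][y]
            new_paths[s] = paths[prev] + [s]
        best, paths = new_best, new_paths
    last = max(states, key=lambda s: best[s])
    return paths[last], best[last]

# Assumed parameters for the weather/activities example.
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
path, prob = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
```

Note the contrast with the forward algorithm: the sum over predecessor states is replaced by a max, because here we want the single most probable explanation rather than the total probability.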

