
1
**Prokaryotic Gene Structure**

Prokaryotic genes have a simple one-dimensional structure.

[Figure: sequence ATG AAA ATG GCA . . . GCA TTG CTA TAG with 5’ direction arrows; labels: Start codon, Stop codon]

Note that the ATG codon encodes both start and methionine.

2
**Prokaryotic Gene Structure**

Prokaryotic gene prediction begins with ORF finding.

[Figure: sequence ATG AAA GCA . . . GCA TTG CTA TAG; labels: Possible start, Alternate start, Stop codon]

Because of the possibility of alternate start sites, it’s not unusual for several ORFs to share a common stop codon. An ORF finder needs to be able to find overlapping ORFs, whether they end with the same stop codon or overlap in a different frame.

3
**Prokaryotic Gene Structure**

Prokaryotic gene prediction begins with ORF finding.

'ATG(...)*?(TAA|TAG|TGA)'

A regular expression crafted to find ORFs must also exhibit “non-greedy” behaviour, so that each match ends at the first in-frame stop codon rather than the last. Note that many bacteria also employ rarer alternate start codons, most commonly GTG and TTG. But we’ll pretend this doesn’t happen!
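A minimal sketch of an ORF finder built on the pattern above. The function name `find_orfs` and the lookahead wrapper are assumptions, not part of the slides. The lookahead lets matches overlap, so ORFs that share a stop codon or sit in different frames are all reported. Only ATG starts on the given strand are considered.

```python
import re

# Assumption: the slide's pattern is wrapped in a lookahead so that matches may
# overlap; the inner groups are made non-capturing for convenience.
ORF_PATTERN = re.compile(r"(?=(ATG(?:...)*?(?:TAA|TAG|TGA)))")


def find_orfs(seq):
    """Return (start, end, orf) for every ATG-initiated ORF on the given strand.

    `seq` is assumed to be an upper-case DNA string without whitespace.
    Each ORF ends at the first in-frame stop codon after its start.
    """
    return [(m.start(1), m.end(1), m.group(1)) for m in ORF_PATTERN.finditer(seq)]


# Two ORFs that share the same stop codon are both reported.
for start, end, orf in find_orfs("CCATGAAAATGGCATAGCC"):
    print(start, end, orf)
```

The reverse strand could be handled by running the same function on the reverse complement of the sequence.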

4
**Higher Order Markov Chains**

We don’t always need to consider just the most recent state. An nth order Markov process is a stochastic process where the probabilities associated with an event depend on the previous n events in the state path:

$P(x_i \mid x_{i-1}, x_{i-2}, \ldots, x_1) = P(x_i \mid x_{i-1}, \ldots, x_{i-n})$

All the Markov models we have seen so far have been of order 1. In the case of a first order process, this statement reduces to our statement of the Markov property.
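As an illustration of the definition, the conditional probabilities of an nth order chain can be estimated by counting how often each symbol follows each length-n context. This is a sketch under assumptions: the name `fit_markov` is hypothetical, and plain maximum-likelihood counting without pseudocounts is one choice that the slides do not specify.

```python
from collections import Counter, defaultdict


def fit_markov(seq, n):
    """Estimate P(x_i | previous n symbols) from one training sequence.

    Returns a dict mapping each observed length-n context to a dict of
    next-symbol probabilities (maximum likelihood, no pseudocounts).
    """
    counts = defaultdict(Counter)
    for i in range(n, len(seq)):
        context = seq[i - n:i]          # the previous n symbols
        counts[context][seq[i]] += 1    # the symbol that follows them
    probs = {}
    for context, counter in counts.items():
        total = sum(counter.values())
        probs[context] = {sym: c / total for sym, c in counter.items()}
    return probs


model = fit_markov("ABABABAA", 2)
print(model)
```

With `n=1` the same function gives the ordinary first order transition estimates.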

5
**Higher Order Markov Chains**

Higher order models have an equivalent first order model. An nth order Markov chain over alphabet $\mathcal{A}$ is equivalent to a first order Markov chain over the alphabet $\mathcal{A}^n$ of n-tuples:

$P(x_i \mid x_{i-1}, \ldots, x_{i-n}) = P(x_i, x_{i-1}, \ldots, x_{i-n+1} \mid x_{i-1}, \ldots, x_{i-n})$

This follows trivially from $P(X, Y \mid Y) = P(X \mid Y)$. Practically, this says we can implement a higher order model just by expanding the alphabet size of a first order model.
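The identity used here can be checked from the definition of conditional probability. In the short derivation below, the shorthand is an assumption for illustration: $X = x_i$, $Y = (x_{i-1}, \ldots, x_{i-n})$, and $Y' = (x_{i-1}, \ldots, x_{i-n+1})$, which is a part of $Y$.

```latex
\begin{align*}
P(X, Y' \mid Y) &= \frac{P(X, Y', Y)}{P(Y)} \\
                &= \frac{P(X, Y)}{P(Y)} && \text{since } Y' \text{ is already contained in } Y \\
                &= P(X \mid Y)
\end{align*}
```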

6
**Higher Order Markov Chains**

Consider this first order Markov process:

$\mathcal{A} = \{A, B\}$

[Figure: state diagram with states S, A, B, and e]

Here the alphabet $\mathcal{A}$ (our set of states) consists of just A and B. How would we convert this to a second order Markov process?

7
**Higher Order Markov Chains**

Now reconfigured as a second order model:

$\mathcal{A} = \{AA, AB, BA, BB\}$

[Figure: state diagram with states AA, AB, BB, and BA]

Note how we have disallowed certain transitions (i.e. set their probability to zero). Start and End omitted for clarity.
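The set of allowed transitions in the expanded model can be enumerated mechanically. The sketch below assumes each tuple state is written oldest symbol first (so "AB" means A was followed by B). The function names are hypothetical. A transition from s to t is allowed only when t is s shifted by one symbol; all other transitions have probability zero.

```python
from itertools import product


def tuple_states(alphabet, n):
    """All n-tuples over the alphabet, written as strings."""
    return ["".join(t) for t in product(alphabet, repeat=n)]


def allowed_transitions(alphabet, n):
    """Map each tuple state to the tuple states it may move to.

    s -> t is allowed only if the last n-1 symbols of s equal the
    first n-1 symbols of t.
    """
    states = tuple_states(alphabet, n)
    return {s: [t for t in states if s[1:] == t[:-1]] for s in states}


for state, targets in allowed_transitions("AB", 2).items():
    print(state, "->", targets)
```

For the two-letter alphabet, each of the four states keeps two outgoing transitions and loses the other two.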

8
**Inhomogeneous Markov Chains**

A Markov model of genes should model codon statistics.

[Figure: codons ATG GTC AAA GCA, with the positions within each codon labelled 1 2 3]

In true coding genes, each of the three positions within a codon will be statistically distinct. This can be accomplished in a few different ways…

9
**Inhomogeneous Markov Chains**

A Markov model of genes should model codon statistics.

[Figure: codons ATG GTC AAA GCA, with the positions within each codon labelled 1 2 3]

$a^{1}_{x_1 x_2} \, a^{2}_{x_2 x_3} \, a^{3}_{x_3 x_4} \, a^{1}_{x_4 x_5} \, a^{2}_{x_5 x_6} \, a^{3}_{x_6 x_7}$

One idea is to intersperse three different Markov chains in alternating fashion. This can also be recast as an HMM with additional states in the obvious way.
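The interleaving of three chains can be sketched as a scoring function that cycles through three transition tables, one per codon position. Everything named here is an assumption for illustration: the function name, the `start_phase` parameter, and the representation of each chain as a dict keyed by (from, to) symbol pairs. As in the product on the slide, no initial probability for the first symbol is included.

```python
import math


def log_prob_inhomogeneous(seq, chains, start_phase=0):
    """Log probability of seq under three interleaved first order chains.

    chains[k][(a, b)] is the probability of moving from symbol a to symbol b
    when the transition is the k-th one in the repeating 1, 2, 3 cycle.
    """
    logp = 0.0
    for i in range(len(seq) - 1):
        chain = chains[(start_phase + i) % 3]   # cycle chain 1, 2, 3, 1, ...
        logp += math.log(chain[(seq[i], seq[i + 1])])
    return logp


# Placeholder tables: uniform transitions, not real codon statistics.
bases = "ACGT"
uniform = {(a, b): 0.25 for a in bases for b in bases}
print(log_prob_inhomogeneous("ATGGTCA", [uniform, uniform, uniform]))
```

In practice the three tables would be estimated separately from the three codon positions of known coding sequences.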

10
**Histograms with matplotlib**

We’ll use this to look at log-odds per NT distributions.

```python
import numpy as np
import matplotlib.pyplot as plt

# . . .

# probs here should be your list of probabilities
# 50 here corresponds to the number of desired bins
# density=True replaces normed=1, which newer matplotlib releases no longer accept
n, bins, patches = plt.hist(probs, 50, density=True, histtype='stepfilled')
plt.setp(patches, 'facecolor', 'g', 'alpha', 0.75)
plt.show()
```

More histogram examples may be found at: matplotlib.org/examples/pylab_examples/histogram_demo_extended.html
