Published by Guadalupe Harbinson. Modified about 1 year ago.

1
Greedy Algorithms CS 6030 by Savitha Parur Venkitachalam

2
Outline
- Greedy approach to motif searching
- Genome rearrangements
- Sorting by reversals
- Greedy algorithms for sorting by reversals
- Approximation algorithms
- Breakpoint reversal sort

3
Greedy motif searching
- Developed by Gerald Hertz and Gary Stormo in 1989
- CONSENSUS is a tool based on this greedy algorithm
- Faster than the brute-force and simple motif search algorithms
- An approximation algorithm with an unknown approximation ratio

4
Greedy motif search – Pseudocode

5
Greedy motif search – Steps
- Input: DNA sequences, t (number of sequences), n (length of each sequence), l (length of the motif to search for)
- Output: the set of starting positions of the chosen l-mers, one per sequence
- Performs an exhaustive search, using Hamming distance, on the first two sequences
- Forms a 2 x l seed matrix from the two closest l-mers
- Scans each of the remaining t-2 sequences for the l-mer that best matches the seed, and adds it as the next row of the seed matrix
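The steps above can be sketched in Python. This is a minimal sketch, not the CONSENSUS implementation: all function names are hypothetical, positions are 0-based, sequences are assumed to be uppercase A/C/G/T strings, and "best matches the seed" is assumed to mean the l-mer that maximizes the consensus score of the growing matrix.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def lmers(seq, l):
    """All (start, l-mer) pairs of a sequence, with 0-based starts."""
    return [(i, seq[i:i + l]) for i in range(len(seq) - l + 1)]

def consensus_score(rows):
    """Sum over columns of the count of the most frequent nucleotide."""
    return sum(max(col.count(c) for c in "ACGT") for col in zip(*rows))

def greedy_motif_search(dna, l):
    # Step 1: exhaustive search over the first two sequences for the
    # closest pair of l-mers (assumed criterion: minimum Hamming distance).
    (i0, a), (i1, b) = min(
        product(lmers(dna[0], l), lmers(dna[1], l)),
        key=lambda pair: hamming(pair[0][1], pair[1][1]),
    )
    starts, seed = [i0, i1], [a, b]
    # Step 2: scan each remaining sequence once and keep the l-mer that
    # gives the best consensus score together with the rows chosen so far.
    for seq in dna[2:]:
        i, best = max(lmers(seq, l),
                      key=lambda p: consensus_score(seed + [p[1]]))
        starts.append(i)
        seed.append(best)
    return starts

print(greedy_motif_search(["TTACGTAA", "GGACGTCC", "CACGTTTT"], 4))
```

Because each later sequence is scanned only once against a fixed seed, a poor choice in the first two sequences is never revisited, which is why the result is not guaranteed to be optimal.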

6
Complexity
- Exhaustive search on the first two sequences requires l(n-l+1)^2 operations, which is O(ln^2)
- The sequential scan of the remaining t-2 sequences requires l(n-l+1)(t-2) operations, which is O(lnt)
- Thus the running time of greedy motif search is O(ln^2 + lnt)
- If t is small compared to n, the algorithm behaves as O(ln^2)
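To see why the first term dominates when t is small, here is a worked instance of the two operation counts. The values l = 15, n = 600 and t = 20 are illustrative assumptions, not figures from the slides.

```latex
\begin{align*}
l\,(n-l+1)^2    &= 15 \cdot 586^2 = 15 \cdot 343{,}396 = 5{,}150{,}940 \\
l\,(n-l+1)(t-2) &= 15 \cdot 586 \cdot 18 = 158{,}220
\end{align*}
```

With these assumed values the seed search costs roughly 30 times more than the scan of the remaining sequences.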

7
Consensus tool
- The greedy motif algorithm may miss the optimal motif
- The Consensus tool saves a large number of seed matrices
- The Consensus tool can check the sequences in random order
- As a result, the Consensus tool is less likely to miss the optimal motif

8
Genome rearrangements
- Gene rearrangements result in a change of gene ordering
- A series of gene rearrangements can alter the genomic architecture of a species
- 99% similarity between cabbage and turnip genes
- Fewer than 250 genomic rearrangements since the divergence of humans and mice

9

10

11
History of Chromosome X (Rat Consortium, Nature, 2004)

12
Types of Rearrangements
- Reversal
- Translocation
- Fusion
- Fission

13
Greedy algorithms in Gene Rearrangements
- Biologists are interested in finding the smallest number of reversals in an evolutionary sequence
- This number gives a lower bound on the number of rearrangements and a measure of the similarity between two species
- Two greedy algorithms are used:
  - Simple reversal sort
  - Breakpoint reversal sort

14
Gene Order
- Gene order is represented by a permutation π:
  π = π_1 ... π_{i-1} π_i π_{i+1} ... π_{j-1} π_j π_{j+1} ... π_n
- Reversal ρ(i, j) reverses (flips) the elements from i to j in π:
  π · ρ(i, j) = π_1 ... π_{i-1} π_j π_{j-1} ... π_{i+1} π_i π_{j+1} ... π_n

15
Reversal example
- Start from a permutation π
- Apply ρ(3,5) to obtain the next permutation
- Then apply ρ(5,6) to that result
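A reversal ρ(i, j) is easy to express as list slicing. The sketch below uses 1-based, inclusive positions to match the slide notation; the function name and the starting permutation 1 2 3 4 5 6 7 8 are illustrative assumptions, chosen only so that ρ(3,5) and ρ(5,6) can be applied in turn.

```python
def reversal(perm, i, j):
    """Apply rho(i, j): flip the elements at positions i..j (1-based, inclusive)."""
    return perm[:i - 1] + perm[i - 1:j][::-1] + perm[j:]

pi = [1, 2, 3, 4, 5, 6, 7, 8]       # assumed starting permutation
step1 = reversal(pi, 3, 5)          # flips positions 3..5
step2 = reversal(step1, 5, 6)       # flips positions 5..6 of the result
print(step1)
print(step2)
```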

16
Reversal distance problem
- Goal: Given two permutations, find the shortest series of reversals that transforms one into the other
- Input: Permutations π and σ
- Output: A series of reversals ρ_1, ..., ρ_t transforming π into σ, such that t is minimum
- t: reversal distance between π and σ
- d(π, σ): smallest possible value of t, given π and σ

17
Sorting by reversal
- Goal: Given a permutation, find a shortest series of reversals that transforms it into the identity permutation
- Input: Permutation π
- Output: A series of reversals ρ_1, ..., ρ_t transforming π into the identity permutation, such that t is minimum

18
Sorting by reversal - Greedy algorithm
- When sorting a permutation π whose first three elements are already in order, it makes no sense to break them up
- The length of the already-sorted prefix of π is denoted prefix(π); in this case prefix(π) = 3
- This suggests a greedy algorithm: increase prefix(π) at every step

19
Simple Reversal sort – Pseudocode
- A very general approach leads to an algorithm that sorts by moving the ith element to the ith position
SimpleReversalSort(π)
  for i ← 1 to n – 1
    j ← position of element i in π (i.e., π_j = i)
    if j ≠ i
      π ← π · ρ(i, j)
      output π
    if π is the identity permutation
      return
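The pseudocode translates almost line for line into Python. This is a minimal sketch: the function name is hypothetical, the input is assumed to be a permutation of 1..n, and instead of printing each intermediate permutation it collects them in a list.

```python
def simple_reversal_sort(perm):
    """Greedy sort by reversals: move element i to position i, for i = 1..n-1.

    Returns the list of permutations produced after each reversal.
    """
    pi = list(perm)
    n = len(pi)
    identity = list(range(1, n + 1))
    steps = []
    for i in range(1, n):
        j = pi.index(i) + 1              # 1-based position of element i
        if j != i:
            # rho(i, j): flip positions i..j so that element i lands at position i
            pi[i - 1:j] = pi[i - 1:j][::-1]
            steps.append(list(pi))
        if pi == identity:
            break
    return steps

print(simple_reversal_sort([3, 1, 2, 4]))
```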

20
Example – SimpleReversalSort is not optimal
- On the example input, greedy SimpleReversalSort takes 5 steps, whereas an optimal solution takes only 2 steps
- The 'pancake flipping problem' is a closely related sorting problem in which only prefix reversals are allowed
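The gap between the greedy and the optimal number of steps can be checked on a small input. The sketch below compares the greedy step count with the exact reversal distance found by breadth-first search over all permutations, which is feasible only for small n. The function names and the input 6 1 2 3 4 5 are illustrative assumptions, picked because this input shows the same 5-versus-2 gap.

```python
from collections import deque

def greedy_step_count(perm):
    """Number of reversals SimpleReversalSort performs on perm."""
    pi, steps = list(perm), 0
    for i in range(1, len(pi)):
        j = pi.index(i) + 1
        if j != i:
            pi[i - 1:j] = pi[i - 1:j][::-1]
            steps += 1
    return steps

def reversal_distance(perm):
    """Exact reversal distance to the sorted order, by brute-force BFS."""
    start, goal = tuple(perm), tuple(sorted(perm))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            return dist[cur]
        n = len(cur)
        for i in range(n):
            for j in range(i + 1, n):
                nxt = cur[:i] + cur[i:j + 1][::-1] + cur[j + 1:]
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
    return None  # not reached for a valid permutation

example = [6, 1, 2, 3, 4, 5]           # assumed input
print(greedy_step_count(example), reversal_distance(example))
```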

21
Approximation Ratio
- These algorithms produce an approximate solution rather than an optimal one
- The approximation ratio of an algorithm A on input π is A(π) / OPT(π), where A(π) is the solution produced by A and OPT(π) is the optimal solution
- For an algorithm A that minimizes the objective function (minimization algorithm): max_{|π| = n} A(π) / OPT(π)
- For a maximization algorithm: min_{|π| = n} A(π) / OPT(π)

22
Breakpoints – A different face of greed
- In a permutation π = π_1 ... π_n:
  - if π_i and π_{i+1} are consecutive numbers, the pair is an adjacency
  - if π_i and π_{i+1} are not consecutive numbers, the pair is a breakpoint
- Example: π = 1 | 9 | 3 4 | 7 8 | 2 | 6 5
  - Pairs (1,9), (9,3), (4,7), (8,2) and (2,6) form breakpoints
  - Pairs (3,4), (7,8) and (6,5) form adjacencies
- b(π): number of breakpoints in permutation π
- Our goal is to eliminate all breakpoints, thus forming the identity permutation
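Counting breakpoints is a single pass over neighbouring pairs. In this minimal sketch the function name and the `extend` flag are hypothetical. With `extend=False` it counts only the pairs inside π, as in the example above. With `extend=True` it first adds the boundary elements 0 and n+1, the convention used by breakpoint reversal sort.

```python
def breakpoints(perm, extend=False):
    """Count neighbouring pairs whose values are not consecutive numbers."""
    seq = list(perm)
    if extend:
        seq = [0] + seq + [len(perm) + 1]   # add the boundary elements 0 and n+1
    return sum(1 for a, b in zip(seq, seq[1:]) if abs(a - b) != 1)

pi = [1, 9, 3, 4, 7, 8, 2, 6, 5]
print(breakpoints(pi))                # pairs inside pi only
print(breakpoints(pi, extend=True))   # also counts the pair (5, 10) at the right end
```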

23
Breakpoint Reversal Sort – Steps
- Put two elements π_0 = 0 and π_{n+1} = n+1 at the ends of π
- Eliminate breakpoints using reversals
- Each reversal eliminates at most 2 breakpoints
- This implies reversal distance ≥ #breakpoints / 2, i.e. d(π) ≥ b(π) / 2
- Example: successive reversals lower b(π) step by step until b(π) = 0
- Not efficient, as the basic greedy choice may run forever

24
Pseudocode – Breakpoint reversal sort
BreakPointReversalSort(π)
  while b(π) > 0
    Among all possible reversals, choose reversal ρ minimizing b(π · ρ)
    π ← π · ρ(i, j)
    output π
  return
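A direct Python rendering of this greedy loop is sketched below. The function names are hypothetical, the `max_steps` cap is an added assumption to guard against the non-termination the slides warn about, ties between equally good reversals are broken by taking the first one found, and the input 2 3 1 4 6 5 is an illustrative choice.

```python
def count_breakpoints(ext):
    """Breakpoints of a permutation already extended with 0 and n+1."""
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)

def breakpoint_reversal_sort(perm, max_steps=100):
    """Greedy: repeatedly apply the reversal that leaves the fewest breakpoints."""
    ext = [0] + list(perm) + [len(perm) + 1]
    history = []
    while count_breakpoints(ext) > 0 and len(history) < max_steps:
        best = None
        # Try every reversal rho(i, j) with i < j; 0 and n+1 never move.
        for i in range(1, len(ext) - 1):
            for j in range(i + 1, len(ext) - 1):
                cand = ext[:i] + ext[i:j + 1][::-1] + ext[j + 1:]
                if best is None or count_breakpoints(cand) < count_breakpoints(best):
                    best = cand
        ext = best
        history.append(ext[1:-1])
    return history

for step in breakpoint_reversal_sort([2, 3, 1, 4, 6, 5]):
    print(step)
```

On this input the extended permutation has 5 breakpoints, so at least 3 reversals are needed, and the greedy choice happens to reach the identity in exactly 3.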

25
Using strips
- A strip is an interval between two consecutive breakpoints in a permutation
- Decreasing strip: strip of elements in decreasing order
- Increasing strip: strip of elements in increasing order
- A single-element strip can be declared either increasing or decreasing. We will choose to declare them as decreasing, with the exception of the strips containing 0 and n+1
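The strip decomposition can be computed by cutting the extended permutation at every breakpoint. This is a minimal sketch with a hypothetical function name. It follows the convention stated above: single-element strips are labelled decreasing unless they hold 0 or n+1.

```python
def strips(perm):
    """Split the extended permutation into strips and label each one."""
    n = len(perm)
    ext = [0] + list(perm) + [n + 1]
    pieces, current = [], [ext[0]]
    for a, b in zip(ext, ext[1:]):
        if abs(a - b) == 1:
            current.append(b)        # adjacency: the strip continues
        else:
            pieces.append(current)   # breakpoint: close the current strip
            current = [b]
    pieces.append(current)

    labelled = []
    for s in pieces:
        if len(s) == 1:
            kind = "increasing" if s[0] in (0, n + 1) else "decreasing"
        else:
            kind = "increasing" if s[1] > s[0] else "decreasing"
        labelled.append((s, kind))
    return labelled

for strip, kind in strips([1, 9, 3, 4, 7, 8, 2, 6, 5]):
    print(strip, kind)
```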

26
Reducing breakpoints
- Choose the decreasing strip with the smallest element k in π
- Find k-1 in the permutation
- Reverse the segment between k and k-1
- Example: each such reversal lowers b(π)
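One such step can be sketched as follows. The function names are hypothetical, the input is assumed to be already extended with 0 and n+1, and the permutation 0 1 4 6 5 7 8 3 2 9 is an illustrative input rather than the example from the slide. The reversal is chosen so that k ends up next to k-1.

```python
def count_breakpoints(ext):
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)

def decreasing_elements(ext):
    """Elements in decreasing strips (interior single elements count as decreasing)."""
    found, start = [], 0
    for k in range(1, len(ext) + 1):
        if k == len(ext) or abs(ext[k] - ext[k - 1]) != 1:
            strip = ext[start:k]
            single_inside = len(strip) == 1 and 0 < start < len(ext) - 1
            if single_inside or (len(strip) > 1 and strip[0] > strip[1]):
                found.extend(strip)
            start = k
    return found

def reduce_step(ext):
    """Bring the smallest element k of any decreasing strip next to k-1.

    Returns the new extended permutation, or None if there is no decreasing strip.
    """
    dec = decreasing_elements(ext)
    if not dec:
        return None
    k = min(dec)
    p_k, p_prev = ext.index(k), ext.index(k - 1)
    # Reverse the segment lying between k-1 and k, including k or k-1 as needed.
    lo, hi = (p_prev + 1, p_k) if p_prev < p_k else (p_k + 1, p_prev)
    return ext[:lo] + ext[lo:hi + 1][::-1] + ext[hi + 1:]

ext = [0, 1, 4, 6, 5, 7, 8, 3, 2, 9]     # assumed input
while ext is not None and count_breakpoints(ext) > 0:
    print(ext, "b =", count_breakpoints(ext))
    ext = reduce_step(ext)
print(ext)
```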

27
ImprovedBreakpointReversalSort
- Sometimes the permutation may not contain any decreasing strips
- In that case an increasing strip has to be reversed so that it becomes a decreasing strip
- Taking this into consideration, we have an improved algorithm:
ImprovedBreakpointReversalSort(π)
  while b(π) > 0
    if π has a decreasing strip
      Among all possible reversals, choose reversal ρ that minimizes b(π · ρ)
    else
      Choose a reversal ρ that flips an increasing strip in π
    π ← π · ρ
    output π
  return

28
Example – ImprovedBreakpointReversalSort
- There are no decreasing strips in π for:
  π = 0 1 2 | 5 6 7 | 3 4 | 8, b(π) = 3
  π · ρ(6,7) = 0 1 2 | 5 6 7 | 4 3 | 8, b(π · ρ(6,7)) = 3
- ρ(6,7) does not change the number of breakpoints
- ρ(6,7) creates a decreasing strip, thus guaranteeing that the next step will decrease the number of breakpoints
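The improved algorithm can be sketched in Python and run on the permutation 1 2 5 6 7 3 4 from this example. The function names are hypothetical. When no decreasing strip exists, the pseudocode leaves open which increasing strip to flip. This sketch assumes the last interior one, which reproduces the ρ(6,7) step shown above.

```python
def count_breakpoints(ext):
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)

def strip_bounds(ext):
    """(start, end) index pairs of the strips of an extended permutation."""
    bounds, start = [], 0
    for k in range(1, len(ext)):
        if abs(ext[k] - ext[k - 1]) != 1:
            bounds.append((start, k - 1))
            start = k
    bounds.append((start, len(ext) - 1))
    return bounds

def is_decreasing(ext, s, e):
    if s == e:                        # single element: decreasing unless it is 0 or n+1
        return 0 < s < len(ext) - 1
    return ext[s] > ext[s + 1]

def improved_breakpoint_reversal_sort(perm):
    ext = [0] + list(perm) + [len(perm) + 1]
    history = []
    while count_breakpoints(ext) > 0:
        bounds = strip_bounds(ext)
        if any(is_decreasing(ext, s, e) for s, e in bounds):
            # Greedy choice among all reversals rho(i, j) with i < j.
            cands = [ext[:i] + ext[i:j + 1][::-1] + ext[j + 1:]
                     for i in range(1, len(ext) - 1)
                     for j in range(i + 1, len(ext) - 1)]
            ext = min(cands, key=count_breakpoints)
        else:
            # Assumed tie-break: flip the last increasing strip that
            # contains neither 0 nor n+1.
            s, e = [(s, e) for s, e in bounds if s > 0 and e < len(ext) - 1][-1]
            ext = ext[:s] + ext[s:e + 1][::-1] + ext[e + 1:]
        history.append(ext[1:-1])
    return history

for step in improved_breakpoint_reversal_sort([1, 2, 5, 6, 7, 3, 4]):
    print(step)
```

The first step leaves the breakpoint count at 3 but creates a decreasing strip, after which every greedy step removes at least one breakpoint.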

29
Approximation Ratio - ImprovedBreakpointReversalSort
- The approximation ratio is at most 4
- It eliminates at least one breakpoint in every two steps, so it takes at most 2b(π) steps
- Approximation ratio: 2b(π) / d(π)
- An optimal algorithm eliminates at most 2 breakpoints in every step: d(π) ≥ b(π) / 2
- Performance guarantee: 2b(π) / d(π) ≤ 2b(π) / (b(π) / 2) = 4

30
References
- Neil C. Jones and Pavel A. Pevzner, An Introduction to Bioinformatics Algorithms, Ch. 5

31
Questions
