CSCI-256 Data Structures & Algorithm Analysis Lecture Note: Some slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved. 21.

Dynamic Programming Over Intervals
Write down the final recurrence (DONE IN CLASS)
In what order should the sub-problems be solved?
  – Do shortest intervals first
  – Running time: $O(n^3)$
  – Example: ACCGGUAGU (DONE IN CLASS)

RNA(b_1, …, b_n) {
    Initialize Opt[i, j] = 0 whenever i ≥ j - 4
    for k = 5, 6, …, n-1
        for i = 1, 2, …, n-k
            j = i + k
            Compute Opt[i, j] using recurrence
    return Opt[1, n]
}
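The recurrence is worked out in class and is not reproduced in these notes. As a minimal sketch, the code below assumes the usual RNA secondary-structure formulation: A pairs with U, C pairs with G, pairs may not cross, and paired positions i < j must satisfy i < j - 4. The function name `rna_max_pairs` and the 0-based indexing are illustrative choices, not part of the slides.

```python
def rna_max_pairs(b):
    """Maximum number of base pairs under the assumptions stated above."""
    n = len(b)
    if n == 0:
        return 0
    complementary = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    # opt[i][j] stays 0 whenever i >= j - 4 (interval too short for any pair).
    opt = [[0] * n for _ in range(n)]
    for k in range(5, n):              # interval length, shortest first
        for i in range(0, n - k):
            j = i + k
            best = opt[i][j - 1]       # case 1: position j is unpaired
            for t in range(i, j - 4):  # case 2: j pairs with some t < j - 4
                if (b[t], b[j]) in complementary:
                    left = opt[i][t - 1] if t > i else 0
                    best = max(best, 1 + left + opt[t + 1][j - 1])
            opt[i][j] = best
    return opt[0][n - 1]


print(rna_max_pairs("ACCGGUAGU"))  # 2
```

The two outer loops mirror the pseudocode (k, then i), and the inner loop over t accounts for the third factor of n in the running time.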

String Similarity
Consider a dictionary interface or a spell checker. How similar are two strings?
  – ocurrance
  – occurrence

    o c u r r a n c e -
    o c c u r r e n c e
    6 mismatches, 1 gap

    o c - u r r a n c e
    o c c u r r e n c e
    1 mismatch, 1 gap

    o c - u r r - a n c e
    o c c u r r e - n c e
    0 mismatches, 3 gaps

String Similarity
Dictionary interfaces and spell checkers are not the most computationally intensive applications of this type of problem.
Determining similarities among strings is one of the central computational problems facing molecular biologists today.
  – Strings arise very naturally in biology. An organism's genome is divided into giant linear DNA molecules known as chromosomes; think of a chromosome as an enormous linear tape containing a string over the alphabet {A, C, G, T}.
  – A certain substring in the DNA of one organism may code for a certain kind of toxin. If we discover a very "similar" substring in the DNA of another organism, we might be able to hypothesize, without any experimentation, that it codes for a similar toxin.

Edit Distance
Applications
  – Basis for Unix diff
  – Speech recognition
  – Computational biology
Edit distance [Levenshtein 1966, Needleman-Wunsch 1970]
  – Gap penalty $\delta$; mismatch penalty $\alpha_{pq}$
  – Cost = sum of gap and mismatch penalties

    C T G A C C T A C C T
    C C T G A C T A C A T
    Cost: $\alpha_{TC} + \alpha_{GT} + \alpha_{AG} + 2\alpha_{CA}$

    - C T G A C C T A C C T
    C C T G A C - T A C A T
    Cost: $2\delta + \alpha_{CA}$
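To make the cost definition concrete, here is a small sketch that reads an alignment written as two equal-length rows (with "-" marking a gap) and collects the penalty terms column by column. The function names and the numeric penalties in the usage lines (gap penalty 2, every mismatch 1) are assumptions chosen for illustration only.

```python
from collections import Counter


def alignment_terms(top, bottom):
    """Return (number of gaps, Counter of mismatched symbol pairs)."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    gaps = 0
    mismatches = Counter()
    for p, q in zip(top, bottom):
        if p == "-" or q == "-":
            gaps += 1                 # contributes one gap penalty
        elif p != q:
            mismatches[(p, q)] += 1   # contributes one mismatch penalty for (p, q)
    return gaps, mismatches


def alignment_cost(top, bottom, delta, alpha):
    """Cost = delta per gap + alpha(p, q) per mismatched column."""
    gaps, mismatches = alignment_terms(top, bottom)
    return gaps * delta + sum(c * alpha(p, q) for (p, q), c in mismatches.items())


# Hypothetical penalties: gap = 2, any mismatch = 1.
print(alignment_terms("-CTGACCTACCT", "CCTGAC-TACAT"))
print(alignment_cost("CTGACCTACCT", "CCTGACTACAT", 2, lambda p, q: 1))
```

With these assumed penalties the two alignments of CTGACCTACCT and CCTGACTACAT happen to cost the same; a different choice of gap and mismatch penalties would favor one over the other.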

Sequence Alignment
Goal: Given two strings $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$, find an alignment of minimum cost.
Def: An alignment $M$ is a set of ordered pairs $x_i$-$y_j$ such that each item occurs in at most one pair and there are no crossings. Two pairs $x_i$-$y_j$ and $x_{i'}$-$y_{j'}$ cross if $i < i'$ but $j > j'$.
Ex: CTACCG vs. TACATG (DONE IN CLASS)
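The definition can be checked mechanically. The sketch below represents an alignment as a list of 1-based index pairs (i, j), tests the two conditions (each item in at most one pair, no crossings), and computes the cost as one gap penalty per unmatched symbol plus one mismatch penalty per pair. The function names, the unit penalties, and the particular matching used for CTACCG vs. TACATG are assumptions for illustration; the in-class solution is not recorded in these notes.

```python
def is_alignment(pairs, m, n):
    """True if pairs (1-based (i, j)) form a valid alignment of strings of length m and n."""
    pairs = sorted(pairs)
    if any(not (1 <= i <= m and 1 <= j <= n) for i, j in pairs):
        return False
    xs = [i for i, _ in pairs]
    ys = [j for _, j in pairs]
    # Each x_i and each y_j may appear in at most one pair.
    if len(set(xs)) != len(xs) or len(set(ys)) != len(ys):
        return False
    # With the i's sorted, "no crossings" means the j's are increasing too.
    return all(ys[k] < ys[k + 1] for k in range(len(ys) - 1))


def matching_cost(pairs, x, y, delta, alpha):
    """delta per unmatched symbol plus alpha(x_i, y_j) per matched pair."""
    unmatched = (len(x) - len(pairs)) + (len(y) - len(pairs))
    return delta * unmatched + sum(alpha(x[i - 1], y[j - 1]) for i, j in pairs)


def unit(p, q):
    """Hypothetical mismatch penalty: 0 if equal, 1 otherwise."""
    return 0 if p == q else 1


M = [(2, 1), (3, 2), (4, 3), (5, 4), (6, 6)]  # one possible matching
print(is_alignment(M, 6, 6))
print(matching_cost(M, "CTACCG", "TACATG", 1, unit))
```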

Sequence Alignment: Problem Structure
In an optimal alignment $M$ of $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$, either $(x_m, y_n) \in M$ or $(x_m, y_n) \notin M$. That is, either the last two symbols in the two strings are matched to each other, or they aren't.
By itself, is this fact enough to provide us with a DP solution?
  – Let $M$ be an optimal alignment of $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$. If $(x_m, y_n) \notin M$, then either $x_m$ or $y_n$ is not matched in $M$.
  – Proof (DONE IN CLASS)

Sequence Alignment: Problem Structure
In an optimal alignment $M$ of $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$, at least one of the following is true:
  – $(x_m, y_n) \in M$; or
  – $x_m$ is not matched; or
  – $y_n$ is not matched
Def: $OPT(i, j)$ = min cost of aligning strings $x_1 x_2 \ldots x_i$ and $y_1 y_2 \ldots y_j$.
Write down the final recurrence (DONE IN CLASS)
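The recurrence itself is derived in class. As a hedged reconstruction, the three cases above, read together with the update step in the Sequence-Alignment pseudocode, suggest the following form, where $\delta$ is the gap penalty and $\alpha_{x_i y_j}$ is the mismatch penalty for aligning $x_i$ with $y_j$:

$$
OPT(i, j) =
\begin{cases}
j\delta & \text{if } i = 0 \\
i\delta & \text{if } j = 0 \\
\min\bigl\{\, \alpha_{x_i y_j} + OPT(i-1, j-1),\;\; \delta + OPT(i-1, j),\;\; \delta + OPT(i, j-1) \,\bigr\} & \text{otherwise}
\end{cases}
$$

The three terms inside the minimum correspond, in order, to matching $x_i$ with $y_j$, leaving $x_i$ unmatched, and leaving $y_j$ unmatched.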

Sequence Alignment: Algorithm
Analysis: $\Theta(mn)$ time and space
  – English words or sentences: $m, n \le 10$
  – Computational biology: $m = n = 100{,}000$; 10 billion ops OK, but 10 GB array?

Sequence-Alignment(m, n, x_1 x_2 ... x_m, y_1 y_2 ... y_n, δ, α) {
    for i = 0 to m
        Opt[i, 0] = i·δ
    for j = 0 to n
        Opt[0, j] = j·δ
    for i = 1 to m
        for j = 1 to n
            Opt[i, j] = min(α[x_i, y_j] + Opt[i-1, j-1],
                            δ + Opt[i-1, j],
                            δ + Opt[i, j-1])
    return Opt[m, n]
}
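A direct Python rendering of the pseudocode might look like the sketch below. It keeps the full (m+1) by (n+1) table, so it shares the $\Theta(mn)$ time and space noted above. The function name, the choice to pass the mismatch penalty as a callable, and the unit penalties in the usage lines are assumptions for illustration.

```python
def sequence_alignment(x, y, delta, alpha):
    """Minimum alignment cost of strings x and y.

    delta: gap penalty; alpha(p, q): penalty for aligning symbol p with q.
    """
    m, n = len(x), len(y)
    opt = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        opt[i][0] = i * delta      # align x[:i] with the empty string
    for j in range(n + 1):
        opt[0][j] = j * delta      # align the empty string with y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            opt[i][j] = min(
                alpha(x[i - 1], y[j - 1]) + opt[i - 1][j - 1],  # x_i matched with y_j
                delta + opt[i - 1][j],                          # x_i unmatched
                delta + opt[i][j - 1],                          # y_j unmatched
            )
    return opt[m][n]


def unit(p, q):
    """Hypothetical mismatch penalty: 0 if equal, 1 otherwise."""
    return 0 if p == q else 1


print(sequence_alignment("ocurrance", "occurrence", 1, unit))  # 2
print(sequence_alignment("CTACCG", "TACATG", 1, unit))         # 3
```

With a gap penalty of 1 and unit mismatch penalties, the result coincides with the classic Levenshtein edit distance; other penalty choices change which alignment is optimal.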