Sequence comparison: Local alignment

Slides:



Advertisements
Similar presentations
Alignment methods Introduction to global and local sequence alignment methods Global : Needleman-Wunch Local : Smith-Waterman Database Search BLAST FASTA.
Advertisements

BLAST, PSI-BLAST and position- specific scoring matrices Prof. William Stafford Noble Department of Genome Sciences Department of Computer Science and.
Sequence comparison: Introduction and motivation Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas.
Pairwise Sequence Alignment
Definitions Optimal alignment - one that exhibits the most correspondences. It is the alignment with the highest score. May or may not be biologically.
©CMBI 2005 Sequence Alignment In phylogeny one wants to line up residues that came from a common ancestor. For information transfer one wants to line up.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez June 23, 2005.
DNA Alignment. Dynamic Programming R. Bellman ~ 1950.
Reminder -Structure of a genome Human 3x10 9 bp Genome: ~30,000 genes ~200,000 exons ~23 Mb coding ~15 Mb noncoding pre-mRNA transcription splicing translation.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez June 23, 2004.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez May 20, 2003.
Developing Sequence Alignment Algorithms in C++ Dr. Nancy Warter-Perez May 21, 2002.
Dynamic Programming. Pairwise Alignment Needleman - Wunsch Global Alignment Smith - Waterman Local Alignment.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez May 10, 2005.
Alignment methods II April 24, 2007 Learning objectives- 1) Understand how Global alignment program works using the longest common subsequence method.
Sequence comparison: Score matrices Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas
Sequence comparison: Local alignment
TM Biological Sequence Comparison / Database Homology Searching Aoife McLysaght Summer Intern, Compaq Computer Corporation Ballybrit Business Park, Galway,
Sequence comparison: Local alignment Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Developing Pairwise Sequence Alignment Algorithms
Bioiformatics I Fall Dynamic programming algorithm: pairwise comparisons.
Sequence Analysis Alignments dot-plots scoring scheme Substitution matrices Search algorithms (BLAST)
Traceback and local alignment Prof. William Stafford Noble Department of Genome Sciences Department of Computer Science and Engineering University of Washington.
Pairwise & Multiple sequence alignments
Content of the previous class Introduction The evolutionary basis of sequence alignment The Modular Nature of proteins.
Pairwise Sequence Alignment. The most important class of bioinformatics tools – pairwise alignment of DNA and protein seqs. alignment 1alignment 2 Seq.
Pairwise Sequence Alignment BMI/CS 776 Mark Craven January 2002.
Lecture 6. Pairwise Local Alignment and Database Search Csc 487/687 Computing for bioinformatics.
Sequence comparison: Dynamic programming Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Biocomputation: Comparative Genomics Tanya Talkar Lolly Kruse Colleen O’Rourke.
Pairwise sequence alignment Lecture 02. Overview  Sequence comparison lies at the heart of bioinformatics analysis.  It is the first step towards structural.
Genome Revolution: COMPSCI 004G 8.1 BLAST l What is BLAST? What is it good for?  Basic.
Techniques for Protein Sequence Alignment and Database Searching G P S Raghava Scientist & Head Bioinformatics Centre, Institute of Microbial Technology,
Learning to Align: a Statistical Approach
Pairwise sequence comparison
INTRODUCTION TO BIOINFORMATICS
Sequence comparison: Dynamic programming
Welcome to Introduction to Bioinformatics
Sequence comparison: Local alignment
Conditionals (if-then-else)
While loops Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Sequence comparison: Significance of similarity scores
Sequence comparison: Traceback and local alignment
Motif p-values GENOME 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Sequence comparison: Multiple testing correction
GENOME 559: Introduction to statistical and computational genomics
Global, local, repeated and overlaping
Sequence Alignment 11/24/2018.
BNFO 236 Smith Waterman alignment
Sequence comparison: Dynamic programming
Pairwise sequence Alignment.
#7 Still more DP, Scoring Matrices
Pairwise Sequence Alignment
Lecture 14 Algorithm Analysis
Sequence alignment BI420 – Introduction to Bioinformatics
While loops Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Sequence comparison: Traceback
BCB 444/544 Lecture 7 #7_Sept5 Global vs Local Alignment
Find the Best Alignment For These Two Sequences
Pairwise Alignment Global & local alignment
Sequence Alignment Algorithms Morten Nielsen BioSys, DTU
While loops Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Sequence comparison: Significance of similarity scores
Dynamic Programming Finds the Best Score and the Corresponding Alignment O Alignment: Start in lower right corner and work backwards:
False discovery rate estimation
Global vs Local Alignment
Sequence comparison: Introduction and motivation
Sequence alignment, E-value & Extreme value distribution
Reconfigurable Computing (EN2911X, Fall07)
Presentation transcript:

Sequence comparison: Local alignment Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble Notes from 2009: This lecture is very light on new content, especially because I had basically explained traceback in the last class. On the other hand, few students complained.

Questions and concerns Would like more examples of DP matrices. I was wondering if there is a way to practice using the operations we were shown in class? Maybe online practice sets? Ideally, the homework will provide you with this opportunity. When do you want to generate string versus list? A string is similar to a tuple of characters, but strings have lots of string-specific operations defined on them. Could you clarify whether something input to a command line is a list or string? sys.argv is a list of strings. I didn’t realize we’d be learning things today that were needed for the homework. Maybe work one problem together before additional ones individually to learn how to approach the problem.

Could we have more clarification on the del operator? The del operator deletes one object from a list. The object to be deleted is specified by index. >>> myList = [1, 2, 3] >>> myList [1, 2, 3] >>> del myList[1] [1, 3] It is an error to delete beyond the end of the list. >>> del myList[3] Traceback (most recent call last):   File "<stdin>", line 1, in <module> IndexError: list assignment index out of range The remove operator looks for and removes a given object in the list. >>> myList.remove(3) >>> myList [1] It is an error to try to remove an object that is not there. >>> myList.remove(2) Traceback (most recent call last):   File "<stdin>", line 1, in <module> ValueError: list.remove(x): x not in list

Things folks liked I liked the longer time dedicated to Python for this lecture. Good examples of inputs and results. I appreciate you going back to the previous lecture material. Problems in class were useful. The sample questions today were helpful. I like the pace and being able to work on problems in class.

Local alignment A single-domain protein may be homologous to a region within a multi-domain protein. Usually, an alignment that spans the complete length of both sequences is not required.

BLAST allows local alignments Global alignment Local alignment

Global alignment DP Align sequence x and y. F is the DP matrix; s is the substitution matrix; d is the linear gap penalty.

Local alignment DP Align sequence x and y. F is the DP matrix; s is the substitution matrix; d is the linear gap penalty.

Local DP in equation form

A simple example Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G C

A simple example Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G C

A simple example Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G C 2 -5 -5

A simple example Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G 2 ? C

A simple example Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G 2 ? C

A simple example Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G 2 4 C

Local alignment Two differences with respect to global alignment: No score is negative. Traceback begins at the highest score in the matrix and continues until you reach 0. Global alignment algorithm: Needleman-Wunsch. Local alignment algorithm: Smith-Waterman.

A simple example Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G 2 4 C AG

Local alignment Find the optimal local alignment of AAG and GAAGGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G C

Local alignment Find the optimal local alignment of AAG and GAAGGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G 2 4 6 C

Local alignment AAG A G 2 4 6 C Find the optimal local alignment of AAG and GAAGGC. Use a gap penalty of d=-5. A C G T 2 -7 -5 A G 2 4 6 C AAG

Summary Local alignment finds the best match between subsequences. Smith-Waterman local alignment algorithm: No score is negative. Trace back from the largest score in the matrix.