Sequence comparison: Local alignment Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.

Slides:



Advertisements
Similar presentations
Sequence comparison: Dynamic programming Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas.
Advertisements

Sequence comparison: Significance of similarity scores Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Alignment methods Introduction to global and local sequence alignment methods Global : Needleman-Wunch Local : Smith-Waterman Database Search BLAST FASTA.
Sequence allignement 1 Chitta Baral. Sequences and Sequence allignment Two main kind of sequences –Sequence of base pairs in DNA molecules (A+T+C+G)*
BLAST Sequence alignment, E-value & Extreme value distribution.
BLAST, PSI-BLAST and position- specific scoring matrices Prof. William Stafford Noble Department of Genome Sciences Department of Computer Science and.
Sequence comparison: Introduction and motivation Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas.
Lecture 8 Alignment of pairs of sequence Local and global alignment
Pairwise Sequence Alignment
Definitions Optimal alignment - one that exhibits the most correspondences. It is the alignment with the highest score. May or may not be biologically.
6/11/2015 © Bud Mishra, 2001 L7-1 Lecture #7: Local Alignment Computational Biology Lecture #7: Local Alignment Bud Mishra Professor of Computer Science.
Sequence Similarity Searching Class 4 March 2010.
©CMBI 2005 Sequence Alignment In phylogeny one wants to line up residues that came from a common ancestor. For information transfer one wants to line up.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez June 23, 2005.
DNA Alignment. Dynamic Programming R. Bellman ~ 1950.
C T C G T A GTCTGTCT Find the Best Alignment For These Two Sequences Score: Match = 1 Mismatch = 0 Gap = -1.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez June 23, 2004.
Sequence Alignment Oct 9, 2002 Joon Lee Genomics & Computational Biology.
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez May 20, 2003.
Developing Sequence Alignment Algorithms in C++ Dr. Nancy Warter-Perez May 21, 2002.
Dynamic Programming. Pairwise Alignment Needleman - Wunsch Global Alignment Smith - Waterman Local Alignment.
Protein Sequence Comparison Patrice Koehl
Developing Pairwise Sequence Alignment Algorithms Dr. Nancy Warter-Perez May 10, 2005.
Sequence alignment, E-value & Extreme value distribution
Alignment methods II April 24, 2007 Learning objectives- 1) Understand how Global alignment program works using the longest common subsequence method.
Sequence comparison: Score matrices Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas
Sequence comparison: Local alignment
TM Biological Sequence Comparison / Database Homology Searching Aoife McLysaght Summer Intern, Compaq Computer Corporation Ballybrit Business Park, Galway,
Developing Pairwise Sequence Alignment Algorithms
Presented by Liu Qi Pairwise Sequence Alignment. Presented By Liu Qi Why align sequences? Functional predictions based on identifying homologues. Assumes:
Bioiformatics I Fall Dynamic programming algorithm: pairwise comparisons.
Traceback and local alignment Prof. William Stafford Noble Department of Genome Sciences Department of Computer Science and Engineering University of Washington.
Pairwise & Multiple sequence alignments
Content of the previous class Introduction The evolutionary basis of sequence alignment The Modular Nature of proteins.
Pairwise Sequence Alignment. The most important class of bioinformatics tools – pairwise alignment of DNA and protein seqs. alignment 1alignment 2 Seq.
Pairwise Sequence Alignment BMI/CS 776 Mark Craven January 2002.
Lecture 6. Pairwise Local Alignment and Database Search Csc 487/687 Computing for bioinformatics.
BLAST Anders Gorm Pedersen & Rasmus Wernersson. Database searching Using pairwise alignments to search databases for similar sequences Database Query.
Sequence comparison: Dynamic programming Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Function preserves sequences Christophe Roos - MediCel ltd Similarity is a tool in understanding the information in a sequence.
Biocomputation: Comparative Genomics Tanya Talkar Lolly Kruse Colleen O’Rourke.
Pairwise Sequence Alignment Part 2. Outline Summary Local and Global alignments FASTA and BLAST algorithms Evaluating significance of alignments Alignment.
Pairwise sequence alignment Lecture 02. Overview  Sequence comparison lies at the heart of bioinformatics analysis.  It is the first step towards structural.
Sequence Alignment.
Doug Raiford Phage class: introduction to sequence databases.
Genome Revolution: COMPSCI 004G 8.1 BLAST l What is BLAST? What is it good for?  Basic.
Introduction to Sequence Alignment. Why Align Sequences? Find homology within the same species Find clues to gene function Practical issues in experiments.
Pairwise sequence comparison
Sequence comparison: Dynamic programming
Welcome to Introduction to Bioinformatics
Sequence comparison: Local alignment
Sequence comparison: Significance of similarity scores
Sequence comparison: Traceback and local alignment
Motif p-values GENOME 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Sequence comparison: Multiple testing correction
Global, local, repeated and overlaping
Sequence comparison: Dynamic programming
Pairwise sequence Alignment.
#7 Still more DP, Scoring Matrices
Pairwise Sequence Alignment
While loops Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Sequence comparison: Local alignment
Sequence comparison: Traceback
BCB 444/544 Lecture 7 #7_Sept5 Global vs Local Alignment
Sequence comparison: Multiple testing correction
Sequence Alignment Algorithms Morten Nielsen BioSys, DTU
Sequence comparison: Significance of similarity scores
False discovery rate estimation
Sequence alignment, E-value & Extreme value distribution
Presentation transcript:

Sequence comparison: Local alignment Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble

One-minute responses It would be helpful to somehow get the solutions for the sample problems in our lecture printouts. –You can see the solutions by visiting the class web page and opening the slides there. I am still a little confused about the difference between strings and lists. –A string is like a tuple of characters. Unlike a string a list is (1) mutable and (2) may contain other objects besides characters. The Biotechniques paper went into a lot of detail -- how much of this should we understand? –I intend the paper to provide background for those who are interested. You should be sure you understand just what I go over in lecture. I am slightly worried because I never seem to do things in the most straightforward way. –This just takes practice. Often, there is no single best way.

One-minute responses There was perhaps a bit too much programming in this class. There was more class time for Python, which was nice. I really liked the sample problem times. Problem set is very reasonable. The examples and practice are most useful teaching methods for me at least. I am getting comfortable with the code through practice. I like the sample problems. In the last few classes I felt rushed to finish them, but this time I was able to do all 3. It's very satisfying when they work. I had somewhat more difficulty with today's exercises. I think it was due to the inherent complexity of adding new types to the repertoire. Class moved at a good speed today. I enjoyed the pace today. Today's pace was good. The pace was good -- it was helpful for me to have more time for problems. Good pace. Programming problems were a good speed today. The biostats portion was a little fast but manageable.

One-minute responses The cheat sheet really helped. I really liked the list of operations and methods on the back of the lecture notes. Lists of commands in slides were helpful. Reviewing the DP matrix was very helpful. I'm glad we reviewed the Needleman- Wunsch algorithm. The traceback review helped me realize I'd forgotten how to do it.

Local alignment A single-domain protein may be homologous to a region within a multi-domain protein. Usually, an alignment that spans the complete length of both sequences is not required.

BLAST allows local alignments Global alignment Local alignment

Global alignment DP Align sequence x and y. F is the DP matrix; s is the substitution matrix; d is the linear gap penalty.

Local alignment DP Align sequence x and y. F is the DP matrix; s is the substitution matrix; d is the linear gap penalty.

Local DP in equation form 0

A simple example ACGT A C 2 -5 G -72 T AAG A G C Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. 0

A simple example ACGT A C 2 -5 G -72 T AAG 0000 A0 G0 C0 Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. 0

A simple example ACGT A C 2 -5 G -72 T AAG 0000 A0 G0 C0 Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=

A simple example ACGT A C 2 -5 G -72 T AAG 0000 A02 G0? C0? Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. 0

A simple example ACGT A C 2 -5 G -72 T AAG 0000 A02?? G00?? C00?? Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. 0

A simple example ACGT A C 2 -5 G -72 T AAG 0000 A0220 G0004 C0000 Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. 0

Local alignment Two differences with respect to global alignment: –No score is negative. –Traceback begins at the highest score in the matrix and continues until you reach 0. Global alignment algorithm: Needleman- Wunsch. Local alignment algorithm: Smith- Waterman.

A simple example ACGT A C 2 -5 G -72 T AAG 0000 A0220 G0004 C0000 Find the optimal local alignment of AAG and AGC. Use a gap penalty of d=-5. 0 AG

Local alignment ACGT A C 2 -5 G -72 T AAG 0000 G0 A0 A0 G0 G0 C0 Find the optimal local alignment of AAG and GAAGGC. Use a gap penalty of d=-5. 0

Local alignment ACGT A C 2 -5 G -72 T AAG 0000 G0002 A0220 A0240 G0006 G0002 C0000 Find the optimal local alignment of AAG and GAAGGC. Use a gap penalty of d=-5. 0

Local alignment ACGT A C 2 -5 G -72 T AAG 0000 G0002 A0220 A0240 G0006 G0002 C0000 Find the optimal local alignment of AAG and GAAGGC. Use a gap penalty of d=-5. 0 AAG

Summary Local alignment finds the best match between subsequences. Smith-Waterman local alignment algorithm: –No score is negative. –Trace back from the largest score in the matrix.