. Class 5: Multiple Sequence Alignment. Multiple sequence alignment VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG.

Slides:



Advertisements
Similar presentations
Multiple Alignment Anders Gorm Pedersen Molecular Evolution Group
Advertisements

Multiple alignment: heuristics. Consider aligning the following 4 protein sequences S1 = AQPILLLV S2 = ALRLL S3 = AKILLL S4 = CPPVLILV Next consider the.
BNFO 602 Multiple sequence alignment Usman Roshan.
1 “INTRODUCTION TO BIOINFORMATICS” “SPRING 2005” “Dr. N AYDIN” Lecture 4 Multiple Sequence Alignment Doç. Dr. Nizamettin AYDIN
Sequence analysis lecture 6 Sequence analysis course Lecture 6 Multiple sequence alignment 2 of 3 Multiple alignment methods.
1 Multiple sequence alignment Lesson 4. 2 VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG LSLTCTVSGTSFDD--YYSTWVRQPPG.
Multiple sequence alignment Conserved blocks are recognized Different degrees of similarity are marked.
Multiple Sequence Alignment Algorithms in Computational Biology Spring 2006 Most of the slides were created by Dan Geiger and Ydo Wexler and edited by.
Multiple alignment June 29, 2007 Learning objectives- Review sequence alignment answer and answer questions you may have. Understand how the E value may.
11 Ch6 multiple sequence alignment methods 1 Biologists produce high quality multiple sequence alignment by hand using knowledge of protein sequence evolution.
What you should know by now Concepts: Pairwise alignment Global, semi-global and local alignment Dynamic programming Sequence similarity (Sum-of-Pairs)
Identifying functional residues of proteins from sequence info Using MSA (multiple sequence alignment) - search for remote homologs using HMMs or profiles.
Multiple Alignment. Outline Problem definition Can we use Dynamic Programming to solve MSA? Progressive Alignment ClustalW Scoring Multiple Alignments.
Performance Optimization of Clustal W: Parallel Clustal W, HT Clustal and MULTICLUSTAL Arunesh Mishra CMSC 838 Presentation Authors : Dmitri Mikhailov,
Multiple alignment: heuristics
Multiple sequence alignment
BNFO 602 Multiple sequence alignment Usman Roshan.
Multiple sequence alignment Conserved blocks are recognized Different degrees of similarity are marked.
Practical multiple sequence algorithms Sushmita Roy BMI/CS 576 Sushmita Roy Sep 23rd, 2014.
Multiple Sequence Alignment
Multiple Sequence Alignments
Multiple sequence alignment methods 1 Corné Hoogendoorn Denis Miretskiy.
CECS Introduction to Bioinformatics University of Louisville Spring 2003 Dr. Eric Rouchka Lecture 3: Multiple Sequence Alignment Eric C. Rouchka,
C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Lecture 6 – 16/11/06 Multiple sequence alignment 1 Sequence analysis 2006 Multiple.
CISC667, F05, Lec8, Liao CISC 667 Intro to Bioinformatics (Fall 2005) Multiple Sequence Alignment Scoring Dynamic Programming algorithms Heuristic algorithms.
Introduction to Bioinformatics From Pairwise to Multiple Alignment.
MCB 5472 Lecture #6: Sequence alignment March 27, 2014.
Chapter 5 Multiple Sequence Alignment.
Multiple Sequence Alignment BMI/CS 576 Colin Dewey Fall 2010.
Multiple Sequence Alignment CSC391/691 Bioinformatics Spring 2004 Fetrow/Burg/Miller (Slides by J. Burg)
Multiple Alignment Modified from Tolga Can’s lecture notes (METU)
Multiple Sequence Alignment
Multiple Sequence Alignments and Phylogeny.  Within a protein sequence, some regions will be more conserved than others. As more conserved,
Practical multiple sequence algorithms Sushmita Roy BMI/CS 576 Sushmita Roy Sep 24th, 2013.
Multiple Sequence Alignment May 12, 2009 Announcements Quiz #2 return (average 30) Hand in homework #7 Learning objectives-Understand ClustalW Homework#8-Due.
ZORRO : A masking program for incorporating Alignment Accuracy in Phylogenetic Inference Sourav Chatterji Martin Wu.
Phylogenetic Analysis. General comments on phylogenetics Phylogenetics is the branch of biology that deals with evolutionary relatedness Uses some measure.
Eric C. Rouchka, University of Louisville SATCHMO: sequence alignment and tree construction using hidden Markov models Edgar, R.C. and Sjolander, K. Bioinformatics.
Eidhammer et al. Protein Bioinformatics Chapter 4 1 Multiple Global Sequence Alignment and Phylogenetic trees Inge Jonassen and Ingvar Eidhammer.
Phylogenetic Trees Tutorial 5. Agenda How to construct a tree using Neighbor Joining algorithm Phylogeny.fr tool Cool story of the day: Horizontal gene.
Multiple Sequence Alignment
Building phylogenetic trees. Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances  UPGMA method (+ an example)
Bioinformatics Multiple Alignment. Overview Introduction Multiple Alignments Global multiple alignment –Introduction –Scoring –Algorithms.
Multiple alignment: Feng- Doolittle algorithm. Why multiple alignments? Alignment of more than two sequences Usually gives better information about conserved.
Multiple Sequence Alignment. How to score a MSA? Very commonly: Sum of Pairs = SP Compute the pairwise score of all pairs of sequences and sum them. Gap.
Multiple sequence alignment Dr Alexei Drummond Department of Computer Science Semester 2, 2006.
1 Lecture 8 Chapter 6 Multiple Sequence Alignment Methods.
Multiple sequence comparison (MSC) Reading: Setubal/Meidanis, 3.4 Gusfield, Algorithms on Strings, Trees and Sequences, chapter 14.
Multiple Sequence Alignment Colin Dewey BMI/CS 576 Fall 2015.
Tutorial 5 Phylogenetic Trees.
Sequence Alignment Abhishek Niroula Department of Experimental Medical Science Lund University
Multiple Sequence Alignment
V diagonal lines give equivalent residues ILS TRIVHVNSILPSTN V I L S T R I V I L P E F S T Sequence A Sequence B Dot Plots, Path Matrices, Score Matrices.
V diagonal lines give equivalent residues ILS TRIVHVNSILPSTN V I L S T R I V I L P E F S T Sequence A Sequence B Dot Plots, Path Matrices, Score Matrices.
Multiple Sequence Alignment (cont.) (Lecture for CS397-CXZ Algorithms in Bioinformatics) Feb. 13, 2004 ChengXiang Zhai Department of Computer Science University.
More on HMMs and Multiple Sequence Alignment BMI/CS 776 Mark Craven March 2002.
1 Multiple Sequence Alignment and Molecular Evolution.
BIOINFORMATICS Ayesha M. Khan Spring Lec-6.
Substitution Matrices and Alignment Statistics BMI/CS 776 Mark Craven February 2002.
Multiple Sequence Alignment Dr. Urmila Kulkarni-Kale Bioinformatics Centre University of Pune
Chapter 6. Multiple sequence alignment methods
Multiple sequence alignment (msa)
The ideal approach is simultaneous alignment and tree estimation.
Overview of Multiple Sequence Alignment Algorithms
A Hybrid Algorithm for Multiple DNA Sequence Alignment
Multiple Sequence Alignment
Multiple Sequence Alignment (I)
Introduction to Bioinformatics
Computational Genomics Lecture #3a
Presentation transcript:

. Class 5: Multiple Sequence Alignment

Multiple sequence alignment VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG LSLTCTVSGTSFDD--YYSTWVRQPPG PEVTCVVVDVSHEDPQVKFNWYVDG-- ATLVCLISDFYPGA--VTVAWKADS-- AALGCLVKDYFPEP--VTVSWNSG--- VSLTCLVKGFYPSD--IAVEWESNG-- Homologous residues are aligned together in columns  Homologous - in the structural and evolutionary sense Ideally, a column of aligned residues occupy similar 3d structural positions

Multiple alignment – why? u Identify sequence that belongs to a family  Family – a collection of homologous, with similar sequence, 3d structure, function or evolutionary history u Find features that are conserved in the whole family  Highly conserved regions, core structural elements

The relation between the divergence of sequence and structure [Durbin p. 137, redrawn from data in Chothia and Lesk (1986)]

Scoring a multiple alignment (1) Important features of multiple alignment: uSuSome positions are more conserved than others  Position specific scoring uSuSequences are not independent (related by phylogenetic tree) Ideally, specify a complete model of molecular sequence evolution

Scoring a multiple alignment (2) Unfortunately, not enough data … Assumption (1) Columns of alignment are statistically independent.

Minimum entropy Assumption (2) Symbols within columns are independent Entropy measure

Sum of pairs (SP) Columns are scored by a “sum of pairs” function, using a substitution scoring matrix Note:

Multidimensional DP

Complexity Space: Time:

Pairwise projections of MA

MSA (i) [Carrillo and Lipman, 1988]

MSA (ii)

MSA (iii) Algorithm sketch

Progressive alignment methods (i) Basic idea: construct a succession of PW alignments Variatoins: u PW alignment order u One growing alignment or subfamilies u Alignment and scoring procedure

Progressive alignment methods (ii) Most important heuristic – align the most similar pairs first. Many algorithms build a “guide tree”: u Leaves – sequence u Interior nodes – alignments u Root – complete multiple alignment

Feng-Doolittle (1987) u Calculate all pairwise distances using alignment scores: u Construct a guide tree using hierarchical clustering u Highest scoring pairwise alignment determines sequence to group alignment

Profile alignment u Use profiles for group to sequence and group to group alignments  CLUSTALW (Thompson et al., 1994):  Similar to Feng-Doolittle, but uses profile alignment methods  Numerous heuristics

Iterative Refinement u Addresses “frozen” sub-alignment problem u Iteratively realign sequences or groups to a profile of the rest u Barton and Sternberg (1987)  Align two most similar sequences  Align current profile to most similar sequence  Remove each sequence and align it to profile