
1 Class 5: Multiple Sequence Alignment

2 Multiple sequence alignment
VTISCTGSSSNIGAG-NHVKWYQQLPG
VTISCTGTSSNIGS--ITVNWYQQLPG
LRLSCSSSGFIFSS--YAMYWVRQAPG
LSLTCTVSGTSFDD--YYSTWVRQPPG
PEVTCVVVDVSHEDPQVKFNWYVDG--
ATLVCLISDFYPGA--VTVAWKADS--
AALGCLVKDYFPEP--VTVSWNSG---
VSLTCLVKGFYPSD--IAVEWESNG--
Homologous residues are aligned together in columns.
  - Homologous: in the structural and evolutionary sense
Ideally, a column of aligned residues occupies similar 3D structural positions.

3 Multiple alignment – why?
- Identify sequences that belong to a family
  - Family: a collection of homologous sequences with similar sequence, 3D structure, function or evolutionary history
- Find features that are conserved in the whole family
  - Highly conserved regions, core structural elements

4 The relation between the divergence of sequence and structure [Durbin p. 137, redrawn from data in Chothia and Lesk (1986)]

5 Scoring a multiple alignment (1)
Important features of multiple alignment:
- Some positions are more conserved than others
  - Position-specific scoring
- Sequences are not independent (related by a phylogenetic tree)
Ideally, specify a complete model of molecular sequence evolution.

6 Scoring a multiple alignment (2)
Unfortunately, not enough data …
Assumption (1): columns of the alignment are statistically independent.

7 Minimum entropy
Assumption (2): symbols within columns are independent.
Entropy measure
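
The entropy measure can be sketched in a few lines. This assumes the common formulation in which a column with residue counts c_a and frequencies p_a = c_a / sum(c) scores -sum_a c_a log p_a, so a fully conserved column scores 0 and lower is better. The function names, and the choice to count the gap character as an ordinary symbol, are illustrative assumptions rather than anything stated on the slide.

```python
import math
from collections import Counter

def column_entropy(column):
    """Entropy score of one alignment column: -sum_a c_a * log(p_a)."""
    counts = Counter(column)              # c_a for every symbol in the column
    total = sum(counts.values())
    return -sum(c * math.log(c / total) for c in counts.values())

def alignment_entropy(rows):
    """Total score: sum over columns, relying on column independence."""
    return sum(column_entropy(col) for col in zip(*rows))
```

For example, `column_entropy("AAAA")` is 0, while a column split evenly between two residues, such as `"AACC"`, scores 4 log 2.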

8 Sum of pairs (SP)
Columns are scored by a “sum of pairs” function, using a substitution scoring matrix.
Note:
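
A minimal sketch of SP scoring, assuming the usual definition in which a column's score is the sum of s(a, b) over all pairs of rows in that column. The `toy_score` function (match +1, mismatch -1, residue against gap -2, gap against gap 0) is a hypothetical stand-in for a real substitution matrix such as BLOSUM or PAM, and its values are made up for illustration.

```python
from itertools import combinations

def toy_score(a, b):
    """Hypothetical substitution score; replace with a real matrix lookup."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return -2
    return 1 if a == b else -1

def sp_column(column, score=toy_score):
    """Sum of pairwise scores over all pairs of symbols in one column."""
    return sum(score(a, b) for a, b in combinations(column, 2))

def sp_alignment(rows, score=toy_score):
    """SP score of a whole alignment: sum of the column scores."""
    return sum(sp_column(col, score) for col in zip(*rows))
```

With these toy values, `sp_column("AAA")` is 3 and `sp_column("AAC")` is -1: a single mismatching row is penalized once for every other row in the column.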

9 Multidimensional DP
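
The idea can be sketched as a direct generalization of pairwise dynamic programming: one table cell per tuple of prefix lengths, and one candidate move per non-empty subset of sequences that contribute a residue to the next column. The sketch below assumes SP column scoring with a hypothetical `toy_score` (made-up values, linear gap cost) and returns only the optimal score, not the alignment itself. The table has one cell per combination of prefix lengths and each cell examines up to 2^N - 1 predecessors, which is why this approach is practical only for a handful of short sequences.

```python
from itertools import combinations, product

def toy_score(a, b):
    """Hypothetical substitution score; replace with a real matrix lookup."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return -2
    return 1 if a == b else -1

def msa_score(seqs, score=toy_score):
    """Optimal SP score of a global multiple alignment by full N-dimensional DP."""
    n = len(seqs)
    # Each move says which sequences emit a residue (1) or a gap (0).
    moves = [m for m in product((0, 1), repeat=n) if any(m)]
    best = {(0,) * n: 0}
    # Lexicographic order guarantees every predecessor is filled in first.
    for cell in product(*(range(len(s) + 1) for s in seqs)):
        if cell in best:          # the origin
            continue
        candidates = []
        for move in moves:
            prev = tuple(i - d for i, d in zip(cell, move))
            if min(prev) < 0:     # move would step outside the table
                continue
            column = [s[i - 1] if d else "-" for s, i, d in zip(seqs, cell, move)]
            col_score = sum(score(a, b) for a, b in combinations(column, 2))
            candidates.append(best[prev] + col_score)
        best[cell] = max(candidates)
    return best[tuple(len(s) for s in seqs)]
```

Storing the best move alongside each score would allow a traceback to recover the alignment; that is omitted here to keep the sketch short.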

10

11 Complexity Space: Time:

12 Pairwise projections of MA

13 MSA (i) [Carrillo and Lipman, 1988]

14 MSA (ii)

15 MSA (iii) Algorithm sketch

16 Progressive alignment methods (i)
Basic idea: construct a succession of pairwise (PW) alignments.
Variations:
- PW alignment order
- One growing alignment or subfamilies
- Alignment and scoring procedure

17 Progressive alignment methods (ii)
Most important heuristic: align the most similar pairs first.
Many algorithms build a “guide tree”:
- Leaves: sequences
- Interior nodes: alignments
- Root: complete multiple alignment

18 Feng-Doolittle (1987)
- Calculate all pairwise distances using alignment scores:
- Construct a guide tree using hierarchical clustering
- Highest scoring pairwise alignment determines sequence-to-group alignment
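
The first two steps can be sketched as follows. The distance assumes the formulation usually given for Feng-Doolittle, D = -log S_eff with S_eff = (S_obs - S_rand) / (S_max - S_rand), where S_obs is the observed pairwise alignment score, S_max the average of the two self-alignment scores, and S_rand the expected score for shuffled sequences of the same length and composition. The tree builder uses plain average-linkage clustering as a simple stand-in; it is not claimed to be the clustering method of the original paper. All function names and the `dist` dictionary layout are illustrative assumptions.

```python
import math
from itertools import combinations

def fd_distance(s_obs, s_max, s_rand):
    """Distance from alignment scores: -log of the normalized score."""
    s_eff = (s_obs - s_rand) / (s_max - s_rand)
    return -math.log(s_eff)   # raises ValueError if s_obs <= s_rand

def guide_tree(names, dist):
    """Build a nested-tuple guide tree by average-linkage clustering.

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    """
    clusters = [(name, [name]) for name in names]   # (subtree, leaf names)

    def linkage(x, y):
        pairs = [(a, b) for a in x[1] for b in y[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters: most similar pairs first.
        x, y = min(combinations(clusters, 2), key=lambda xy: linkage(*xy))
        clusters = [c for c in clusters if c is not x and c is not y]
        clusters.append(((x[0], y[0]), x[1] + y[1]))
    return clusters[0][0]
```

The nested tuples mirror the guide tree: leaves are sequence names, and each inner tuple marks a point where two sequences or groups would be aligned.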

19 Profile alignment
- Use profiles for group-to-sequence and group-to-group alignments
  - CLUSTALW (Thompson et al., 1994):
  - Similar to Feng-Doolittle, but uses profile alignment methods
  - Numerous heuristics
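
One simple way to picture a profile is as a list of per-column symbol frequencies, with two profile columns scored by the frequency-weighted average of pairwise substitution scores. The sketch below assumes that representation; it is not CLUSTALW's actual scoring, which adds sequence weighting and position-specific gap penalties. `toy_score` and the function names are hypothetical.

```python
from collections import Counter

def toy_score(a, b):
    """Hypothetical substitution score; replace with a real matrix lookup."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return -2
    return 1 if a == b else -1

def build_profile(rows):
    """Per-column symbol frequencies of an existing alignment."""
    profile = []
    for col in zip(*rows):
        counts = Counter(col)
        profile.append({sym: c / len(col) for sym, c in counts.items()})
    return profile

def profile_column_score(col_x, col_y, score=toy_score):
    """Average pairwise score between two profile columns."""
    return sum(px * py * score(a, b)
               for a, px in col_x.items()
               for b, py in col_y.items())
```

A single sequence is the special case of a profile whose columns each hold one symbol with frequency 1, so the same column score covers sequence-to-group and group-to-group alignment.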

20 Iterative refinement
- Addresses the “frozen” sub-alignment problem
- Iteratively realign sequences or groups to a profile of the rest
- Barton and Sternberg (1987)
  - Align the two most similar sequences
  - Align the current profile to the most similar sequence
  - Remove each sequence and realign it to the profile

