Aligning Sequences With T-Coffee Cédric Notredame Comparative Bioinformatics Group Bioinformatics and Genomics Program.


1 Aligning Sequences With T-Coffee Cédric Notredame Comparative Bioinformatics Group Bioinformatics and Genomics Program

2 T-Coffee and Consistency…
SeqA GARFIELD THE LAST FAT CAT
SeqB GARFIELD THE FAST CAT
SeqC GARFIELD THE VERY FAST CAT
SeqD THE FAT CAT

SeqA GARFIELD THE LAST FA-T CAT
SeqB GARFIELD THE FAST CA-T ---
SeqC GARFIELD THE VERY FAST CAT
SeqD -------- THE ---- FA-T CAT

3 Consistency: Conflicts and Information [Diagram: alignments among residues W, X, Y and Z, contrasting a "Non Consistent" case with a "Consistent" case]

4 T-Coffee and Consistency…
SeqA GARFIELD THE LAST FAT CAT     Prim. Weight = 88
SeqB GARFIELD THE FAST CAT ---

SeqA GARFIELD THE LAST FA-T CAT    Prim. Weight = 77
SeqC GARFIELD THE VERY FAST CAT

SeqA GARFIELD THE LAST FAT CAT     Prim. Weight = 100
SeqD -------- THE ---- FAT CAT

SeqB GARFIELD THE ---- FAST CAT    Prim. Weight = 100
SeqC GARFIELD THE VERY FAST CAT

SeqC GARFIELD THE VERY FAST CAT    Prim. Weight = 100
SeqD -------- THE ---- FA-T CAT
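A minimal sketch of how the primary weights on this slide could be derived, assuming the weight is the percent identity over columns where both sequences carry a residue, truncated to an integer. The function name `primary_weight` is hypothetical, and the spaces between words are treated as formatting only. Under these assumptions the sketch gives 88 for SeqA/SeqB and 100 for SeqA/SeqD; other values on the slide may depend on details that the transcript does not preserve.

```python
def primary_weight(row_a: str, row_b: str) -> int:
    """Percent identity over columns where both rows hold a residue (assumed definition)."""
    a = row_a.replace(" ", "")  # word spacing is layout, not alignment
    b = row_b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    # Keep only columns without a gap in either sequence.
    ungapped = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not ungapped:
        return 0
    identical = sum(1 for x, y in ungapped if x == y)
    return 100 * identical // len(ungapped)  # integer truncation


weight_ab = primary_weight("GARFIELD THE LAST FAT CAT", "GARFIELD THE FAST CAT ---")
weight_ad = primary_weight("GARFIELD THE LAST FAT CAT", "-------- THE ---- FAT CAT")
print(weight_ab, weight_ad)  # 88 100
```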

5 T-Coffee and Consistency…
SeqA GARFIELD THE LAST FAT CAT     Prim. Weight = 88
SeqB GARFIELD THE FAST CAT ---

SeqA GARFIELD THE LAST FA-T CAT    Prim. Weight = 77
SeqC GARFIELD THE VERY FAST CAT

SeqA GARFIELD THE LAST FAT CAT     Prim. Weight = 100
SeqD -------- THE ---- FAT CAT

SeqB GARFIELD THE ---- FAST CAT    Prim. Weight = 100
SeqC GARFIELD THE VERY FAST CAT

SeqC GARFIELD THE VERY FAST CAT    Prim. Weight = 100
SeqD -------- THE ---- FA-T CAT

SeqA GARFIELD THE LAST FAT CAT     Weight = 88
SeqB GARFIELD THE FAST CAT ---

SeqA GARFIELD THE LAST FA-T CAT    Weight = 77
SeqC GARFIELD THE VERY FAST CAT
SeqB GARFIELD THE ---- FAST CAT

SeqA GARFIELD THE LAST FA-T CAT    Weight = 100
SeqD -------- THE ---- FA-T CAT
SeqB GARFIELD THE ---- FAST CAT

6 T-Coffee and Consistency…
SeqA GARFIELD THE LAST FAT CAT     Weight = 88
SeqB GARFIELD THE FAST CAT ---

SeqA GARFIELD THE LAST FA-T CAT    Weight = 77
SeqC GARFIELD THE VERY FAST CAT
SeqB GARFIELD THE ---- FAST CAT

SeqA GARFIELD THE LAST FA-T CAT    Weight = 100
SeqD -------- THE ---- FA-T CAT
SeqB GARFIELD THE ---- FAST CAT
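The weights shown for SeqA/SeqB through SeqC (77) and through SeqD (100) are consistent with taking the minimum of the two primary weights along each path. The sketch below illustrates that reading. The names `triplet_weight` and `extended_weight` are hypothetical, the SeqB/SeqD primary weight of 100 is an assumption (slide 4 does not list it), and summing the paths only applies to a residue pair that every path supports.

```python
from typing import Dict, Tuple


def triplet_weight(w_x_z: int, w_z_y: int) -> int:
    """Weight of aligning X and Y through Z: the weaker of the two links (assumed rule)."""
    return min(w_x_z, w_z_y)


def extended_weight(direct: int, via: Dict[str, Tuple[int, int]]) -> int:
    """Direct weight plus every triplet contribution, for a pair supported on all paths."""
    return direct + sum(triplet_weight(a, b) for a, b in via.values())


via_c = triplet_weight(77, 100)   # SeqA/SeqC = 77, SeqB/SeqC = 100
via_d = triplet_weight(100, 100)  # SeqA/SeqD = 100, SeqB/SeqD assumed to be 100
total = extended_weight(88, {"SeqC": (77, 100), "SeqD": (100, 100)})
print(via_c, via_d, total)  # 77 100 265
```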

7 T-Coffee and Consistency…

8

9

10 Methods Data Scalability

11 Running T-Coffee over the Web

12 Available Servers and Flavors

13 Which MSA Method ???

14 Combining Many MSAs into ONE [Diagram: MUSCLE, MAFFT, ClustalW and ??????? combined by T-Coffee]
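One way to picture combining several MSAs is to count, for every pair of residues, how many of the input alignments place them in the same column. The sketch below does only that. It is an illustration under simplifying assumptions, not the weighting M-Coffee actually uses, and the two toy alignments stand in for the output of different aligners. The names `aligned_pairs` and `support_library` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(msa):
    """Return the residue pairs that one MSA places in the same column.

    A residue is identified as (sequence name, index in the ungapped sequence).
    """
    names = sorted(msa)
    next_index = {name: 0 for name in names}
    pairs = set()
    length = len(msa[names[0]])
    for col in range(length):
        present = []
        for name in names:
            if msa[name][col] != "-":
                present.append((name, next_index[name]))
                next_index[name] += 1
        pairs.update(combinations(present, 2))
    return pairs


def support_library(msas):
    """Count how many input alignments support each residue pair."""
    support = Counter()
    for msa in msas:
        support.update(aligned_pairs(msa))
    return support


# Toy alignments of the same two sequences, standing in for two different aligners.
msa_1 = {"s1": "AC-GT", "s2": "ACTGT"}
msa_2 = {"s1": "ACG-T", "s2": "ACTGT"}
library = support_library([msa_1, msa_2])
for pair, count in sorted(library.items()):
    print(pair, count)
```

Pairs supported by both alignments get a count of 2, while the conflicting placements of the G in `s1` each get a count of 1.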

15 Consistency and Accuracy

16 What To Do Without Structures

17 Using the M-Coffee Server

18

19

20 Integrating New Types of Data Template Based Sequence Alignments

21 [Diagram labels: TARGET, Experimental Data, Templates, Template Aligner, Template Alignment, Template-Sequence Alignment, Template based Alignment of the Sequences, Primary Library]
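The template-based idea can be sketched as a composition of mappings: each target sequence is mapped onto its template, the templates are aligned with each other, and the result is projected back onto the targets. The code below is a minimal sketch under that assumption. The function name `project_template_alignment` and all residue indices are made up for illustration.

```python
def project_template_alignment(a_to_template, b_to_template, template_pairs):
    """Project a template/template alignment back onto two target sequences.

    a_to_template:  residue index in target A -> residue index in template A
    b_to_template:  residue index in target B -> residue index in template B
    template_pairs: residue index in template A -> residue index in template B
    """
    template_b_to_target = {tmpl: res for res, tmpl in b_to_template.items()}
    projected = []
    for res_a, tmpl_a in sorted(a_to_template.items()):
        tmpl_b = template_pairs.get(tmpl_a)
        if tmpl_b is None:
            continue  # template A residue is unaligned in the template alignment
        res_b = template_b_to_target.get(tmpl_b)
        if res_b is not None:
            projected.append((res_a, res_b))
    return projected


# Hypothetical mappings, for illustration only.
a_to_ta = {0: 5, 1: 6, 2: 7}
b_to_tb = {0: 10, 1: 11, 2: 13}
ta_to_tb = {5: 10, 6: 12, 7: 13}
pairs = project_template_alignment(a_to_ta, b_to_tb, ta_to_tb)
print(pairs)  # [(0, 0), (2, 2)]
```

The projected pairs would then play the role of a primary library entry for the two target sequences.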

22 Exploring The Template World
Template            Generator       Alignment Method
RNA Structure       Prediction      RNA Aligner
Protein Structure   BLAST vs PDB    3D Aligner
Profile             BLAST vs NR     Profile/Profile Alignment
Gene Structure      ENSEMBL         Genome Aligner
Promoter            Transfac        Meta-Aligner

23 Exploring The Template World
Template            Generator       Alignment Method   Mode
RNA Structure       Prediction      RNA Aligner        R-Coffee
Protein Structure   BLAST/PDB       3D Aligner         3D-Coffee
Profile             BLAST/NR        Profile/Profile    PSI-Coffee
Gene Structure      ENSEMBL         Genome Aligner     Exoset
Promoter            Transfac        Meta-Aligner       Meta-Coffee

24 3D-Coffee/Expresso Incorporating Structural Information

25 Expresso: Finding the Right Structure [Diagram labels: Sources, BLAST, Templates, Source Template Alignment, SAP, Template Alignment, Remove Templates, Library]

26 PSI-Coffee Homology Extension

27 Exploring The Template World

28 What is Homology Extension ? [Figure: isolated L residues with an uncertain (?) match]
- Simple scoring schemes result in alignment ambiguities

29 What is Homology Extension ? [Figure: columns of L, I and V residues grouped into Profile 1 and Profile 2]

30 What is Homology Extension ? [Figure: columns of L, I and V residues grouped into Profile 1 and Profile 2]
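The ambiguity that profiles resolve can be shown with a small sketch. A single L matches any other L equally well, but once each position is replaced by the residue frequencies of its homologs, candidate columns score differently. The code assumes a simple identity scoring (1 for identical residues, 0 otherwise). Real profile aligners typically use substitution matrices and sequence weighting. The function names and column contents are hypothetical.

```python
from collections import Counter


def column_profile(residues):
    """Residue frequencies of one alignment column."""
    counts = Counter(residues)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def profile_column_score(col_1, col_2, match=1.0, mismatch=0.0):
    """Expected score between two columns under identity scoring (an assumption)."""
    return sum(
        p * q * (match if a == b else mismatch)
        for a, p in col_1.items()
        for b, q in col_2.items()
    )


query = column_profile("LLLL")           # position conserved as L among homologs
candidate_same = column_profile("LLLL")  # another conserved L position
candidate_mixed = column_profile("IVIL") # position where L is only one option
score_same = profile_column_score(query, candidate_same)
score_mixed = profile_column_score(query, candidate_mixed)
print(score_same, score_mixed)  # 1.0 0.25
```

With single sequences both candidates would show an L and score the same. With profiles the conserved column is clearly preferred.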

31 PSI-Coffee: Homology Extension [Diagram labels: Sources, BLAST, Templates, Source Template Alignment, Profile Aligner, Template Alignment, Remove Templates, Library]

32 Benchmarks

33 Do Benchmarks All Tell the same story? Based on

34
Method                      Template   Score   Comment
ClustalW-2   Progressive    NO         22.74
PRANK        Gap            NO         26.18   Science 2008
MAFFT        Iterative      NO         26.18
Muscle       Iterative      NO         31.37
ProbCons     Consistency    NO         40.80
ProbCons     MonoPhasic     NO         37.53
T-Coffee     Consistency    NO         42.30
M-Coffe4     Consistency    NO         43.60
PSI-Coffee   Consistency    Profile    53.71
PROMAL       Consistency    Profile    55.08
PROMAL-3D    Consistency    PDB        57.60
3D-Coffee    Consistency    PDB        61.00   Expresso
Score: fraction of correct columns when compared with a structure based reference (BB11 of BaliBase).

35 [Same benchmark table as slide 34] Consistency

36 [Same benchmark table as slide 34] Homology Extension

37 [Same benchmark table as slide 34] Structural Extension
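The score in the benchmark table is described as the fraction of correct columns relative to a structure based reference. A minimal sketch of such a column score follows, assuming that a column counts as correct only when the test alignment contains a column with exactly the same residues (and the same gaps). Benchmark suites may restrict scoring to selected regions, which this sketch does not attempt. The function names and the toy alignments are hypothetical.

```python
def columns(msa):
    """Describe each column by the residue index it holds per sequence (None for a gap)."""
    names = sorted(msa)
    next_index = {name: 0 for name in names}
    cols = []
    for i in range(len(msa[names[0]])):
        col = []
        for name in names:
            if msa[name][i] == "-":
                col.append(None)
            else:
                col.append(next_index[name])
                next_index[name] += 1
        cols.append(tuple(col))
    return cols


def column_score(test, reference):
    """Fraction of reference columns reproduced exactly in the test alignment."""
    reference_cols = columns(reference)
    test_cols = set(columns(test))
    correct = sum(1 for col in reference_cols if col in test_cols)
    return correct / len(reference_cols)


# Toy alignments, for illustration only.
reference = {"s1": "AC-GT", "s2": "ACTGT"}
test = {"s1": "ACG-T", "s2": "ACTGT"}
score = column_score(test, reference)
print(score)  # 0.6
```

Multiplying the fraction by 100 gives values on the same scale as the table.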

38 T-Coffee and The World [Diagram labels: Users sequences, BLAST/SOAP]
- Some Templates are obtained with a BLAST
- Queries can be sent to the EBI or the NCBI
- No Need for a Local BLAST installation

