Phylogeny Workshop. By Eyal Privman, The Bioinformatics Unit, G.S. Wise Faculty of Life Science, Tel Aviv University, Israel. November 2009

1 Phylogeny Workshop
By Eyal Privman, The Bioinformatics Unit, G.S. Wise Faculty of Life Science, Tel Aviv University, Israel
November 2009
http://ibis.tau.ac.il/twiki/bin/view/Bioinformatics/Phylogeny2009

2 Why should we care about phylogeny?
"Nothing in biology makes sense except in the light of evolution" (Theodosius Dobzhansky, 1973)

3 Alignment and phylogeny are mutually dependent
[Diagram labels: Unaligned sequences; Sequence alignment; MSA; Phylogeny reconstruction; Inaccurate tree building]

4 Alignment and phylogeny are both challenging
25% of residues are aligned incorrectly
Based on BAliBASE: a large representative set of proteins

5 Alignment and phylogeny are both challenging
5% of tree branches are wrong
Based on simulations of 100 protein sequences

6 Multiple sequence alignment (MSA): progressive alignment
[Diagram labels: sequences A, B, C, D, E; Pairwise distance table; Guide tree (leaves A, D, C, B, E); MSA; Iterative]
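The first two stages named on this slide (pairwise distance table, then guide tree) can be sketched in a few lines. This is only a toy illustration under stated assumptions: the sequences are short and already of equal length, the distance is the simple fraction of differing positions, and the guide tree is built with UPGMA. The slide does not say which distance or clustering method a given MSA program uses, and real programs estimate distances from unaligned sequences. The function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("toy example assumes equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_table(seqs):
    """Pairwise distance table, keyed by frozenset({name1, name2})."""
    return {frozenset((m, n)): p_distance(seqs[m], seqs[n])
            for m, n in combinations(seqs, 2)}


def upgma(names, dist):
    """Build a guide tree (Newick topology, no branch lengths) by UPGMA.

    Assumption: UPGMA stands in for whatever clustering a real program uses.
    """
    clusters = {n: (n, 1) for n in names}  # id -> (Newick string, leaf count)
    d = dict(dist)
    while len(clusters) > 1:
        # Closest pair; ties are broken alphabetically so the result is stable.
        pair = min(d, key=lambda p: (d[p], sorted(p)))
        a, b = sorted(pair)
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        new = f"({tree_a},{tree_b})"
        # Distance from the merged cluster = size-weighted average.
        for c in clusters:
            d_ac = d.pop(frozenset((a, c)))
            d_bc = d.pop(frozenset((b, c)))
            d[frozenset((new, c))] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        del d[pair]
        clusters[new] = (new, size_a + size_b)
    (tree, _), = clusters.values()
    return tree + ";"


if __name__ == "__main__":
    toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TCGAACGA", "D": "TCGAACCA"}
    print(upgma(list(toy), distance_table(toy)))
```

In progressive alignment the resulting guide tree only fixes the order in which sequences are merged into the MSA; it is not meant to be the final phylogeny.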

7 Multiple sequence alignment (MSA)
Several advanced MSA programs are available. Today we will use two:
MAFFT – fastest and one of the most accurate
PRANK – distinct from all other MSA programs because of its correct treatment of insertions/deletions

8 MAFFT
Web server & download: http://align.bmr.kyushu-u.ac.jp/mafft/online/server/
Efficiency-tuned variants → quick & dirty or slow but accurate
Kazutaka Katoh, Kazuharu Misawa, Kei-ichi Kuma and Takashi Miyata. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 2002, Vol. 30, No. 14, 3059-3066.

9 Choosing a MAFFT strategy
[Figure: MAFFT strategies on a scale from quick & dirty to slow but accurate]

12 Choosing a MAFFT strategy

L-INS-i
ooooooooooooooooooooooooooooooooXXXXXXXXXXX-XXXXXXXXXXXXXXX------------------
--------------------------------XX-XXXXXXXXXXXXXXX-XXXXXXXXooooooooooo-------
------------------ooooooooooooooXXXXX----XXXXXXXX---XXXXXXXooooooooooo-------
--------ooooooooooooooooooooooooXXXXX-XXXXXXXXXX----XXXXXXXoooooooooooooooooo
--------------------------------XXXXXXXXXXXXXXXX----XXXXXXX------------------

G-INS-i
XXXXXXXXXXX-XXXXXXXXXXXXXXX
XX-XXXXXXXXXXXXXXX-XXXXXXXX
XXXXX----XXXXXXXX---XXXXXXX
XXXXX-XXXXXXXXXX----XXXXXXX
XXXXXXXXXXXXXXXX----XXXXXXX

E-INS-i
oooooooooXXX------XXXX---------------------------------XXXXXXXXXXX-XXXXXXXXXXXXXXXooooooooooooo
---------XXXXXXXXXXXXXooo------------------------------XXXXXXXXXXXXXXXXXX-XXXXXXXX-------------
-----ooooXXXXXX---XXXXooooooooooo----------------------XXXXX----XXXXXXXXXXXXXXXXXXooooooooooooo
---------XXXXX----XXXXoooooooooooooooooooooooooooooooooXXXXX-XXXXXXXXXXXX--XXXXXXX-------------
---------XXXXX----XXXX---------------------------------XXXXX---XXXXXXXXXX--XXXXXXXooooo--------

[Scale: quick & dirty to slow but accurate]
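For those running a local MAFFT installation instead of the web server, the three strategies named on this slide correspond to command-line options. The sketch below assumes an executable named `mafft` is on the PATH and uses the option combinations that the MAFFT manual documents for these strategies; the helper names `mafft_command` and `run_mafft` are hypothetical, and the options should be checked against the installed version.

```python
import shutil
import subprocess

# Strategy name -> MAFFT options (as documented in the MAFFT manual;
# verify against your installed version).
MAFFT_STRATEGIES = {
    "auto": ["--auto"],                                   # let MAFFT choose
    "L-INS-i": ["--localpair", "--maxiterate", "1000"],
    "G-INS-i": ["--globalpair", "--maxiterate", "1000"],
    "E-INS-i": ["--genafpair", "--maxiterate", "1000"],
}


def mafft_command(strategy, fasta_path):
    """Return the argument list for one MAFFT run (nothing is executed)."""
    if strategy not in MAFFT_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    return ["mafft", *MAFFT_STRATEGIES[strategy], fasta_path]


def run_mafft(strategy, fasta_path, out_path):
    """Run MAFFT and save the alignment, which MAFFT writes to stdout."""
    if shutil.which("mafft") is None:
        raise RuntimeError("mafft executable not found on PATH")
    with open(out_path, "w") as out:
        subprocess.run(mafft_command(strategy, fasta_path), stdout=out, check=True)


if __name__ == "__main__":
    print(" ".join(mafft_command("L-INS-i", "fahA.fas")))
```

Building the argument list separately from running it makes it easy to inspect the exact command before committing to one of the slow but accurate strategies.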

13 MAFFT output
Saving the output:
Choose a format: Clustal, Fasta, or click "Reformat" to convert to a selection of other formats
Save page as a text file
A colored view of the alignment

14 PRANK

15 Classical alignment errors for HIV env

16 PRANK
Web server: http://www.ebi.ac.uk/goldman-srv/webPRANK/

17 PRANK output
If you need a different format, copy the results to the READSEQ sequence converter: http://www-bimas.cit.nih.gov/molbio/readseq/

18 Downloadable PRANK
http://www.ebi.ac.uk/goldman-srv/prank/prank/
– PRANK: a command-line program
– PRANKSTER: a program with a graphical user interface

19
1. Download and unzip the sequence files from my homepage (Google "Eyal Privman" and look for the workshop materials under "Teaching"). Open "fahA.fas" in Notepad: these are 65 protein sequences in FASTA format.
2. Run PRANKSTER, open the "fahA.fas" file, and run "Alignment" → "Make alignment".
3. While you wait: copy the sequences into the MAFFT web server and run the "automatic" "moderate" strategy. Which strategy did MAFFT choose for you? Click "Reformat", choose "phylip|phylip4", and save as "fahA.mafft.phylip".
4. When PRANKSTER finishes, click File → Save and save the MSA in Phylip format under the name "fahA.prank.phylip".
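Before submitting a FASTA file to an alignment program, it can help to confirm that it parses cleanly and contains the expected number of sequences. The reader below is a minimal sketch, not part of the workshop materials: it assumes plain FASTA text in which each record starts with a `>` header line, and it keeps only the first word of the header as the sequence name. The function name `read_fasta` is hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {name: sequence}.

    Assumptions: headers start with ">", the name is the first
    whitespace-delimited word of the header, and names are unique.
    """
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            words = line[1:].split()
            if not words:
                raise ValueError("empty FASTA header")
            name = words[0]
            if name in records:
                raise ValueError(f"duplicate sequence name: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[name].append(line)  # sequences may span several lines
    return {n: "".join(parts) for n, parts in records.items()}


if __name__ == "__main__":
    example = ">seqA first protein\nMKV\nLA\n\n>seqB\nMKIALA\n"
    for seq_name, seq in read_fasta(example).items():
        print(seq_name, len(seq))
```

Applied to the contents of "fahA.fas", `len(read_fasta(...))` should report 65 if the file matches the description in the exercise.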

20 Phylogeny reconstruction
Different approaches (algorithms / programs):
– Distance-based methods (e.g. neighbor-joining, as in ClustalW) → fast but inaccurate
– Maximum parsimony (e.g. MEGA)
– Maximum likelihood methods (e.g. PhyML, RAxML) → accurate but slower
– Bayesian methods (e.g. MrBayes) → most accurate but very slow
[Diagram labels: sequences A, B, C, D, E; Pairwise distance table; Guide tree (leaves A, D, C, B, E); MSA]

21 PhyML
The most widely used maximum likelihood (ML) program
Web server & download: http://www.atgc-montpellier.fr/phyml/
Accepts input MSA in PHYLIP format only, interleaved or sequential
[Figure: examples of the interleaved and sequential layouts]
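The difference between the two PHYLIP layouts is easiest to see by writing the same alignment both ways. The function below is a sketch under assumptions: it follows the classic PHYLIP convention of a header line with the number of sequences and the alignment length, followed by names padded to 10 characters. Some programs accept longer names, so the documentation of the target program should be consulted. The name `to_phylip` is hypothetical.

```python
def to_phylip(seqs, interleaved=False, width=60):
    """Format an alignment ({name: aligned sequence}) as PHYLIP text.

    Assumption: classic PHYLIP layout with names padded/truncated to 10 columns.
    """
    names = list(seqs)
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    nchar = lengths.pop()
    labels = [n[:10].ljust(10) for n in names]
    if len(set(labels)) != len(labels):
        raise ValueError("names are not unique after truncation to 10 characters")

    lines = [f" {len(names)} {nchar}"]
    if not interleaved:
        # Sequential: each sequence is written in full, one after another.
        for label, n in zip(labels, names):
            lines.append(label + seqs[n])
    else:
        # Interleaved: blocks of `width` columns; names only in the first block.
        for start in range(0, nchar, width):
            if start:
                lines.append("")  # blank line between blocks
            for label, n in zip(labels, names):
                prefix = label if start == 0 else ""
                lines.append(prefix + seqs[n][start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    aln = {"seq1": "MKV-LA", "seq2": "MKIALA"}
    print(to_phylip(aln))
    print(to_phylip(aln, interleaved=True, width=3))
```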

22 Downloadable PhyML
Less user-friendly, but allows using local computer power
– Run "phyml.bat"
– Drag the file from Windows Explorer to the blue window
– Enter "d" to switch from DNA to AA
– Enter "y" to run

23
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the PhyML web server (don't forget to choose "Amino-acids" and enter your email)
2. Run it with the local installation of "phyml.bat"
You should end up with a file: "fahA.prank.phylip_phyml_tree.txt"
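The local run can also be scripted rather than driven through the interactive menu. The sketch below assumes a PhyML 3.x executable named `phyml` on the PATH and uses its `-i` (input alignment) and `-d` (data type, `aa` or `nt`) options; both the executable name and the helper names are assumptions for illustration. The output naming follows the file name given in the exercise above and may differ between PhyML versions.

```python
import os
import shutil
import subprocess


def phyml_command(alignment_path, datatype="aa"):
    """Argument list for a non-interactive PhyML run (nothing is executed)."""
    if datatype not in ("aa", "nt"):
        raise ValueError("datatype must be 'aa' (protein) or 'nt' (DNA)")
    return ["phyml", "-i", alignment_path, "-d", datatype]


def expected_tree_file(alignment_path):
    """Tree file name as given in the exercise: input name + '_phyml_tree.txt'."""
    return alignment_path + "_phyml_tree.txt"


def run_phyml(alignment_path, datatype="aa"):
    """Run PhyML and return the path of the tree file it should have written."""
    if shutil.which("phyml") is None:
        raise RuntimeError("phyml executable not found on PATH")
    subprocess.run(phyml_command(alignment_path, datatype), check=True)
    tree_path = expected_tree_file(alignment_path)
    if not os.path.exists(tree_path):
        raise FileNotFoundError(f"expected output not found: {tree_path}")
    return tree_path


if __name__ == "__main__":
    print(" ".join(phyml_command("fahA.prank.phylip")))
    print(expected_tree_file("fahA.prank.phylip"))
```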

24 RAxML
Web server: http://phylobench.vital-it.ch/raxml-bb/
Similar maximum likelihood (ML) methodology to PhyML, but much faster
→ Faster results
→ Better results in the same run-time

25 Downloadable RAxML
A command-line program: http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm (on that page you will also find instructions for running on Windows, and the RAxML manual)
easyRAx takes care of some of the RAxML options for you, but installation is somewhat more complex: http://projects.exeter.ac.uk/ceem/easyRAx.html

26
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the RAxML web server (don't forget to tick "Protein sequences" and enter your email)
2. Save the resulting tree file as: "fahA.prank.phylip.raxml"

27 FigTree: tree visualization and figure creation
– Manipulate a node
– Manipulate a clade
– Manipulate a taxon

28
1. Open "fahA.prank.phylip_phyml_tree.txt" in FigTree
2. Play around with the different options and make a pretty figure!
   1. Find out how to color specific clades, as below
   2. Try each of the three options under "Layout"
   3. Export a figure in PDF format (File → Export Graphic…)
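Before opening a tree file in a viewer, a quick check that it contains the expected taxa can catch truncated or mislabeled output. The sketch below assumes the tree is a single Newick string with unquoted labels (labels containing spaces, parentheses, or quotes are not handled); the function name `leaf_names` is hypothetical.

```python
import re


def leaf_names(newick):
    """Return leaf labels of a Newick tree in the order they appear.

    A leaf label directly follows "(" or ","; branch lengths follow ":" and
    internal labels or support values follow ")", so neither is matched.
    Assumption: labels are unquoted and contain no whitespace.
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)


if __name__ == "__main__":
    tree = "((A:0.1,B:0.2)0.95:0.3,(C:0.1,D:0.1):0.2);"
    print(leaf_names(tree))
```

Comparing the returned list against the sequence names in the alignment is a simple way to confirm that every taxon made it into the tree.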

29 Thanks for your attention and happy phylogeny…

