MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics.


1 MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

2 Back to TaxPlot
Please hand in your final summary of your TaxPlot data.
Peg will look over it, prepare a final summary of what the class data look like, and send it to you by email.

3 Back to Alignments
Your final alignments are due in class today.
–Please make sure that Peg or Michelle has checked them and given you an okay
–Hand printouts to Peg or Michelle for their records

4 On to Tree Building
Today you will get your last tool: you will learn how to create phylogenetic trees.
We will start with an introduction to tree-building and end with some lab exercises to give you practice building trees.
Your homework for next Tuesday is to create and print out a tree for each of your proteins. You will make 6 trees in total, and each tree will have 6 taxa (3 from one species and 3 from the second species).

5 Creating protein-based phylogenies
The last step in our analysis is to create molecular phylogenies from each of our alignments.
–We will use these data, combined with the similarity matrices for each protein, to determine whether all of our proteins have evolved in a similar, vertical fashion
To begin, you need to learn a bit about phylogenetics.

6 Phylogenetic inference is premised on
–The inheritance of ancestral characters
–The existence of an evolutionary history defined by changes in these characters
This results in a tree-like model of evolution
–Except when there is paralogy or lateral transfer
Olsen, 2006

7 Character evolution
Heritable changes in features (morphology, gene sequences, etc.) provide the basis for inferring phylogenies.
–Such changes are usually described as changes in the states of characters (presence or absence, nucleotide at a specific site, etc.)
–Their utility depends on how often the changes that produce different character states occur independently (homoplasy)

8 Unique & Un-reversed Characters
Given a heritable evolutionary change that is unique and un-reversed (e.g. the origin of hair), the presence of the novelty in any taxon must be due to inheritance from the ancestor.
–Similarly, absence in any taxon must be because the taxon is not a descendant of that ancestor
The novelty will be a homology acting as a marker for the descendants of the ancestor.
–The taxa with the novelty will be a clade (e.g. Mammalia)

9 Hair
Because hair evolved only once and is un-reversed, it is homologous and provides unambiguous evidence for the clade Mammalia.
[Figure: tree of Lizard, Frog, Human, and Dog, marking the change in state of the character Hair from absent to present.]

10 Homoplasy
Homoplasy is similarity that is not homologous, i.e. not due to inheritance from a common ancestor.
It is the result of independent evolution (convergence, parallelism, reversal).
Homoplasy can provide misleading evidence of phylogenetic relationships.

11 Homoplasy: A Fundamental Problem with Phylogenetic Inference
If there were no homoplastic similarities, inferring phylogenies would be easy: all the pieces of the jigsaw would fit together neatly.

12 Incongruence or Incompatibility
[Figure: two trees of Lizard, Frog, Human, and Dog, one supported by the character Hair and the other by the character Tail.]
These trees and characters are incongruent.
The two trees cannot both be correct, and at least one character must be homoplastic.
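The incongruence between Hair and Tail can be checked mechanically. The sketch below uses the standard pairwise compatibility test for two-state characters: two such characters fit the same tree without homoplasy only if at most three of the four possible state combinations occur among the taxa. The 0/1 scoring of Tail (present in Lizard and Dog, absent in Frog and Human) is an assumption made for illustration, since the slide does not list the states.

```python
def compatible(char_a, char_b):
    """Pairwise compatibility test for two binary characters.

    Each character maps taxon name -> state (0 or 1). The characters can
    both evolve on one tree without homoplasy only if fewer than four
    distinct state combinations are observed across the shared taxa.
    """
    taxa = char_a.keys() & char_b.keys()
    combinations_seen = {(char_a[t], char_b[t]) for t in taxa}
    return len(combinations_seen) < 4


# Hair: absent (0) in Lizard and Frog, present (1) in Human and Dog.
hair = {"Lizard": 0, "Frog": 0, "Human": 1, "Dog": 1}
# Tail: assumed scoring for illustration (not given on the slide).
tail = {"Lizard": 1, "Frog": 0, "Human": 0, "Dog": 1}

print(compatible(hair, tail))  # False: at least one character is homoplastic
```

The test only says that the two characters conflict. Deciding which one is homoplastic takes additional characters, which is the point of the congruence argument.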

13 Distinguishing Homology and Homoplasy
Morphologists use a variety of techniques to distinguish homoplasy and homology.
–Homologous characters are expected to display detailed similarity in position, structure and development
–Homoplastic similarities are more likely to be superficial
–As recognized by Darwin, congruence with other characters provides the most compelling evidence for homology

14 The Importance of Congruence “The importance for classification of trifling characters, mainly depends on their being correlated with several other characters of more or less importance. The value indeed of an aggregate of characters is very evident…..a classification founded on any single character, however important that may be, has always failed.” Charles Darwin, Origin of Species

15 Homoplasy & Molecular Data
Incongruence, and therefore homoplasy, can be common in molecular data.
–One reason is that characters have a limited number of alternative states (e.g. A, G, C, T)
–In addition, a given state is chemically identical wherever it occurs (an A is an A), so homologous and homoplastic similarities look the same and cannot be distinguished through detailed study of structure or development

16 Homology & Homoplasy
Each nucleotide position can be considered homologous.
–Although in some taxa the position is not present because of an insertion or deletion
Example: The phylogeny of the four taxa is known.
–The two trees illustrate character state homology (for character 2) and homoplasy (for character 4)
–Note that the sequence alignment includes characters in which a nucleotide is missing because of an insertion or deletion mutation

17 Homology & Homoplasy
This example illustrates that definitions of homology will be contingent on the choice of taxa being compared.
–For instance, if species C were not included, character 6 would not exist
Examples of homologous characters
–In addition to nucleotides and amino acids, many other character states can be considered homologous
–For instance, the presence or absence of an intron, a transposable element, an insertion or deletion

18 Phylogenetic trees
A phylogenetic tree is a statement about the evolutionary relationship between a set of homologous characters of one or several organisms.
It is composed of lines called branches that intersect and terminate at nodes.
–The nodes at the tips of the branches represent taxa or, in the case of sequence data, the sequences that exist today
–The internal nodes represent ancestral taxa, whose properties we can only infer from existing taxa

19 Unrooted trees
For 4 taxa there are only three possible unrooted trees.
[Figure: the three unrooted trees for taxa A, B, C, and D.]
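One way to see why there are exactly three: an unrooted tree of four taxa is fully described by which taxon sits on the same side of the internal branch as the first taxon. The sketch below enumerates those pairings. The function name is illustrative only.

```python
def quartet_topologies(taxa):
    """List the unrooted bifurcating trees for exactly four taxa.

    Each tree is written as a split: the pair of taxa on one side of the
    internal branch and the pair on the other side.
    """
    if len(taxa) != 4:
        raise ValueError("this enumeration only handles four taxa")
    first, rest = taxa[0], taxa[1:]
    splits = []
    for partner in rest:
        others = tuple(t for t in rest if t != partner)
        splits.append(((first, partner), others))
    return splits


for left, right in quartet_topologies(["A", "B", "C", "D"]):
    print(f"{left[0]}{left[1]} | {right[0]}{right[1]}")
# AB | CD
# AC | BD
# AD | BC
```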

20 Rooted tree
A tree is rooted if there is a particular node, the root, from which a unique directional path leads to each extant taxon.
In this tree, the root is the only internal node from which all other nodes can be reached by moving forward, toward the tips.
The root is the common ancestor of all the taxa in the tree.
[Figure: rooted tree with tips A, B, C, D, E, root R, and internal nodes X, Y, Z.]

21 Rooted trees
Once a root is identified, 5 different rooted trees can be created for each of the three unrooted trees, each with a distinctive branching pattern reflecting a different evolutionary history for the relationships shown in the unrooted trees. Here are a few.
[Figure: three example rooted trees for taxa A, B, C, and D.]
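The counts on these slides follow from a general rule for strictly bifurcating trees: n taxa give (2n-5)!! unrooted trees, and because an unrooted tree has 2n-3 branches on which a root can be placed, (2n-3)!! rooted trees. The short sketch below reproduces the 3 and 3 x 5 = 15 for four taxa, and shows how quickly the numbers grow for the 6-taxon trees in the homework. Function names are illustrative.

```python
def double_factorial(k):
    """Product k * (k-2) * (k-4) * ... down to 1 (returns 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def n_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa >= 3."""
    return double_factorial(2 * n_taxa - 5)


def n_rooted_trees(n_taxa):
    """Number of rooted bifurcating trees for n_taxa >= 2."""
    return double_factorial(2 * n_taxa - 3)


for n in (4, 6):
    print(n, n_unrooted_trees(n), n_rooted_trees(n))
# 4 3 15
# 6 105 945
```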

22 Phylogenetic tree
This is a rooted tree whose branch tips represent 5 taxa (A-E).
The numbers on the branches indicate changes in a sequence that occurred along that branch.
–e.g. between X and Y 3 changes occurred, and between Y and D 1 change occurred
This tree is additive because the distance between any two nodes equals the sum of the lengths of all the branches between them.
A node is bifurcating if it has only two immediate descendant lineages.
[Figure: rooted tree with tips A, B, C, D, E, internal nodes R (root), X, Y, Z, and branch lengths 1, 2, 1, 1, 2, 1, 7, 3.]
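Additivity can be illustrated with a small path-length calculation. In the sketch below, only the X-Y length (3) and the Y-D length (1) come from the slide. The rest of the topology and the other branch lengths are assumptions chosen for illustration, because the figure itself is not recoverable from the transcript.

```python
# child -> (parent, branch length). Only X-Y = 3 and Y-D = 1 are taken
# from the slide; everything else is a hypothetical arrangement.
PARENT = {
    "X": ("R", 1),
    "E": ("R", 7),
    "Y": ("X", 3),
    "Z": ("X", 2),
    "D": ("Y", 1),
    "C": ("Y", 1),
    "A": ("Z", 2),
    "B": ("Z", 1),
}


def distances_to_ancestors(node):
    """Map each ancestor of node (and node itself) to its path length from node."""
    dist = {node: 0}
    total = 0
    while node in PARENT:
        parent, length = PARENT[node]
        total += length
        dist[parent] = total
        node = parent
    return dist


def distance(a, b):
    """Sum of branch lengths on the path between nodes a and b."""
    up_a = distances_to_ancestors(a)
    up_b = distances_to_ancestors(b)
    # The path turns around at the most recent common ancestor, which is the
    # shared ancestor giving the smallest combined length.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


print(distance("X", "D"))  # 4, i.e. X-Y (3) plus Y-D (1)
```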

23

24 Different ways to view phylogenetic trees

25 Assessing Phylogenetic Hypotheses
We use numerical phylogenetic methods because most data include potentially misleading evidence of relationships.
Thus, we need to assess the confidence we can place in our hypotheses.
This is not always simple.

26 Reliability tests Reliability refers to the probability that members of a clade will be part of the true tree

27 Bootstrapping
Phylogenetic bootstrapping allows us to generate a series of pseudo-samples, which we can use to estimate sampling variance.
–Characters are randomly resampled (with replacement) from the original data to generate pseudoreplicate data matrices identical in size to the original matrix
–These replicates are subjected to the same phylogenetic searches as the original dataset
Bootstrap support for a group of interest is calculated as the proportion of replicates in which the group is obtained.
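The resampling step can be sketched in a few lines. In the example below, the alignment is a made-up toy dataset, and `ab_grouped` is a deliberately crude stand-in for a real tree search: it simply asks whether A and B are closer to each other than either is to C or D. Both are assumptions for illustration, and a real analysis would rebuild a tree from every replicate.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the original size."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_support(alignment, has_group, n_reps=100, seed=0):
    """Proportion of pseudoreplicates in which has_group(replicate) is True."""
    rng = random.Random(seed)
    hits = sum(has_group(bootstrap_replicate(alignment, rng)) for _ in range(n_reps))
    return hits / n_reps


def differences(s, t):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(s, t))


def ab_grouped(aln):
    """Toy stand-in for a tree search: are A and B each other's closest relatives?"""
    within = differences(aln["A"], aln["B"])
    between = min(differences(aln[x], aln[y]) for x in "AB" for y in "CD")
    return within < between


# Hypothetical toy alignment, not class data.
toy = {
    "A": "MLTTRYKLLA",
    "B": "MLTTRYKLLG",
    "C": "MATSRYRLLA",
    "D": "MATSRFRLLA",
}

print(bootstrap_support(toy, ab_grouped, n_reps=200, seed=1))
```

Fixing the seed makes the run repeatable. The support value itself will change with the seed and the number of replicates.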

28

29 Exercise 1
You will use the tree output from CLUSTALW as your “first pass” phylogenetic reconstruction tool.
–CLUSTALW is not designed to produce publishable trees, but it does produce a reasonable representation of the relationships inferred from the multiple alignment it created
We will use the cladograms and phylograms created in CLUSTALW to visualize the relationships implied by our multiple alignments.

30 Exercise 1, cont.
Go into CLUSTALW and redo your final alignments for each protein.
–This should be simple to do, as you will be using the data files you have already created
–Each final alignment will include the protein sequences from 3 genomes from each of your 2 species (6 sequences)
–You should end up with a total of 6 final alignments
Once in CLUSTALW, run the alignment program.
–In the output, go to the bottom of the page, where the phylogram or cladogram is provided
–Print out the phylogram for each protein, and be sure the branch lengths are included (see the toggle switch to include or exclude branch lengths)

31 Protein A
CLUSTAL 2.0.5 multiple sequence alignment
sp1gen1  MLTTRYKLLLAAN 13
sp1gen3  MATTRYKLLLAAA 13
sp1gen2  MLTTRYKLLLAAA 13
sp2gen2  MLTTRAKLLLRRA 13
sp2gen3  MLTTRALLLLRRA 13
sp2gen1  MLTTRYKLLLRRA 13
         * ***  ***
Phylogram (tip label, branch length):
Sp1gen1 0.08
Sp1gen3 0.08
Sp2gen1 0.0
Sp2gen2 0.0
Sp2gen3 0.08
Sp1gen2 0.0
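To connect the alignment to the numbers on the tree, the sketch below takes the six Protein A sequences from this slide and computes two things: the line of `*` marks under fully conserved columns, and a simple pairwise p-distance (the fraction of positions that differ). This is a minimal illustration and not the calculation CLUSTALW itself performs, and it assumes an alignment without gaps.

```python
from itertools import combinations

# Protein A sequences as shown on the slide.
ALIGNMENT = {
    "sp1gen1": "MLTTRYKLLLAAN",
    "sp1gen3": "MATTRYKLLLAAA",
    "sp1gen2": "MLTTRYKLLLAAA",
    "sp2gen2": "MLTTRAKLLLRRA",
    "sp2gen3": "MLTTRALLLLRRA",
    "sp2gen1": "MLTTRYKLLLRRA",
}


def conserved_line(alignment):
    """Return '*' for each column where every sequence has the same residue."""
    return "".join(
        "*" if len(set(column)) == 1 else " "
        for column in zip(*alignment.values())
    )


def p_distance(s, t):
    """Proportion of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(s, t)) / len(s)


print(repr(conserved_line(ALIGNMENT)))
for a, b in combinations(ALIGNMENT, 2):
    print(f"{a} {b} {p_distance(ALIGNMENT[a], ALIGNMENT[b]):.2f}")
```

A single unique substitution in a 13-residue sequence is 1/13, or about 0.08, which matches the 0.08 branch lengths shown for the tips that carry one unique change.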

32 Protein B
CLUSTAL 2.0.5 multiple sequence alignment
sp1gen1  MAAASSSRHLYN 12
sp1gen3  MAAASSSRHLYN 12
sp1gen2  MAASSSSRHLYN 12
sp2gen1  MAAASSSRHLYY 12
sp2gen2  MAAASSSRHLLL 12
sp2gen3  MATASSSRHLLL 12
         **::******
Phylogram (tip label, branch length):
Sp1gen1 0.0
Sp1gen3 0.0
Sp2gen1 0.4
Sp2gen2 0.0
Sp2gen3 0.08
Sp1gen2 0.08

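33 [Figure: the Protein B and Protein A phylograms shown side by side.]
Protein B (tip label, branch length): Sp1gen1 0.0, Sp1gen3 0.0, Sp2gen1 0.4, Sp2gen2 0.0, Sp2gen3 0.08, Sp1gen2 0.08
Protein A (tip label, branch length): Sp1gen1 0.08, Sp1gen3 0.08, Sp2gen1 0.0, Sp2gen2 0.0, Sp2gen3 0.08, Sp1gen2 0.0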

34 Exercise 2
Compare the trees produced by each protein for your two species.
–Are the branching patterns the same?
–Are the branch lengths the same?
Describe the general pattern that emerges and not particular exceptions to this pattern.
Do your data argue for the existence of a species concept for your species?

35 Exercise 3
How are we going to merge all of these data to develop some sort of overview?
–Can you envision one figure that would represent all the species and all the proteins?
–Do we need a different figure for each protein?
–Do we need a table or graph of some sort?

36 Please BRING WITH YOU to our final lab
Your TaxPlot figure/table and a one-paragraph description of what you learned about YOUR species' genomes from this exercise
Your final alignments for each protein
Your phylograms (with branch lengths) for each protein
PRINT THESE OUT BEFORE CLASS

