General Phylogenetics

Presentation transcript:

1 General Phylogenetics
Points that will be covered in this presentation:
Tree Terminology
General Points About Phylogenetic Trees
Phylogenetic Analyses
  - The importance of alignments
  - The different analysis methods
  - Tree confidence measures

2 Tree Terminology
Node: point at which 2 or more branches diverge
Internal node: hypothetical last common ancestor
Terminal node: molecular or morphological data from which the tree is derived. (These will often be used to represent species or individual specimens and may be referred to as OTUs = Operational Taxonomic Units)
Clade: a node (hypothetical ancestor) and all the lineages descending from it
[Figure: tree with internal nodes, terminal nodes (OTUs) and a clade labeled]
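The terminology above can be made concrete with a small sketch. It assumes a tree stored as nested tuples (an internal node is a tuple of its children, a terminal node is a string label); this representation and the names `leaves`, `clades` and `tree` are illustrative choices, not something taken from the slides.

```python
def leaves(node):
    """Return the set of terminal-node (OTU) names found under a node."""
    if isinstance(node, str):            # a terminal node is just its label
        return frozenset([node])
    result = frozenset()
    for child in node:                   # an internal node is a tuple of children
        result |= leaves(child)
    return result


def clades(node):
    """Return one leaf set per internal node, i.e. every clade in the tree."""
    if isinstance(node, str):
        return set()
    found = {leaves(node)}
    for child in node:
        found |= clades(child)
    return found


# Hypothetical example tree with five OTUs.
tree = (("A", "B"), ("C", ("D", "E")))

for clade in sorted(clades(tree), key=lambda c: (len(c), sorted(c))):
    print(sorted(clade))
```

Each printed list is one clade: an internal node together with every terminal node that descends from it.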

3 Tree Terminology
Monophyletic group: a group in which all members are derived from a unique common ancestor
Polyphyletic group: a group whose members are not derived from a unique common ancestor. The common ancestor of the group has many descendants that are not in the group
Paraphyletic group: a group that excludes some of the descendants of its common ancestor (like a polyphyletic group, it is not monophyletic)
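A monophyly check can be sketched in a few lines. The sketch assumes a tree stored as nested tuples with string labels for terminal nodes, and it only answers "monophyletic or not"; telling paraphyly from polyphyly needs a judgment about the ancestor that this toy code does not attempt. The names `leaf_sets`, `is_monophyletic` and `tree` are hypothetical.

```python
def leaf_sets(node, out):
    """Add the leaf set of every node under `node` to `out`; return this node's set."""
    if isinstance(node, str):
        current = frozenset([node])
    else:
        current = frozenset().union(*(leaf_sets(child, out) for child in node))
    out.add(current)
    return current


def is_monophyletic(tree, group):
    """True if some node of the tree has exactly `group` as its descendants."""
    all_sets = set()
    leaf_sets(tree, all_sets)
    return frozenset(group) in all_sets


# Hypothetical example tree with five OTUs.
tree = (("A", "B"), ("C", ("D", "E")))

print(is_monophyletic(tree, {"D", "E"}))   # D and E share a node with no other descendants
print(is_monophyletic(tree, {"C", "D"}))   # the ancestor of C and D also gave rise to E
```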

4 General Points About Phylogenetic Trees
All branches can rotate freely around a node (i.e. B is not more closely related to C than A, and C is not more closely related to D than E).
Branch lengths may be drawn as equal between nodes: "cladograms" (these are used when one is interested only in the branching pattern).
Branch lengths may be proportional to the hypothesized distance between nodes: "phylograms".
[Figures: a cladogram and a phylogram, each with terminal nodes A, B, C, D, E]
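The point about rotation can be checked mechanically: two drawings that differ only in the left-to-right order of branches describe the same tree. The sketch below assumes trees stored as nested tuples with string labels for terminal nodes; the topologies used are made up for illustration and are not the trees shown on the slide.

```python
def canonical(node):
    """Sort the children of every node so that branch order no longer matters."""
    if isinstance(node, str):
        return node
    # repr() gives a string key, so labels and subtrees can be sorted together
    return tuple(sorted((canonical(child) for child in node), key=repr))


# Two drawings of the same hypothetical tree, with branches rotated at every node.
drawing_1 = (("A", "B"), ("C", ("D", "E")))
drawing_2 = ((("E", "D"), "C"), ("B", "A"))

# A genuinely different branching pattern.
other_topology = (("A", "C"), ("B", ("D", "E")))

print(canonical(drawing_1) == canonical(drawing_2))       # same relationships
print(canonical(drawing_1) == canonical(other_topology))  # different relationships
```

Which terminal nodes sit next to each other on the page carries no information; only the nesting of clades does.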

5 General Points About Phylogenetic Trees
Fully resolved trees are bifurcating (only two descendant lineages from each node).
A node with more than two descendant lineages is a multifurcating node or a polytomy.
Polytomies may be "soft" or "hard":
"Soft" = product of data or analysis
"Hard" = product of biology
[Figure: trees with polytomies labeled]

6 General Points About Phylogenetic Trees
Example of a "soft" polytomy: LSU analysis is unable to resolve the relationships of some Ptilophora species. Using different data (rbcL) the relationships among Ptilophora species are better resolved.
[Figures: LSU tree with the polytomy labeled; rbcL tree. Tronchin et al. 2004]

7 Phylogenetic Analyses
The Importance of Alignments
Phylogenetic trees derived from the analysis of DNA or amino acid sequences are only as good as the data they are based upon. Garbage In = Garbage Out
Consequently, sequence alignment is the most important step in phylogenetic analysis.
The aligned sites of a sequence must be homologous (or identical by descent = taxa share the same state because their ancestor did).
If two taxa share the same state but not by descent, it is called homoplasy.

8 Phylogenetic Analyses
The Importance of Alignments
[Figure: sequence alignment annotated with: same sites in different sequences need to be homologous; inferred insertion/deletion mutations (gaps); area to possibly remove from analyses because of uncertain homology between sites]
DNA sequences are prone to homoplasy because there are only 4 possible character states at each site (plus insertion/deletion mutations [indels] for some loci).
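One simple way to act on the "area to possibly remove" note is to drop every alignment column that contains a gap. This is only a sketch of a blunt rule, assuming the alignment is held as a dict of equal-length strings with "-" as the gap character; the function name and the toy sequences are hypothetical, and real analyses often exclude regions by eye or with dedicated trimming tools instead.

```python
def drop_gapped_columns(alignment, gap="-"):
    """Return a copy of the alignment without any column that contains a gap."""
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*seqs) walks the alignment column by column
    kept = [column for column in zip(*seqs) if gap not in column]
    return {name: "".join(column[i] for column in kept)
            for i, name in enumerate(names)}


# Hypothetical toy alignment with two gapped columns.
alignment = {
    "taxon1": "ACGT-ACGT",
    "taxon2": "ACGTTACGA",
    "taxon3": "AC-TTACGA",
}

for name, seq in drop_gapped_columns(alignment).items():
    print(name, seq)
```

Removing whole columns keeps the remaining sites aligned with each other, which is what preserves site-by-site homology.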

9 Phylogenetic Analyses
The Different Analysis Methods
See evolution.genetics.washington.edu/phylip/software.html#methods for a list of software programs.
Distance methods: based on similarity between OTUs
UPGMA: originally used for phenotypic characters in numerical taxonomy. Generally not applied to sequence data because it is highly sensitive to mutation rate changes among lineages, i.e. the data must fit a "molecular clock."
NJ (Neighbor Joining): an algorithm that approximates the "minimum evolution" tree without examining all possible topologies.
The accuracy of a distance tree depends on 2 things:
1) How "true" the distances calculated between taxa are (how good is the model of evolution that your distances are based upon).
2) The standard error of the distance estimates.
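To illustrate how the model of evolution changes a distance, the sketch below compares the raw proportion of differing sites (the p-distance) with the Jukes-Cantor (JC69) corrected distance, d = -3/4 ln(1 - 4p/3). JC69 is used here only as the simplest standard model; the slides do not name a particular model, and the function names and sequences are hypothetical.

```python
import math


def p_distance(seq1, seq2, gap="-"):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # skip any site where either sequence has a gap
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """JC69 distance: expected substitutions per site given observed proportion p."""
    if p >= 0.75:
        raise ValueError("p >= 0.75: the JC69 distance is undefined")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned sequences differing at 2 of 10 sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAA"

p = p_distance(seq_a, seq_b)
print(round(p, 4), round(jukes_cantor(p), 4))
```

The corrected distance is larger than the observed one because the model allows for multiple substitutions at the same site that the raw count cannot see.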

10 Phylogenetic Analyses
The Different Analysis Methods
Optimization methods
Parsimony: searching for the tree that requires the fewest mutational steps, i.e. the simplest is the best.
Maximum Likelihood: searching for the most likely tree given the OTUs (sequences) and a model of evolution, i.e. the tree that maximizes the probability of observing the data is the best tree.
Bayesian: searching for the best set of trees, i.e. the set of trees in which the likelihoods are so similar that changes between them are essentially random.
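The parsimony criterion can be illustrated by counting mutational steps on a fixed tree. The sketch below uses Fitch's algorithm, a standard way to count the minimum number of state changes for unordered characters on a bifurcating tree; the slides do not say which counting method is meant, so treat this as one reasonable choice. Trees are assumed to be nested 2-tuples with string labels for terminal nodes, and the alignment is made up.

```python
def fitch(node, states):
    """Return (possible ancestral states, steps) for the subtree rooted at `node`."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node                       # bifurcating trees only
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    common = left_set & right_set
    if common:                               # children agree: no extra step
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def parsimony_length(tree, alignment):
    """Total number of steps over all sites of an alignment on a given tree."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        states = {name: seq[i] for name, seq in alignment.items()}
        total += fitch(tree, states)[1]
    return total


# Hypothetical three-site alignment for four OTUs.
alignment = {"A": "AAG", "B": "AAG", "C": "TAC", "D": "TGC"}

tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_length(tree_1, alignment), parsimony_length(tree_2, alignment))
```

On this toy data the first tree needs 3 steps and the second needs 5, so parsimony prefers the first. A real search repeats this scoring over many candidate trees.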

11 Phylogenetic Analyses
Tree Confidence Measures
Decay Analysis or Goodman-Bremer Support Values: a test used in parsimony analyses in which one determines how many steps less parsimonious than the minimum a tree must be before a particular branch is no longer resolved in the consensus of all trees of that length.
How meaningful the values are may depend on the tree length.
[Figure: most parsimonious tree, L = 35; one step less parsimonious, L = 36; two steps less parsimonious, L = 37; labels d1 and d2]

12 Phylogenetic Analyses
Tree Confidence Measures
Bootstrapping: a non-parametric test of how well the data support the nodes of a given tree.
Determining support is a bit of a statistical problem: evolution only happened once, so there is no underlying distribution to sample in order to develop confidence values.
Method: the original analysis is performed multiple times on pseudo-datasets derived by sampling the original dataset with replacement. The number, or fraction, of times that a particular clade is present in the resulting trees is its bootstrap value.
Bootstrapping is not portable, i.e. you cannot compare values across studies because changing any parameters will change the values.
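The resampling step described above can be sketched as follows. Alignment columns (sites) are drawn with replacement to build each pseudo-dataset, the analysis is repeated, and the bootstrap value is the fraction of replicates that recover the clade. To keep the example self-contained, `closest_pair` is a deliberately trivial, hypothetical stand-in for the real tree-building analysis: it just reports the two taxa with the fewest differences. All names and sequences are illustrative assumptions.

```python
import random
from itertools import combinations


def resample_columns(alignment, rng):
    """Build a pseudo-dataset by sampling alignment columns with replacement."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


def closest_pair(alignment):
    """Toy stand-in for a tree search: the pair of taxa with the fewest differences."""
    def differences(pair):
        a, b = pair
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    return frozenset(min(combinations(sorted(alignment), 2), key=differences))


def bootstrap_support(alignment, clade, replicates=200, seed=0):
    """Fraction of pseudo-datasets in which the toy analysis recovers `clade`."""
    rng = random.Random(seed)                # fixed seed so runs are repeatable
    hits = sum(closest_pair(resample_columns(alignment, rng)) == frozenset(clade)
               for _ in range(replicates))
    return hits / replicates


# Hypothetical alignment for four OTUs.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTTC",
    "C": "TCGAACGATC",
    "D": "TCGAACCATG",
}

print(bootstrap_support(alignment, {"A", "B"}))
```

In a real study the stand-in would be replaced by the full analysis (distance, parsimony or likelihood), and every clade of the original tree would be scored, not just one pair.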

13 Phylogenetic Analyses
Tree Confidence Measures
Bootstrapping
By default most programs will show bootstrap values when they are greater than 50, but does a bootstrap value of 50 mean anything? For a discussion of this see Hillis & Bull (1993) Systematic Biology 42 (they tested bootstrap values based on a known phylogeny).
Wilson's General Rule: 60-80, be cautious and ask whether there is other evidence to support the relationship; 80-90, usually pretty solid; above that, solid and unlikely to be misleading.

14 General Points About Phylogenetic Trees
DNA or protein sequence trees are hypotheses of how a particular DNA locus or protein has evolved.
We assume that the way the DNA or protein has evolved reflects the way the species has evolved, i.e. gene tree = species tree.
IMPORTANT: This may or may not reflect reality, i.e. You Still Have To Think, as molecules do not necessarily trump morphology, development, etc.

15 General Points About Phylogenetic Trees
[Figure: gene trees drawn against species trees for taxa A, B and C; panels labeled "gene tree = species tree"]