Ayesha M. Khan, Spring 2013. Phylogenetic Basics

Presentation transcript:

Ayesha M. Khan, Spring 2013

Phylogenetic Basics
A central task in biology is to infer the relationships between species. Do they share a common ancestor? When did they diverge from each other?
- Phylogenetics is the study of evolutionary relationships among and within species.
- Phylogenetics is the field of systematics that focuses on evolutionary relationships (phylogeny) between organisms or between genes/proteins.
Systematics: an attempt to understand the interrelationships of living things.

Phylogenetic Basics (contd.)
The actual pattern of evolutionary history is the phylogeny, or evolutionary tree, which we try to estimate. A tree is a mathematical structure used to model the actual evolutionary history of a group of sequences or organisms.

Phylogenetic Basics (contd.)
- Homologues are similar sequences that have been derived from a common ancestral sequence.
- Orthologues are similar sequences in two different organisms that have arisen due to a speciation event. They typically retain identical or similar function throughout evolution.
- Paralogues are similar sequences within a single organism that have arisen due to a gene duplication event. They tend to have differing functions.
- Xenologues are similar sequences that have arisen through horizontal transfer events (symbiosis, viruses, etc.) rather than through vertical descent within a lineage.

[Figure: globin gene tree. An early globin gene undergoes gene duplication into an α-chain gene and a β-chain gene. Leaves: mouse α, cattle α, human α, mouse β, cattle β, human β. Labelled groups: orthologs (α), orthologs (β), paralogs (cattle), homologs.]
- Orthologs: diverged after speciation; tend to have similar function.
- Paralogs: diverged after gene duplication; some functional divergence occurs.
For linking similar genes between species, or performing "annotation transfer", identify orthologs.

Molecular phylogenetics
Why focus on molecular phylogenies rather than phylogenies based on morphological characters such as wings or feathers? In molecular phylogenetics, the differences between organisms are measured on the proteins and RNA encoded in the DNA, i.e. on amino acid and nucleotide sequences.

Molecular phylogenetics (contd.)
Molecular phylogenetics is also more precise than its counterpart based on external features and behavior, and it can distinguish even small organisms such as bacteria or viruses.
- DNA is inherited and connects all species.
- Molecular phylogenetics can be based on mathematical and statistical methods, and can even be model-based: mutations can be modeled, and remote homologies can be detected.
- The distance is based not on a single feature but on many genes.

Molecular Phylogeny Analysis
Given a set of aligned sequences, molecular phylogeny methods propose phylogenetic trees (inferred trees) that aim to reconstruct the history of successive divergences that took place during evolution between the sequences considered and their common ancestor. These trees may not be the same as the true tree.
Reconstruction of phylogenetic trees is a statistical problem: a reconstructed tree is an estimate of a true tree with a given topology and given branch lengths. In practice, phylogenetic analyses usually generate trees with accurate parts and imprecise parts.

Key features of molecular phylogenetic trees

Molecular Phylogeny Analysis (contd.)
Sequences reflect relationships
- After working with sequences for a while, one develops an intuition that, for a given gene, closely related organisms have similar sequences and more distantly related organisms have more dissimilar sequences. These differences can be quantified.
- Given a set of gene sequences, it should be possible to reconstruct the evolutionary relationships among genes and among organisms.
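
One simple way to quantify such differences is the proportion of aligned sites at which two sequences differ (often called the p-distance). The sketch below is only an illustration: the function name and the toy sequences are made up, and the slides do not say which distance measure is intended.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where either sequence has a gap character.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


# Toy sequences (hypothetical, for illustration only).
reference = "GATTACAGAT"
close_relative = "GATTACAGAC"    # differs at 1 of 10 sites
distant_relative = "GCTAACTGAC"  # differs at 4 of 10 sites

d_close = p_distance(reference, close_relative)
d_far = p_distance(reference, distant_relative)
```

Here the closer relative gets the smaller distance, which is the intuition stated above turned into a number.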

Example: Pseudomonas aeruginosa, one of the top three causes of opportunistic infections, noted for its antimicrobial resistance and resistance to detergents.

Phylogenetic tree construction
- Choose the set of sequences to analyse.
- Align these sequences "properly".
- Apply tree-building methods.
- Statistically evaluate the resulting phylogenetic tree.

[Flowchart: choosing a method]
- Choose a set of related sequences.
- Obtain a multiple alignment.
- Is there a strong similarity?
  - Strong similarity: maximum parsimony
  - Weak similarity: distance methods
  - Very weak similarity: maximum likelihood

Phylogenetic tree construction methods
Three categories of methods exist: distance-based, maximum parsimony, and maximum likelihood.
- Distance methods: evolutionary distances are computed for all pairs of sequences, and a tree is built in which the distances between sequences "match" these distances.
- Maximum parsimony (MP): choose the tree that minimizes the number of changes required to explain the data.
- Maximum likelihood (ML): consider all possible trees containing the set of organisms and find the tree that gives the highest likelihood of the observed data.
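
To get a feel for why examining all possible trees becomes expensive, the following sketch counts the unrooted, fully bifurcating trees for n taxa using the standard formula (2n-5)!! = 3 x 5 x ... x (2n-5). The function name is hypothetical and the code is only an illustration, not part of the slides.

```python
def num_unrooted_trees(n_taxa: int) -> int:
    """Number of distinct unrooted bifurcating trees on n_taxa labelled leaves."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10):
    print(n, num_unrooted_trees(n))
```

Four taxa give 3 trees, five give 15, and ten already give more than two million, which is consistent with exhaustive methods being practical only for small data sets.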

Comparison of different tree-construction methods
- Neighbor-joining (distance-based): very fast. Easily trapped in local optima. Good for generating a tentative tree, or choosing among multiple trees.
- Maximum parsimony: slow. Assumptions fail when evolution is rapid. Best option when tractable (<30 taxa, strong conservation).
- Maximum likelihood: very slow. Highly dependent on the assumed evolution model. Good for very small data sets and for testing trees built using other methods.

Case Study I: Phylogenetic Trees
Get a multiple sequence alignment:
      C1  C2  C3
  S1  A   A   G
  S2  A   A   A
  S3  G   G   A
  S4  A   G   A
Construct a tree using any suitable method (parsimony, ML, etc.).
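
As one possible way to work through the case study with parsimony, the sketch below scores the three possible unrooted topologies for S1 to S4 with Fitch's small-parsimony algorithm and keeps the one needing the fewest changes. The nested-tuple tree encoding and the function names are assumptions made for this illustration; the slides do not prescribe an implementation.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, number of changes) for one column.

    A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # Children can agree on a state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: one change is required on this part of the tree.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all alignment columns."""
    n_cols = len(next(iter(alignment.values())))
    total = 0
    for col in range(n_cols):
        states = {name: seq[col] for name, seq in alignment.items()}
        total += fitch_site(tree, states)[1]
    return total


# The alignment from the case study (columns C1, C2, C3).
alignment = {"S1": "AAG", "S2": "AAA", "S3": "GGA", "S4": "AGA"}

# The three unrooted topologies for four taxa, each rooted arbitrarily.
topologies = [
    (("S1", "S2"), ("S3", "S4")),
    (("S1", "S3"), ("S2", "S4")),
    (("S1", "S4"), ("S2", "S3")),
]

scores = {tree: parsimony_score(tree, alignment) for tree in topologies}
best_tree = min(scores, key=scores.get)
```

For this alignment only column C2 distinguishes the topologies: grouping S1 with S2 and S3 with S4 needs 3 changes in total, while the other two groupings need 4.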

Evaluation
For example, how confident are we that two sequences are in the same clade? What is the distribution of our confidence across the branches? The bootstrap provides a way of determining this (first applied to phylogenies by Felsenstein, 1985).

Bootstrap: basic idea
- Originally, one computes an object from some list of data.
- Create an artificial list by randomly drawing elements from that list. Some elements will be picked more than once.
- Compute a new object.
- Repeat many times and look at the distribution of these objects.


- The original object O (a tree) is computed from a "list of data" (sequences).
- Construct a new list, with the same number of elements, by randomly picking elements from the original list. Any one element can be picked any number of times.
- Compute a new object, call it O_n.
- Repeat the process many times (typically ).
- The elements {O_1, O_2, ...} are assumed to be drawn from a statistical distribution, so one can compute averages, variances, etc.
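
The resampling step can be sketched as follows. In Felsenstein's bootstrap the elements drawn with replacement are the columns (sites) of the alignment, so each replicate has the same sequences and the same length as the original. The function name, the toy alignment and the fixed seed are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    # Draw n_cols column indices; the same index may be drawn several times.
    picked = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picked) for name, seq in alignment.items()}


# Toy alignment (the four sequences from the case study).
alignment = {"S1": "AAG", "S2": "AAA", "S3": "GGA", "S4": "AGA"}

rng = random.Random(0)  # fixed seed only so that the run is repeatable
replicates = [bootstrap_alignment(alignment, rng) for _ in range(5)]
```

Each replicate would then be passed to the same tree-building method used for the original alignment, giving the objects O_1, O_2, and so on.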

A model for the bootstrap
The numbers at the branches are confidence values based on Felsenstein's bootstrap method, with B = 200 bootstrap replications. Essentially, we calculate the proportion of bootstrap trees that agree with the original tree, where "agreeing" refers to the topology of the trees.
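
A minimal sketch of how such branch values could be computed: each tree is reduced to the set of groupings (clades) it contains, and the support for a grouping in the original tree is the fraction of bootstrap trees that also contain it. The representation of a tree as a set of frozensets, the function name and the four made-up bootstrap trees are all assumptions for illustration.

```python
def clade_support(original_clades, bootstrap_trees):
    """Fraction of bootstrap trees that contain each clade of the original tree."""
    b = len(bootstrap_trees)
    if b == 0:
        raise ValueError("need at least one bootstrap tree")
    return {
        clade: sum(clade in tree for tree in bootstrap_trees) / b
        for clade in original_clades
    }


# Simplification: each tree is described only by the leaf groupings it contains.
original_tree = {frozenset({"S1", "S2"})}

# Four hypothetical bootstrap trees; three of them recover the S1+S2 grouping.
bootstrap_trees = [
    {frozenset({"S1", "S2"})},
    {frozenset({"S1", "S2"})},
    {frozenset({"S1", "S3"})},
    {frozenset({"S1", "S2"})},
]

support = clade_support(original_tree, bootstrap_trees)
```

With B = 200 replications, as on the slide, the same calculation would simply run over 200 bootstrap trees instead of four.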
