3- RIBOSOMAL RNA GENE RECONSTRUCITON  Phenetics Vs. Cladistics  Homology/Homoplasy/Orthology/Paralogy  Evolution Vs. Phylogeny  The relevance of the.

Slides:



Advertisements
Similar presentations
Ortholog vs. paralog? 1. Collect Sequence Data Good Dataset
Advertisements

Phylogenetic Tree A Phylogeny (Phylogenetic tree) or Evolutionary tree represents the evolutionary relationships among a set of organisms or groups of.
1 Orthologs: Two genes, each from a different species, that descended from a single common ancestral gene Paralogs: Two or more genes, often thought of.
 Aim in building a phylogenetic tree is to use a knowledge of the characters of organisms to build a tree that reflects the relationships between them.
1 General Phylogenetics Points that will be covered in this presentation Tree TerminologyTree Terminology General Points About Phylogenetic TreesGeneral.
Phylogenetic Trees Understand the history and diversity of life. Systematics. –Study of biological diversity in evolutionary context. –Phylogeny is evolutionary.
Phylogeny and Systematics
Phylogenetic reconstruction
Comparative genomics Joachim Bargsten February 2012.
Molecular Evolution Revised 29/12/06
Tree Reconstruction.
© Wiley Publishing All Rights Reserved. Phylogeny.
BIOE 109 Summer 2009 Lecture 4- Part II Phylogenetic Inference.
Some basics: Homology = refers to a structure, behavior, or other character of two taxa that is derived from the same or equivalent feature of a common.
Bioinformatics and Phylogenetic Analysis
BME 130 – Genomes Lecture 26 Molecular phylogenies I.
Bas E. Dutilh Phylogenomics Using complete genomes to determine the phylogeny of species.
Phylogenetic Analysis. General comments on phylogenetics Phylogenetics is the branch of biology that deals with evolutionary relatedness Uses some measure.
. Class 9: Phylogenetic Trees. The Tree of Life D’après Ernst Haeckel, 1891.
With astonishing advance of the Human Genome Project, essentially all human genomic sequences are available in public databases. The major task for the.
Building Phylogenies Parsimony 1. Methods Distance-based Parsimony Maximum likelihood.
Phylogenetic Analysis. 2 Phylogenetic Analysis Overview Insight into evolutionary relationships Inferring or estimating these evolutionary relationships.
Phylogenetic trees Sushmita Roy BMI/CS 576
Phylogenetic Analysis
Phylogenetic analyses Kirsi Kostamo. The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among.
Terminology of phylogenetic trees
Christian M Zmasek, PhD 15 June 2010.
Species  OTUs  OPUs  Species  OTUs  OPUs. Rosselló-Mora & Amann 2001, FEMS Rev. 25:39-67 Taxa circumscription depends on the observable characters.
Molecular basis of evolution. Goal – to reconstruct the evolutionary history of all organisms in the form of phylogenetic trees. Classical approach: phylogenetic.
Chapter 26: Phylogeny and the Tree of Life Objectives 1.Identify how phylogenies show evolutionary relationships. 2.Phylogenies are inferred based homologies.
3- NON-RIBOSOMAL GENE RECONSTRUCTION  Core / auxiliary / strain specific genes  Housekeeping genes and accordance with global reconstruction  MLSA 
 16S rRNA gene marker  intra-gene variability  primer selection  size & information content Primer selection, information content, alignment and length.
Phylogenetic Analysis. General comments on phylogenetics Phylogenetics is the branch of biology that deals with evolutionary relatedness Uses some measure.
Phylogenetic trees School B&I TCD Bioinformatics May 2010.
Bioinformatics 2011 Molecular Evolution Revised 29/12/06.
Christian Rinke Microbial Genomics DOE, Joint Genome Institute Introduction to ARB (From A User's Perspective)
Building phylogenetic trees. Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances  UPGMA method (+ an example)
Introduction to Phylogenetics
Calculating branch lengths from distances. ABC A B C----- a b c.
SPECIES AT THE GENOMIC LEVEL. DDH has been the gold standard  the “sex” for higher eukaryotes Stackebrandt et al., 2002, Int J Syst Evol Microbiol. 52:
Chapter 10 Phylogenetic Basics. Similarities and divergence between biological sequences are often represented by phylogenetic trees Phylogenetics is.
Why do trees?. Phylogeny 101 OTUsoperational taxonomic units: species, populations, individuals Nodes internal (often ancestors) Nodes external (terminal,
Phylogeny Ch. 7 & 8.
Phylogenetics.
Phylogeny & Systematics
Classification and Phylogenetic Relationships
Ayesha M.Khan Spring Phylogenetic Basics 2 One central field in biology is to infer the relation between species. Do they possess a common ancestor?
ASSEMBLY AND ALIGNMENT-FREE METHOD OF PHYLOGENY RECONSTRUCTION FROM NGS DATA Huan Fan, Anthony R. Ives, Yann Surget-Groba and Charles H. Cannon.
Chapter 26 Phylogeny and Systematics. Tree of Life Phylogeny – evolutionary history of a species or group - draw information from fossil record - organisms.
Building Phylogenies Maximum Likelihood. Methods Distance-based Parsimony Maximum likelihood.
Molecular Evolution. Study of how genes and proteins evolve and how are organisms related based on their DNA sequence Molecular evolution therefore is.
Phylogenetic trees. 2 Phylogeny is the inference of evolutionary relationships. Traditionally, phylogeny relied on the comparison of morphological features.
Phylogeny and the Tree of Life
Phylogeny and the Tree of Life
Bioinformatics Overview
Introduction to Bioinformatics Resources for DNA Barcoding
Evolutionary genomics can now be applied beyond ‘model’ organisms
Phylogenic trees..
Phylogenetics
The ideal approach is simultaneous alignment and tree estimation.
Phylogenetic Inference
Methods of molecular phylogeny
Molecular Evolution.
Chapter 19 Molecular Phylogenetics
Phylogenetics Chapter 26.
Phylogenetic tree based on 16S rRNA gene sequence comparisons over 1,260 aligned bases showing the relationship between species of the genus Actinomyces.
Phylogenetic tree of 38 Pseudomonas type strains, based on the V3-V5 region sequence of the 16S rRNA gene (V3 primer, positions 442 to 492; and V5 primer,
Unit Genomic sequencing
Phylogenetic comparison among selected Pasteurella multocida and Haemophilus influenzae species with completed genome sequences. Phylogenetic comparison.
Unrooted neighbor-joining tree of 16S rRNA gene sequences from low-G+C-content gram-positive bacteria, obtained from clone libraries. Unrooted neighbor-joining.
Presentation transcript:

3- RIBOSOMAL RNA GENE RECONSTRUCITON  Phenetics Vs. Cladistics  Homology/Homoplasy/Orthology/Paralogy  Evolution Vs. Phylogeny  The relevance of the alignment  The algorithms  Bootstrap  One tree is no tree

Phylogenetic coherence (monophyly) phylogenetic coherence RNAr 16S Functional genes (MLSA) Genomic analyses 70-50% 70% genomic coherence Reasociación DNA-DNA G+C, AFLP, MLSA Genomic comparisons (ANI; AAI) 100% 60% 70% 80% 50% phenotypic coherence metabolismchemotaxonomyspectrometry (Maldi-Tof; ICR-FT/MS)  Generally based on 16S rRNA gene analysis  important to recognize the closest relatives by means of the Type Strain gene sequences  Housekeeping genes (MLSA approach or single gene) may help in resolve phylogenies  Future perspectives will be done with full-genome sequences

Similarity matrix or alignment OTU A OTU B OTU C OTU D OTU E … PHENETICS CLADISTICS Phenetics vs Cladistics  Data can be treated as presence/absence/intensity to generate similarity matrices  If data is analyzed by their similarity  PHENETICS  If data is analyzed in an evolutionary context (i.e. changes in homologous characters are mutations or evolutive steps)  CLADISTICS  For evolutive purposes is necessary to recognize HOMOLOGY

HOMOLOGY  ORTOLOGY  PARALOGY  HOMOPLASY Organism A Gene X Organism B Gene X Organism A Gene X Gene X’ Gene X’’ Homology  same ancestral origin Homoplasy  false homology Orthology  homologous genes in different organisms Paralogy  homologous genes in the same organism, gene duplications with identical or different function

HOMOLOGY  ORTOLOGY  PARALOGY  HOMOPLASY Organism A Gene X Organism B Gene X Organism A Gene X Gene X’ Gene X’’ Orthology  homologous genes in different organisms Paralogy  homologous genes in the same organism, gene duplications with identical or different function Homoplasy (false homology) Homology (same ancestral origin)

Evolution ≠ phylogeny Evolution => mutations (morphometrics) + age (fossil record) Phylogeny = genealogy => we know only the tips of the tree, nothing is said about putative ancestors PROKARYOTES => no fossil record => molecular clocks Molecular clocks (housekeeping genes): 16S rRNA; 23S rRNA; ATPases; TU-elongation factor; gyrases… The 16S rRNA:  Universally represented  Conserved  No protein coding  Base pairing (helix)  Natural amplification  Proper size Evolution vs. Phylogeny Ludwig and Schleifer, 1994 FEMS Rev 15:

The relevance of the alignment To perform cladistic analyses we should first align al sequences in order to recognize all homologous positions. Recognition by:  Sequence similarities  Base pairing due secondary structure (helixes for rRNA)  Insertions & deletions  Empirically (subjective)  Minimize homoplasic influences There are many alignment programs, all look to common features that may indicate homologous sites:  Clustal X  MAFFT  PileUp  …

The relevance of the alignment Most of the programs do not take into account secondary structure, just sequence motive similarities rRNA has a secondary structure with helixes that help in aligning sequences Functional gene or translated proteins cannot be improved by secondary structure analysis

The relevance of the alignment ARB does take into account features as helix pairing By increasing the numbers of sequences, the alignment improves

T C GA transitions transversions Maximum Likelihood Like Maximum Parsimony but takes into account  difficulties in mutation events (transitions vs. transversions)  mutation position  Slower Maximum Parsimony G C C A T => a G C A C T => b G C A C C => c a – b => 2 mutations a – c => 3 mutations b – c => 1 mutation bcaacba b c (pitfalls: nature may not be parsimonious) a => 0 b => 400 c => abc ab c Neighbor Joining: G C C A T => a G C A C T => b G C A C C => c a => 100 b => c => abc alignment Similarity matrix Distance matrix dendrograms Distance transformation Jukes-CantorKimura De Soete (pitfalls: does not take into account multiple mutations) abc The algorithms

Bootstrap indicates how stable is a branching order when a given dataset is submitted to multiple analysis Generally short internode branches will have low bootstrap values Bootstrap

PHYLOGENETIC FILTERS  TERMINI  42,284 homologous positions  BACTERIA  1,532 homologous positions  30%  1,433 homologous positions  50%  1,288 homologous positions

NJ_bac NJ_30% NJ_50% USE OF PHYLOGENETIC FILTERS  Conservational filters are useful for deep- branching phylogenies  complete sequences are useful for close relative organisms

Size & information content  complete sequences give complete information  partial sequences lose phylogenetic signal  short sequences lose resolution 1500 nuc 900 nuc 300 nuc

One tree is no tree PAR NJ RaXML  different algorithms  different topologies  try different datasets as well  draw a consensus tree

RECOMMENDATIONS FOR 16S rRNA TREE RECONSTRUCTION B A D C F E 100 I H G B A D C F E I H G Tree with bootstrapTree with multifurcation  SEQUENCE  almost complete is better than short partial sequences  ALIGNMENT  Better take into account secondary structures  ALGORITHM  Better maximum likelihood, but compare with other as neighbor joining and maximum parsimony  DATASET  Never just one dataset, try different sets of data (i.e. different number of sequences; different filters to find the best resolution)  FINAL TREE  Either you show all trees, or the best bootstrapped, or a multifurcation showing unresolved branching order.

MLSA: phylogenetic reconstructions MULTIPLE SEQUENCE ALIGNMENTS  sometimes have better resolution than the 16S rRNA gene  16S rRNA gene can have very low resolution Jiménez et al., 2013, System Appl Microbiol, 36: