Phylogeny GENE 3000. why is coalescent theory important for understanding phylogenetics (species trees)? coalescent theory lets us test our assumptions.

Presentation transcript:

phylogeny GENE 3000

why is coalescent theory important for understanding phylogenetics (species trees)? coalescent theory lets us test our assumptions of how DNA sequences evolve before we use them to reconstruct phylogeny. coalescent theory explains why recently diverged populations may not yet have synapomorphies despite already being on different evolutionary paths. this model gives us a basis for estimating the time to the common ancestor of any two sequences, within a population or between species.

DNA characters are just like phenotypic characters: 4 character states (A, C, T, G), plus information in insertions/deletions, gene copy number, etc. the same concerns of homology and shared descent apply.

human population isolated ~200 kya. “mitochondrial Eve” sets up a misunderstanding: every locus sampled now has a point in the past where all current alleles coalesce to a common ancestor. in recently diverged species, diversity is often older than the species.

understanding coalescence: 1. larger effective population size (Ne) means more diversity. 2. when the time between branching events is short relative to Ne, it is more likely that allelic diversity is older than the branching event. (figure labels: Ne, isolation)
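The two points above can be made concrete with the standard expectations from the neutral coalescent. This is only a sketch: it assumes a diploid, constant-size, randomly mating population and an autosomal locus (mitochondrial DNA is haploid and maternally inherited, so its constants differ), and the function names are made up for illustration.

```python
def expected_pairwise_coalescence(ne: float) -> float:
    """Expected generations back to the common ancestor of two gene copies.

    Assumes a diploid population of constant effective size ne (2 * Ne).
    """
    return 2.0 * ne


def expected_tmrca(ne: float, n_samples: int) -> float:
    """Expected generations back to the ancestor of n sampled gene copies.

    Standard coalescent result for a diploid population: 4 * Ne * (1 - 1/n).
    """
    if n_samples < 2:
        raise ValueError("need at least two sampled gene copies")
    return 4.0 * ne * (1.0 - 1.0 / n_samples)


def diversity_expected_older_than_split(ne: float, generations_since_split: float) -> bool:
    """True when two copies are expected to coalesce before (older than) the split."""
    return expected_pairwise_coalescence(ne) > generations_since_split


if __name__ == "__main__":
    # Hypothetical numbers: same split time, two different effective sizes.
    for ne in (1_000, 100_000):
        print(ne, expected_pairwise_coalescence(ne),
              diversity_expected_older_than_split(ne, generations_since_split=10_000))
```

With the same time since isolation, the larger population is the one whose allelic diversity is expected to predate the branching event.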

"This coalescence does not mean that the population originally consisted of a single individual with that ancestral allele. It just means that particular individual’s allele was the one that, out of all the alleles present at that time, later became fixed in the population." National Geographic OCT. 2013

phylogeny inference: 2 basic approaches, algorithm vs. criterion. “neighbor joining”, shown in the book, is an algorithm that generates a single tree by finding the shortest “distances” (proportion of differences at nucleotide sites). algorithm approaches do not help identify our uncertainty: one answer comes out, whether well supported or not.
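The “distance” mentioned here, the proportion of differing nucleotide sites, is easy to compute by hand or in code. The sketch below shows only that distance calculation, not neighbor joining itself (the full algorithm also corrects each distance for how far both taxa are from everything else). The sequences are invented for illustration.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites that differ, skipping sites with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances for every pair of taxa in the alignment."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


if __name__ == "__main__":
    # Hypothetical 8-site alignment.
    toy = {"human": "ACGTACGT", "baboon": "ACGTACGA", "mouse": "ACTTTCGA"}
    for pair, d in distance_matrix(toy).items():
        print(pair, d)
```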

criterion-based phylogeny: 30 tips results in 8.7 x 10^36 possible (unrooted) trees, so a computer search is necessary.
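The explosion in tree number follows from a standard counting result: for n tips there are (2n-5)!! unrooted and (2n-3)!! rooted bifurcating trees, where !! is the product of odd numbers. A short sketch (function names are illustrative) shows why 7 tips already gives more than 10,000 rooted trees and why 30 tips gives a count on the order of 10^36.

```python
def odd_double_factorial(k: int) -> int:
    """Product k * (k-2) * (k-4) * ... * 1 for odd k (returns 1 when k < 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_trees(n_tips: int) -> int:
    """Number of unrooted, strictly bifurcating trees: (2n - 5)!!."""
    if n_tips < 3:
        raise ValueError("need at least 3 tips")
    return odd_double_factorial(2 * n_tips - 5)


def num_rooted_trees(n_tips: int) -> int:
    """Number of rooted, strictly bifurcating trees: (2n - 3)!!."""
    if n_tips < 2:
        raise ValueError("need at least 2 tips")
    return odd_double_factorial(2 * n_tips - 3)


if __name__ == "__main__":
    for n in (4, 7, 10, 30):
        print(n, num_unrooted_trees(n), num_rooted_trees(n))
```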

A phylogeny is a hypothesis.

3 of >10,000 possible trees: which fits the data best? depends on the criterion. (tree labels: 11 changes; 7 changes = most parsimonious of these 3)

Take out paper... Quick. Draw a phylogeny with 7 tips. Without thinking add: baboon, hamster, mouse, chicken, human, cow, sheep as randomly as possible (Not this, draw your own. With 7 tips.)

Just scoring these characters in the most parsimonious way: what score do you get?

criteria used in phylogeny. parsimony: the fewest changes indicates the most acceptable tree topology. maximum likelihood: both topology (arrangement of branches) and branch lengths are iteratively searched for the tree(s) that best fit a statistical model of molecular evolution (e.g. transitions > transversions). Bayesian: the criterion is still built on the likelihood, combined with prior probabilities; the search strategy is different (it summarizes results over many trees of similar likelihood rather than reporting a single best tree).
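To see what “fewest changes” means in practice, here is a minimal sketch of Fitch's method for counting the minimum number of changes one character needs on a given tree. It assumes a rooted binary tree written as nested tuples and unordered, equally weighted character states; the tip names and states are invented.

```python
def fitch_changes(tree, states) -> int:
    """Minimum number of state changes for one character on a rooted binary tree.

    tree   : nested 2-tuples with tip names (str) at the leaves
    states : dict mapping tip name -> observed character state
    """
    def visit(node):
        if isinstance(node, str):           # a tip: its state set is what we observed
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        shared = left_set & right_set
        if shared:                          # children agree: no extra change needed
            return shared, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def parsimony_score(tree, alignment) -> int:
    """Sum of Fitch changes over every site of an aligned set of sequences."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_changes(tree, {tip: seq[i] for tip, seq in alignment.items()})
        for i in range(n_sites)
    )


if __name__ == "__main__":
    # Hypothetical character: tips a and b share one state, c and d another.
    states = {"a": "A", "b": "A", "c": "G", "d": "G"}
    print(fitch_changes((("a", "b"), ("c", "d")), states))
    print(fitch_changes((("a", "c"), ("b", "d")), states))
```

The tree that groups tips sharing a state needs fewer changes, so parsimony prefers it; repeating the count over all sites and all candidate trees is the criterion-based search.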

green fluorescent protein has evolved to be more than green. the nucleotides are not what we are interested in. we are interested in how traits that affect fitness, ecology, speciation, and performance evolve along a phylogeny.

why different criteria? 1. we are making our assumptions explicit for inference of the unknown 2. different scientists have different backgrounds that drive their assumptions 3. using multiple methods/criteria lets us test how safe our assumptions are

are your data sufficient? all of these methods will find a tree, whether by algorithm or criterion-based search. is it one you can believe is better than random? is it one you would put your name behind? bootstrapping and consensus methods address this.

bootstrapping. text: “statistical method for estimating the strength of evidence that a particular node in a phylogeny exists”. more general: resampling technique used to obtain estimates of... parameters and accuracy/variance around those parameters. observed data: is a particular subset of the data driving the result of our analysis?

mean vs. median: 1,2,2,3,3,3,4,4,4,4,5,5,5,6,6,15. mean is 4.5, median is 4. you can change 15 to 150 and the mean goes up but the median doesn’t change, so the median is more robust. a random resample of the data with replacement (meaning the same data entry can be used multiple times) can identify the true tendency of the data and help ignore ‘outliers’.
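The numbers on this slide can be checked directly, and the same few lines show what “resample with replacement” looks like. This is a sketch using only the Python standard library; the helper name, replicate count, and seed are arbitrary choices.

```python
import random
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 15]
with_outlier = data[:-1] + [150]   # change the 15 to 150


def bootstrap_statistic(values, stat, replicates=1000, seed=0):
    """Recompute `stat` on many resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    # random.choices samples WITH replacement, so entries can repeat.
    return [stat(rng.choices(values, k=n)) for _ in range(replicates)]


if __name__ == "__main__":
    print(statistics.mean(data), statistics.median(data))                  # 4.5 and 4.0
    print(statistics.mean(with_outlier), statistics.median(with_outlier))  # mean jumps, median stays

    medians = bootstrap_statistic(data, statistics.median)
    means = bootstrap_statistic(data, statistics.mean)
    # The spread of each list shows how sensitive the statistic is
    # to which entries happen to be drawn.
    print(min(medians), max(medians))
    print(min(means), max(means))
```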

higher bootstrap proportions are better! this value is the % of “pseudo” replicates that divide the tree in the same way (it represents the tendency of the data to support the node).
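The idea of a bootstrap proportion can be sketched in code by resampling alignment columns with replacement and asking how often a grouping reappears. Note the big simplification, which is an assumption of this sketch and not how real software works: instead of rebuilding a full tree for each pseudo-replicate, it treats the pair of taxa with the smallest proportion of differences as “grouped together”. All names and sequences are invented.

```python
import random
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of sites that differ between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_pair(alignment: dict) -> tuple:
    """Stand-in for tree building: the pair of taxa with the smallest distance.

    Ties are broken by alphabetical order of the taxon names.
    """
    return min(
        combinations(sorted(alignment), 2),
        key=lambda pair: p_distance(alignment[pair[0]], alignment[pair[1]]),
    )


def bootstrap_support(alignment: dict, pair, replicates=200, seed=1) -> float:
    """Percent of pseudo-replicates in which `pair` is recovered as the closest pair."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    target = tuple(sorted(pair))
    hits = 0
    for _ in range(replicates):
        # Resample columns (sites) with replacement to build a pseudo-replicate.
        cols = rng.choices(range(n_sites), k=n_sites)
        pseudo = {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}
        if closest_pair(pseudo) == target:
            hits += 1
    return 100.0 * hits / replicates


if __name__ == "__main__":
    toy = {
        "human":  "ACGTACGTAC",
        "baboon": "ACGTACGTAA",
        "mouse":  "ACTTTCGAAC",
        "cow":    "GCTTACTAGC",
    }
    print(bootstrap_support(toy, ("human", "baboon")))
```

If only one or two sites hold a grouping together, many pseudo-replicates will miss those sites and the percentage drops, which is the sense in which the value reflects how broadly the data support a node.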

consensus tree: can also ask equally supported trees (equally parsimonious, equal likelihood) how well they all support the same nodes. doesn’t have to involve a subset of the data like in the bootstrap. may summarize the stable parts of the tree across 2+ trees. (figure: example trees with tips a, b, c, d, e)
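A strict consensus keeps only the groupings that every input tree agrees on. The sketch below assumes rooted trees written as nested tuples and compares them by their clades (sets of tips descending from each internal node); the two five-tip trees are hypothetical.

```python
def clades(tree) -> set:
    """Set of clades (frozensets of tip names) defined by a rooted tree."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)   # every internal node defines one clade
        return tips

    visit(tree)
    return found


def strict_consensus_clades(trees) -> list:
    """Clades present in every input tree, smallest first."""
    shared = set.intersection(*(clades(t) for t in trees))
    return sorted(shared, key=lambda c: (len(c), sorted(c)))


if __name__ == "__main__":
    tree_1 = ((("a", "b"), "c"), ("d", "e"))
    tree_2 = ((("a", "b"), ("d", "e")), "c")
    for clade in strict_consensus_clades([tree_1, tree_2]):
        print(sorted(clade))
```

The two trees disagree about where c attaches, so that part collapses, while the a+b and d+e groupings survive as the stable parts of the tree.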

support for the method: do we believe phylogeny reconstruction works? need to test it against a known history. (fish(salamander(bird(mouse,human)))) is one we feel pretty strongly about. experimental phylogenetics uses virus evolution to go one step further.

experimental evolution: growing T7 phage on E. coli plates; speed up the mutation process by adding a mutagen. (figure label, repeated three times: 40 generations)

experimental evolution: so the phylogeny is known, and ancestral strains can be kept in the freezer. sequence part of the DNA and use parsimony, likelihood, and other approaches: they consistently got the right (TRUE) answer! can also track “traits” on this tree, e.g. changes in growth rate and plaque size on E. coli plates (and check against actual ancestors).

(figure label: # DNA mutations on this branch) Text: “Because constructing phylogenies, and science more broadly, is often a process of evaluating evidence, scientists often test the effectiveness of the methodologies used to draw conclusions.”

well-supported phylogeny of rabies virus lineages, coded by host bat species

For RNA viruses, rapid viral evolution and the biological similarity of closely related host species have been proposed as key determinants of the occurrence and long-term outcome of cross-species transmission. Using a data set of hundreds of rabies viruses sampled from 23 North American bat species, we present a general framework to quantify per capita rates of cross-species transmission and reconstruct historical patterns of viral establishment in new host species using molecular sequence data. These estimates demonstrate diminishing frequencies of both cross-species transmission and host shifts with increasing phylogenetic distance between bat species. Evolutionary constraints on viral host range indicate that host species barriers may trump the intrinsic mutability of RNA viruses in determining the fate of emerging host-virus interactions.
analysis indicates the rate of virus jumping from one host to another

so this study requires TWO phylogenies (virus and bats). CST: cross-species transmission