1
Methods of molecular phylogeny
Peter Norberg
2
Content
Introduction to evolution and taxonomy
Phylogenetic analysis
Algorithmics
Applied phylogenetics
Computer software
Practical session
3
Evolution
Charles Darwin
“Tree of life”
Phylogenetic tree
Root = ancestor of all species
4
Rooted or unrooted trees?
Trees show evolutionary relationships
The root shows the direction of evolution
5
Different representations
[Figure: the same tree of taxa A, B, C and D drawn in several different layouts]
6
Trees can be based on:
Outer appearance (for example, the shape of bills)
Functionality
Complexity
A combination of…
DNA, RNA, amino acids (AA), gene order…
7
Phylogenetic trees based on DNA
[Figure: phylogenetic tree of the example sequences AATTGGCC, AATAGGCC, AATAGGCA, AATAGGAC, AATTGGCG, AGTTGGCG and TATTGGCG]
8
Phylogenetic trees based on DNA
[Figure: the same example sequences, AATTGGCC, AATAGGCC, AATAGGAC, AATTGGCG, AGTTGGCG, TATTGGCG and AATAGGCA, arranged on the tree]
9
Genomic region
Same genomic region for all taxa!
Not too similar
Not too diverged
Insertions/deletions
10
Sequence alignment
Not aligned:
(1) AATGGCAACCGCATTCAGGATTTAA
(2) AATGGTAACCGCAAGGATTTAA
(3) ATGGTAACCGCATTGAGGATTTAA
(4) AATGGTAACCGCATTCAGGAATTA
(5) TGGTAACCGCATTCAGGATTTAA
Aligned:
(1) AATGGCAACCGCATTCAGGATTTAA
(2) AATGGTAACCGCAA---GGATTTAA
(3) -ATGGTAACCGCATTGAGGATTTAA
(4) AATGGTAACCGCATTCAGGAATTA-
(5) --TGGTAACCGCATTCAGGATTTAA
11
Sequence alignment, our example
AATTGGCC
AATAGGCC
AATTGGCG
AATAGGAC
AGTTGGCG
TATTGGCG
AATAGGCA
12
Phylogenetic principles
Similar DNA sequences = closely related
Inherited mutations.
Simplest “route”!
Homoplasy unlikely (not always true).
13
Homology vs. homoplasy
Homology = similarity due to a common ancestor
Homoplasy = similarity due to convergent evolution, i.e. independent origins
14
Algorithms for constructing phylogenetic trees
What is an algorithm?
Several different phylogenetic algorithms exist.
How do they work?
15
Algorithms for constructing phylogenetic trees
Distance matrices: Neighbour Joining, UPGMA
Maximum Parsimony
Maximum Likelihood
Bayesian inference
16
Distance matrices
Based on the genetic distance
Genetic distance based on nucleotide substitutions
Typically # of differences / total # of nt
Example sequences: (1) AATTCCGG (2) AATACCGG (3) AATTAATG
[Figure: pairwise distance matrix for sequences 1, 2 and 3]
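The "# of differences / total # of nt" measure can be computed directly from aligned sequences. The sketch below is a minimal illustration using the three example sequences from this slide; the function name p_distance and the labels 1 to 3 are assumptions made for the example.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Example sequences from the slide; the labels "1"-"3" are assumed.
sequences = {"1": "AATTCCGG", "2": "AATACCGG", "3": "AATTAATG"}

# One distance per unordered pair of sequences.
distances = {
    (x, y): p_distance(sequences[x], sequences[y])
    for x, y in combinations(sequences, 2)
}

for (x, y), d in distances.items():
    print(f"d({x},{y}) = {d:.3f}")
```

Sequences 1 and 2 differ at one of eight positions, so their distance is 1/8; the other two pairs differ at three and four positions.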
17
Neighbour Joining
Cluster in pairs
Shortest distance first
=> Similar sequences located closely together in the tree
Fast algorithm!
[Figure: distance matrix and resulting tree for taxa A, B, C and D]
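In neighbour joining, "shortest distance" is judged after correcting each pairwise distance for how far the two taxa are from all the others. The sketch below shows only this pair-selection step, using the standard Q criterion; the function name nj_pair_to_join and the five-taxon matrix are hypothetical and not taken from the slides.

```python
def nj_pair_to_join(names, dist):
    """Return the pair minimising the neighbour-joining Q criterion.

    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    """
    n = len(names)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (names[i], names[j]), q
    return best_pair, best_q


# Hypothetical symmetric distance matrix for five taxa.
names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair, q = nj_pair_to_join(names, dist)
print(pair, q)
```

In this matrix d and e have the smallest raw distance (3), yet a and b are joined first because their corrected value is lower. A full implementation would then replace the joined pair with a new node, recompute the distances and repeat until the tree is complete.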
18
Maximum Parsimony
Utilizes so-called informative sites.
Simplest path (fewest mutations)
Build all possible trees.
Choose the tree that requires the fewest mutations
Relatively fast
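The steps above can be sketched for four taxa, where only three tree shapes exist. The example counts mutations per tree with Fitch's algorithm, one common way to score a tree under parsimony; the four sequences and all function names are hypothetical and chosen for illustration.

```python
def informative_sites(seqs):
    """Indices of columns where at least two states each occur in two or more taxa."""
    result = []
    for index, column in enumerate(zip(*seqs.values())):
        repeated_states = [s for s in set(column) if column.count(s) >= 2]
        if len(repeated_states) >= 2:
            result.append(index)
    return result


def fitch_site(node, seqs, site):
    """Return (possible states, mutation count) for one column of a nested-tuple tree."""
    if isinstance(node, str):  # a leaf: its observed nucleotide, no mutations yet
        return {seqs[node][site]}, 0
    (left_set, left_cost), (right_set, right_cost) = (
        fitch_site(child, seqs, site) for child in node
    )
    common = left_set & right_set
    if common:  # children agree on at least one state: no extra mutation
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Total number of mutations the tree requires, summed over all columns."""
    length = len(next(iter(seqs.values())))
    return sum(fitch_site(tree, seqs, site)[1] for site in range(length))


# Hypothetical aligned sequences for taxa 1-4.
seqs = {"1": "AATTCC", "2": "TATTCC", "3": "AAGTCT", "4": "AAGTCT"}

# The three possible groupings of four taxa.
trees = [
    (("1", "2"), ("3", "4")),
    (("1", "3"), ("2", "4")),
    (("1", "4"), ("2", "3")),
]

scores = {tree: parsimony_score(tree, seqs) for tree in trees}
best_tree = min(scores, key=scores.get)
print(informative_sites(seqs), scores, best_tree)
```

Only the third and sixth columns are informative here: the first column varies, but it costs one mutation on every tree and so cannot separate them.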
19
Maximum Parsimony, example
[Figure: alternative trees for taxa 1 to 4, compared using the example sequences AATTCC, AAGTCC and AAGTCT]
20
Maximum Likelihood and Bayesian inference
Statistical method including an evolutionary model
Combine the likelihoods of all alignment columns
Calculate the likelihood for all possible trees
Good but slow!
Bayesian inference is faster
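To make the idea of a per-column likelihood concrete, the sketch below handles the simplest possible case: two sequences separated by a single branch, under the Jukes-Cantor (JC69) model. The choice of JC69, the function names and the grid search are assumptions for illustration; real programs also sum over unknown ancestral states at every internal node of each candidate tree.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, distance):
    """Log-likelihood of two aligned sequences under JC69.

    `distance` is the expected number of substitutions per site between
    the sequences and must be greater than zero.
    """
    decay = math.exp(-4.0 * distance / 3.0)
    p_same = 0.25 + 0.75 * decay  # both sequences show the same nucleotide
    p_diff = 0.25 - 0.25 * decay  # second sequence shows one specific other nucleotide
    total = 0.0
    for a, b in zip(seq_a, seq_b):
        # 0.25 is the equilibrium frequency of the nucleotide in seq_a;
        # column likelihoods multiply, so their logarithms are added.
        total += math.log(0.25 * (p_same if a == b else p_diff))
    return total


def jc69_distance(seq_a, seq_b):
    """Closed-form maximum-likelihood distance under JC69."""
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


seq_a, seq_b = "AATTCCGG", "AATACCGG"

# Crude search: try many branch lengths and keep the most likely one.
candidates = [i / 1000 for i in range(1, 1001)]
best = max(candidates, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))

print(best, jc69_distance(seq_a, seq_b))
```

The grid search and the closed-form estimate agree to within the grid spacing, which shows that the most likely branch length is simply the one that maximises the summed column log-likelihoods.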
21
To test all possible trees
Is it possible?
=> Takes too much time!!!!
Analyzing 20 taxa gives ~10^22 different possible trees
What to do?
=> Use sophisticated algorithms to limit the search space…..
These usually produce good results, but not necessarily the best
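The number of possible trees can be checked with the standard counting formulas for strictly bifurcating trees: (2n - 5)!! unrooted and (2n - 3)!! rooted trees for n taxa. The sketch below applies them; the function names are chosen for this example.

```python
def double_factorial(n):
    """Product n * (n - 2) * (n - 4) * ... down to 1 or 2."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def unrooted_tree_count(taxa):
    """Number of unrooted bifurcating trees for `taxa` >= 3 labelled tips."""
    return double_factorial(2 * taxa - 5)


def rooted_tree_count(taxa):
    """Number of rooted bifurcating trees for `taxa` >= 2 labelled tips."""
    return double_factorial(2 * taxa - 3)


for taxa in (4, 10, 20):
    print(taxa, f"{unrooted_tree_count(taxa):.2e}", f"{rooted_tree_count(taxa):.2e}")
```

For 20 taxa this gives roughly 2.2 x 10^20 unrooted and 8.2 x 10^21 rooted trees, so the figure of about 10^22 corresponds to the rooted count.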
22
To root an unrooted tree
Include an “outgroup”
Outgroup = more distantly related (but not too distantly)
Place the root where the outgroup connects to the tree
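Placing the root on the branch that leads to the outgroup can be sketched as follows. The unrooted tree is stored as a plain adjacency dictionary; the taxon names, the internal node names x, y and z, and the function root_with_outgroup are hypothetical.

```python
def root_with_outgroup(adjacency, outgroup):
    """Root an unrooted tree on the branch leading to `outgroup`.

    Returns a nested tuple (outgroup, ingroup_subtree).
    """
    (attach,) = adjacency[outgroup]  # a leaf has exactly one neighbour

    def subtree(node, parent):
        # Walk away from the root, never stepping back towards the parent.
        children = [n for n in adjacency[node] if n != parent]
        if not children:
            return node  # a leaf
        return tuple(subtree(child, node) for child in children)

    return (outgroup, subtree(attach, outgroup))


# Hypothetical unrooted tree: leaves A-D plus the outgroup "Out".
adjacency = {
    "A": ["x"], "B": ["x"], "C": ["y"], "D": ["z"], "Out": ["z"],
    "x": ["A", "B", "y"],
    "y": ["x", "C", "z"],
    "z": ["y", "D", "Out"],
}

rooted = root_with_outgroup(adjacency, "Out")
print(rooted)
```

The branching order among A, B, C and D is unchanged; only the direction of the tree is fixed, with the outgroup splitting off first.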
23
Rooting a tree
[Figure: an unrooted tree of taxa A to G and the same tree rooted with an outgroup]
24
Significance
Is the tree reliable?
Is it the only probable tree?
Bootstrap, jackknife etc.
25
Bootstrap
Construct several new sequence sets (e.g. 1000)
Each new sequence set is generated by randomly picking columns, with replacement, from the original set
Apply the phylogenetic algorithm to all sets.
Make one consensus tree from all trees
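The resampling step can be sketched as below. Columns are drawn with replacement, so each replicate has the original length but some columns appear several times and others not at all. The four aligned sequences, their labels and the function name bootstrap_replicate are assumptions for this example.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    picked = [rng.randrange(length) for _ in range(length)]
    # The same column indices are used for every taxon, so columns stay intact.
    return {
        name: "".join(seq[i] for i in picked)
        for name, seq in alignment.items()
    }


# Hypothetical labels; the sequences must already be aligned.
alignment = {
    "A": "AATTGGCC",
    "B": "AATAGGCC",
    "C": "AATTGGCG",
    "D": "AGTTGGCG",
}

rng = random.Random(42)  # fixed seed so the run is repeatable
replicates = [bootstrap_replicate(alignment, rng) for _ in range(1000)]
print(replicates[0])
```

Each replicate would then be passed to the tree-building method, and the resulting trees combined into one consensus tree.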
26
Bootstrapping
A: AACTTAACCACGCTATCGATGCAATTATATA
B: AATTTGACTGCGGTACCGATCCAATTATATA
C: AATTTGACTGGGCTACCGATCCAATTATATA
D: AACTTAACCGCGCTACTGATCGAATTATATA
Resampled columns:
A: CACC
B: TGCT
C: TGCT
D: CAGC
[Figure: the three possible trees for A, B, C and D, with bootstrap values 96, 1 and 3]
27
Pitfalls?
Homoplasy (convergent evolution) - Selection pressure
Hypervariable regions
Random events
Gene duplication
Recombination - Different regions have different ancestries
28
Recombination
[Figure: sequences A and B recombine to produce recombinants]
29
Detection of recombinants
[Figure: tree of taxa A to I and X]
30
Detection of recombinants
[Figure: taxa A to I shown twice, with H and X marked]
31
Phylogenetic networks
[Figure: trees and networks relating taxa A, B, C, D and R]
32
Applied phylogenetics
Reconstruct evolutionary history
Animals, plants, bacteria, viruses, plasmids, ……
Establish evolutionary mechanisms
Functional studies
Trace pandemic diseases
Forensic medicine
33
Examples
37
Practical session
38
Phylip
Software package for phylogenetic analysis
Several small (command-line) applications
Many different algorithms
Widely used by the scientific community
seqboot -> Constructs bootstrap sets
dnapars -> Constructs a maximum parsimony tree
consense -> Constructs a consensus tree
drawtree -> Draws the tree
39
Herpes Simplex Virus Type 1 & 2
Usually asymptomatic
Cause oral and genital lesions, encephalitis, meningitis and keratitis
Transmitted via direct contact
Lifelong infection in the sensory ganglia
HSV-1: 70-80%, HSV-2: 20-30%
~100 nm in diameter.
Capsid surrounded by envelope.
Different glycoproteins in envelope.
Photo by Linda M. Stannard, University of Cape Town.
40
HSV-1 US7 (Glycoprotein I)
41
Clinical samples