Published by Rodney Fowler. Modified over 7 years ago.

1
Brandon Andrews CS6030

2
What is a phylogenetic tree?
Goals in a phylogenetic tree generator
Distance-based method
Fitch-Margoliash method
Example
Verification
Demo

3
Also known as an evolutionary tree.
Attempts to map the genetic similarity of organisms into a tree in which longer branches indicate greater dissimilarity.
[Diagram: tree with leaves A, B, and C]
B and C are similar.
A and B are more similar than A and C, which are separated by a longer distance.

4
Given the sequences and their calculated or known dissimilarity, construct a tree that correctly maps this data.
Naïve method: generate every possible tree and grade its quality.
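To see why the naïve method does not scale, the short Python sketch below counts the distinct unrooted bifurcating trees on n labelled leaves, which is the double factorial (2n - 5)!!. The function name `count_unrooted_trees` is hypothetical and not part of the presentation.

```python
def count_unrooted_trees(n_leaves: int) -> int:
    """Count unrooted bifurcating trees on n labelled leaves: (2n - 5)!!."""
    if n_leaves < 3:
        return 1  # one or two leaves admit only a single tree shape
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, count_unrooted_trees(n))
```

Five sequences give 15 candidate trees, but ten already give more than two million, so grading every tree quickly becomes impractical.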

5
Take a distance matrix that stores the distance from every sequence to every other sequence.
Construct a tree that preserves these distances.
Most methods do not preserve the distances exactly.

6
A clustering algorithm that works bottom-up to create an unrooted tree.
Weights are used to help lower the error rate for long paths.
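The slides do not spell out the weighting. As an assumption, the sketch below uses the commonly cited Fitch-Margoliash least-squares criterion, in which each squared error is divided by the squared observed distance, so a fixed-size error on a long path counts for less. The function name `weighted_error` is hypothetical.

```python
def weighted_error(observed, fitted):
    """Sum of (observed - fitted)^2 / observed^2 over the upper triangle.

    Assumption: 1 / observed^2 weights, as in the usual statement of the
    Fitch-Margoliash criterion; the slides do not give the exact formula.
    """
    total = 0.0
    n = len(observed)
    for i in range(n):
        for j in range(i + 1, n):
            diff = observed[i][j] - fitted[i][j]
            total += diff * diff / (observed[i][j] ** 2)
    return total


if __name__ == "__main__":
    # The same absolute error of 2 on a short path and on a long path.
    print(weighted_error([[0, 10], [0, 0]], [[0, 8], [0, 0]]))
    print(weighted_error([[0, 100], [0, 0]], [[0, 98], [0, 0]]))
```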

7
Calculate a distance matrix.
Hamming distance can be used, but a better dissimilarity function is advised.

      A    B    C    D    E
A     0   22   39   39   41
B     0    0   41   41   43
C     0    0    0   18   20
D     0    0    0    0   10
E     0    0    0    0    0
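A minimal Python sketch of the Hamming-distance option, assuming the sequences are already aligned to the same length. The function names and the toy sequences are hypothetical; the matrix is stored upper-triangular, as on the slide.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs):
    """Upper-triangular matrix of pairwise Hamming distances."""
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = hamming(seqs[i], seqs[j])
    return matrix


if __name__ == "__main__":
    for row in distance_matrix(["ACGT", "ACGA", "TCGA"]):
        print(row)
```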

8
Add all the sequences to an array of nodes and mark them as leaves.
Select the closest nodes by scanning the distance matrix.
Those two nodes, D and E in our example, make up two of the branches in a 3-branch calculation that finds the branch lengths.
[Diagram: D and E joined to the group A, B, C by branches d, e, and abc]
dist(ABC, D) is the average distance from ABC to D.
dist(ABC, E) is the average distance from ABC to E.
d = (dist(D, E) + (dist(ABC, D) - dist(ABC, E))) / 2;
e = dist(D, E) - d;
abc = dist(ABC, D) - d;

9
dist(ABC, D) and dist(ABC, E): calculate each by taking the distances from A, B, and C to the node and averaging them.
d = (10 + (32.6… - 34.6…)) / 2 = 4
e = 10 - 4 = 6
abc = 32.6… - 4 = 28.6…

       ABC      D      E
ABC      0  32.6…  34.6…
D        0      0     10
E        0      0      0
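The 3-branch calculation can be sketched in Python as follows. The distances come from the example; the A-D, B-C, and B-D entries are inferred from the merged matrices on later slides and should be read as an assumption. The helper names `dist`, `avg_dist`, and `three_branch` are hypothetical.

```python
# Pairwise distances for the example (A-D, B-C, B-D are inferred values).
DIST = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20, ("D", "E"): 10,
}


def dist(x, y):
    """Symmetric lookup into the pairwise distance table."""
    if x == y:
        return 0
    return DIST.get((x, y), DIST.get((y, x)))


def avg_dist(group, node):
    """Average distance from every member of `group` to `node`."""
    return sum(dist(g, node) for g in group) / len(group)


def three_branch(x, y, rest):
    """Branch lengths for x, y, and the group `rest` meeting at one node."""
    to_x = avg_dist(rest, x)
    to_y = avg_dist(rest, y)
    branch_x = (dist(x, y) + (to_x - to_y)) / 2
    branch_y = dist(x, y) - branch_x
    branch_rest = to_x - branch_x
    return branch_x, branch_y, branch_rest


d, e, abc = three_branch("D", "E", ["A", "B", "C"])
print(round(d, 3), round(e, 3), round(abc, 3))
```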

10
Now we can create a new node with distance 28.6… and set D and E to their respective distances.
Since D and E are leaves, their distances are kept. If they were not leaves, the average of the child distances would be subtracted, as seen later.
[Diagram: D (4) and E (6) joined to the group A, B, C (28.6…)]

11
The final step in this iteration is to recalculate the nodes and the distance matrix.
The new merged node DE is appended to the end of the nodes array, and D and E are removed.
The distance matrix is updated with DE merged in and D and E removed:

       A    B    C   DE
A      0   22   39   40
B      0    0   41   42
C      0    0    0   19
DE     0    0    0    0
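A minimal Python sketch of the matrix update, assuming the distance to the merged node is the plain average of the distances to the two nodes being merged, which is what the numbers on the slides imply. The function name `merge_nodes` and the `frozenset` keying are hypothetical choices; the A-D, B-C, and B-D inputs are inferred from the merged matrices.

```python
def merge_nodes(labels, dist, x, y):
    """Replace nodes x and y with a merged node appended to the end.

    `dist` maps frozenset({a, b}) to the distance between a and b.
    """
    merged = x + y
    kept = [label for label in labels if label not in (x, y)]
    # Carry over distances between nodes that are not being merged.
    new_dist = {
        frozenset((a, b)): dist[frozenset((a, b))]
        for i, a in enumerate(kept)
        for b in kept[i + 1:]
    }
    # Distance to the merged node is the average of the two old distances.
    for other in kept:
        new_dist[frozenset((other, merged))] = (
            dist[frozenset((other, x))] + dist[frozenset((other, y))]
        ) / 2
    return kept + [merged], new_dist


pairs = {
    "AB": 22, "AC": 39, "AD": 39, "AE": 41, "BC": 41,
    "BD": 41, "BE": 43, "CD": 18, "CE": 20, "DE": 10,
}
dist0 = {frozenset(pair): value for pair, value in pairs.items()}
labels1, dist1 = merge_nodes(list("ABCDE"), dist0, "D", "E")
print(labels1, dist1[frozenset(("C", "DE"))])
```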

12
Look at the new distance matrix and find the closest pair, C and DE.
Now there is a special step. C is a leaf, so it gets the calculated distance. DE is not a leaf, so we need to subtract the average child distance from DE.
[Diagram: C and DE joined to the group A, B by branches c, de, and ab]
dist(AB, C) is the average distance from AB to C.
dist(AB, DE) is the average distance from AB to DE.
c = (dist(C, DE) + (dist(AB, C) - dist(AB, DE))) / 2;
de = dist(C, DE) - c;
ab = dist(AB, C) - c;

13
Merge A and B to calculate the average distances dist(AB, C) and dist(AB, DE):

      AB    C   DE
AB     0   40   41
C      0    0   19
DE     0    0    0

14
Average child distance example
Recursively take the average of each branch:
((5 + ((2 + (4 + 6) / 2) + 3) / 2) + 1) / 2 = 5.5
[Diagram: tree with branch lengths 4, 6, 3, 1, 2, and 5]
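The recursion can be sketched in Python. The tree encoding is an assumption made for illustration: a leaf is `None`, and an internal node is a pair of `(branch_length, child)` tuples. The function name `average_distance` is hypothetical.

```python
def average_distance(node):
    """Average distance from a node down to its leaves, computed recursively.

    A leaf is None; an internal node is ((length, child), (length, child)).
    """
    if node is None:
        return 0.0
    (left_len, left), (right_len, right) = node
    left_total = left_len + average_distance(left)
    right_total = right_len + average_distance(right)
    return (left_total + right_total) / 2


# The example expression: ((5 + ((2 + (4 + 6) / 2) + 3) / 2) + 1) / 2
inner = ((4, None), (6, None))
middle = ((2, inner), (3, None))
root = ((5, middle), (1, None))
print(average_distance(root))
```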

15
DE has two child nodes, so we need to subtract the average of the children. Since both children of DE are leaves, we compute:
(4 + 6) / 2 = 5
Now we calculate c, de, and ab:
c = (dist(C, DE) + (dist(AB, C) - dist(AB, DE))) / 2 = (19 + (40 - 41)) / 2 = 9
de = dist(C, DE) - c - AverageDistance(DE) = 19 - 9 - (4 + 6) / 2 = 5
ab = dist(AB, C) - c = 40 - 9 = 31
Notice that the distance at de replaces whatever was previously there.
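A small Python sketch of this step, with the subtraction for non-leaf nodes passed in as an optional height. The function name `branch_lengths` and its `height_x` and `height_y` parameters are hypothetical; the inputs are the numbers from the example.

```python
def branch_lengths(d_xy, d_rest_x, d_rest_y, height_x=0.0, height_y=0.0):
    """3-branch calculation with the average child distance subtracted.

    height_x and height_y are the average child distances of x and y,
    or 0 when the node is a leaf.
    """
    x = (d_xy + (d_rest_x - d_rest_y)) / 2
    y = d_xy - x
    rest = d_rest_x - x  # uses the uncorrected x, as on the slide
    return x - height_x, y - height_y, rest


# Second iteration: C is a leaf, DE has average child distance (4 + 6) / 2.
c, de, ab = branch_lengths(19, 40, 41, height_y=(4 + 6) / 2)
print(c, de, ab)
```

The same function covers the first iteration, where both nodes are leaves and the heights default to zero.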

16
With the new node added:
[Diagram: C (9) and DE (5, with leaves D 4 and E 6) joined to the group A, B (31)]
Recalculated distance matrix:

        A     B   CDE
A       0    22  39.5
B       0     0  41.5
CDE     0     0     0

17
As before, choose the next closest nodes by looking at the distance matrix. A and B are chosen.
Now a and b can be calculated directly since they are leaves, but notice that we are linking two trees at cde, so we need a special step to subtract the average distance.
[Diagram: A and B joined to CDE by branches a, b, and cde]
dist(CDE, A) is the average distance from CDE to A.
dist(CDE, B) is the average distance from CDE to B.
a = (dist(A, B) + (dist(CDE, A) - dist(CDE, B))) / 2 = 10
b = dist(A, B) - a = 12
cde = dist(CDE, A) - a = 29.5

18
So we compute 29.5 - AverageDistance(CDE):
29.5 - ((5 + (4 + 6) / 2) + 9) / 2 = 29.5 - 9.5 = 20
[Diagrams: the CDE subtree (C 9, DE 5, D 4, E 6); A (10) and B (12) joined to CDE with cde = 29.5; the combined tree with the connecting branch set to 20]

19
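So we have a completely defined unrooted tree. How do we root it?
Take the last branch (length 20) and divide it by two, giving 10 on each side of the root.
[Diagram: rooted tree with branch lengths C 9, DE 5, D 4, E 6, A 10, B 12, and the last branch halved to 10]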
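One way to write the rooted result down is the Newick format, which the slides do not use; the sketch below is only an illustration of splitting the last branch in half. The function name `root_on_last_branch` is hypothetical.

```python
def root_on_last_branch(left_subtree, right_subtree, last_branch):
    """Place the root at the midpoint of the last branch (Newick output)."""
    half = last_branch / 2
    return f"({left_subtree}:{half:g},{right_subtree}:{half:g});"


# Subtrees and branch lengths taken from the worked example.
newick = root_on_last_branch("(A:10,B:12)", "(C:9,(D:4,E:6):5)", 20)
print(newick)
```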

20
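Original:

      A    B    C    D    E
A     0   22   39   39   41
B     0    0   41   41   43
C     0    0    0   18   20
D     0    0    0    0   10
E     0    0    0    0    0

From the generated tree:

      A    B    C    D    E
A     0   22   39   39   41
B     0    0   41   41   43
C     0    0    0   18   20
D     0    0    0    0   10
E     0    0    0    0    0

Exact match.
An exact match is rare; the tree distances are usually off by a small amount.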
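The verification can be sketched in Python by summing branch lengths along the path between every pair of leaves and comparing with the original matrix. The edge list encodes the generated tree from the example; the internal node names `AB`, `CDE`, and `DE` and the function name `path_length` are hypothetical, and the A-D, B-C, and B-D entries of the original matrix are inferred from the merged matrices.

```python
from itertools import combinations

# Generated tree as weighted edges (internal node names are illustrative).
EDGES = [
    ("A", "AB", 10), ("B", "AB", 12), ("AB", "CDE", 20),
    ("C", "CDE", 9), ("DE", "CDE", 5), ("D", "DE", 4), ("E", "DE", 6),
]

ORIGINAL = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20, ("D", "E"): 10,
}


def path_length(edges, start, goal):
    """Sum of branch lengths on the unique path between two tree nodes."""
    adjacency = {}
    for u, v, w in edges:
        adjacency.setdefault(u, []).append((v, w))
        adjacency.setdefault(v, []).append((u, w))
    stack = [(start, None, 0)]
    while stack:
        node, parent, total = stack.pop()
        if node == goal:
            return total
        for neighbour, weight in adjacency[node]:
            if neighbour != parent:  # never walk back along the same edge
                stack.append((neighbour, node, total + weight))
    raise ValueError("nodes are not connected")


tree_dist = {pair: path_length(EDGES, *pair) for pair in combinations("ABCDE", 2)}
print(tree_dist == ORIGINAL)
```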

21
http://sirisian.com/javascript/CS6030Project.html

22
Distance-based methods such as the Fitch-Margoliash method quickly produce accurate trees when given an accurate distance matrix.

23
Bacardit, J., Krasnogor, N. Phylogenetic Trees [PPT document]. Retrieved from http://www.cs.nott.ac.uk/~jqb/G53BIO/Slides/Phylogenetic%20Trees.ppt
Louhisuo, K. (2004, May 4). Constructing phylogenetic trees with UPGMA and Fitch-Margoliash. Retrieved from http://www.niksula.cs.hut.fi/~klouhisu/Bioinfo/phyltree.pdf
