Fitch-Margoliash Algorithm



3 Fitch-Margoliash Algorithm
1. From the distance matrix, find the closest pair, e.g., A and B.
2. Treat the rest of the sequences as a single composite sequence. Calculate the average distance from A to all of the other sequences, and from B to all of the other sequences.
3. Use these values to calculate the distance a from A to the node joining it with B, and likewise the distance b for B.
4. Take A and B as a single composite sequence AB, calculate the average distances between AB and each of the other sequences, and make a new distance table from these values.
5. Identify the next pair of most closely related sequences and proceed as in step 1 to calculate the next set of branch lengths.
6. When necessary, subtract extended branch lengths to calculate the lengths of intermediate branches.
7. Repeat the entire procedure starting with all possible pairs of sequences: A and B, A and C, A and D, etc.
8. Calculate the predicted distances between each pair of sequences for each tree to find the tree that best fits the original data.
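Step 3 can be sketched in Python as follows. This is a minimal illustration, not code from the presentation: the function name `three_point_branches` and its parameter names are hypothetical, and the formula is the one the transcript later gives for a and b. The sample call uses the numbers from the D/E step of the worked example (distance 10, average distances of about 32.6 and 34.6 to the remaining sequences).

```python
def three_point_branches(d_ab, avg_a_rest, avg_b_rest):
    """Branch lengths for a pair A, B joined to a composite of all other sequences.

    d_ab       : distance between A and B
    avg_a_rest : average distance from A to every other sequence
    avg_b_rest : average distance from B to every other sequence
    Returns (a, b, c): a and b run from the joining node to A and B,
    c runs from the joining node to the composite.
    """
    a = (d_ab + avg_a_rest - avg_b_rest) / 2
    b = (d_ab + avg_b_rest - avg_a_rest) / 2
    c = avg_a_rest - a
    return a, b, c


# D/E step of the worked example (averages as quoted on the slide)
a, b, c = three_point_branches(10, 32.6, 34.6)
print(round(a, 2), round(b, 2), round(c, 2))
```

With these inputs the sketch gives a = 4, b = 6 and c of about 28.6, which the slide rounds to 29.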


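5 D and E are the closest sequences
Tree: D and E join at a node that connects to the composite A-C (A, B and C), with branch a to D, branch b to E, and branch c to A-C.
        A-C     D       E
A-C     -       32.6    34.6
D               -       10
E                       -
a = 4, b = 6, c = 29
Now let's recompute the complete distance matrix:
        A       B       C       DE
A       -       22      39      40
B               -       41      42
C                       -       19
DE                              -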
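The recomputation of the distance table can be sketched as below. This is an illustrative helper, not code from the presentation: `merge_pair` is a hypothetical name, and the table is stored as a dict keyed by `frozenset` pairs purely for convenience. The starting five-sequence matrix is not reproduced in this transcript, so the values in `raw` are an assumption, chosen because they reproduce every average and branch length quoted on the slides.

```python
def merge_pair(dist, x, y):
    """Return a new distance table in which x and y are replaced by the composite x+y.

    The distance from any other entry p to the composite is the plain average
    of d(p, x) and d(p, y), which is what the slides do.
    """
    merged = x + y
    others = sorted({label for pair in dist for label in pair} - {x, y})
    new = {}
    for idx, p in enumerate(others):
        for q in others[idx + 1:]:
            new[frozenset((p, q))] = dist[frozenset((p, q))]
        new[frozenset((p, merged))] = (
            dist[frozenset((p, x))] + dist[frozenset((p, y))]
        ) / 2
    return new


# ASSUMED original distances (not shown in the transcript)
raw = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20, ("D", "E"): 10,
}
dist = {frozenset(pair): value for pair, value in raw.items()}

table = merge_pair(dist, "D", "E")        # A, B, C, DE
table2 = merge_pair(table, "C", "DE")     # A, B, CDE
for pair, value in sorted(table.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), value)
```

Applying the helper a second time, to C and DE, yields the three-entry table used in the last step of the example.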

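6 C and DE are the closest sequences
Tree: C and DE join at a node that connects to the composite A-B, with branch a to C, branch b to DE, and branch c to A-B.
        AB      C       DE
AB      -       40      41
C               -       19
DE                      -
a = 9, b = 10, c = 31
Note that b is not just the length of the segment next to the joining node; it represents the complete distance from the connecting node to the leaves D and E. Subtracting the average of the D and E branches, (4 + 6)/2 = 5, leaves 10 - 5 = 5 for the intermediate branch.
Tree so far: C 9, intermediate branch to the DE node 5, D 4, E 6, A-B 31.
Now let's recompute the complete distance matrix:
        A       B       C-E
A       -       22      39.5
B               -       41.5
C-E                     -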

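7 Now we are in the trivial case of 3 sequences
Tree: A, B and the composite CDE join at a single node, with branch a to CDE, branch b to A, and branch c to B.
a = 29.5, b = 10, c = 12
As before, the 29.5 branch is not just the segment next to the joining node; it represents the complete distance from the connecting node to the leaves of CDE. Subtracting the average of the distances already assigned below that point, (9 + 10)/2 = 9.5, leaves 20 for the intermediate branch.
Final tree: A 10, B 12, intermediate branch between the AB node and the C/DE node 20, C 9, intermediate branch to the DE node 5, D 4, E 6.
This time we got the perfect tree. However, this is not always the case. The algorithm should be repeated with different initial pairings (different choices of A and B), and the resulting trees compared by the difference between the actual distances and the predicted distances (obtained by summing the lengths of the branches).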
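The comparison of actual and predicted distances can be sketched as follows. The branch lengths are the ones from the final tree above; the internal node names `n1`, `n2`, `n3` and the helper names are hypothetical. The `observed` values are an assumption, since the original matrix is not reproduced in this transcript, and the unweighted sum of squared differences is only one possible fit measure; the slides do not say which measure is used.

```python
from itertools import combinations

# Final tree: n1 joins A and B, n2 joins C and the D/E node, n3 joins D and E
edges = {
    ("A", "n1"): 10, ("B", "n1"): 12, ("n1", "n2"): 20,
    ("C", "n2"): 9, ("n2", "n3"): 5, ("D", "n3"): 4, ("E", "n3"): 6,
}


def build_adjacency(edge_lengths):
    """Turn the edge list into an undirected adjacency map."""
    adj = {}
    for (u, v), w in edge_lengths.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def path_length(adj, start, goal):
    """Sum of branch lengths on the unique path between two nodes of a tree."""
    stack = [(start, None, 0)]
    while stack:
        node, parent, total = stack.pop()
        if node == goal:
            return total
        for nxt, w in adj[node].items():
            if nxt != parent:
                stack.append((nxt, node, total + w))
    raise ValueError("nodes are not connected")


adj = build_adjacency(edges)
leaves = ["A", "B", "C", "D", "E"]
predicted = {pair: path_length(adj, *pair) for pair in combinations(leaves, 2)}

# ASSUMED original distances (not shown in the transcript)
observed = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20, ("D", "E"): 10,
}
sum_sq = sum((observed[p] - predicted[p]) ** 2 for p in observed)
print(predicted)
print("sum of squared deviations:", sum_sq)
```

For this tree every predicted distance equals the assumed observed one, so the deviation is zero, which is what "perfect tree" means here. For trees built from other initial pairings, the one with the smallest deviation would be preferred.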

8 Neighbour Joining Algorithm
Similar to Fitch-Margoliash, except that sequences are paired based on the effect of the pairing on the sum of the branch lengths of the tree. The general Neighbour Joining algorithm can be downloaded from ftp.virginia.edu/pub/fasta/GNJ

9 The Algorithm
1. The distances between pairs of objects are used to calculate the sum of the branch lengths for a star-like tree, i.e., a tree that has no preferred pairing of sequences.

10 2. Decompose the star-like tree by combining pairs of sequences. Using the same example as before, this gives:

11 3. Each possible sequence pair is chosen and the sum of the branch lengths of the corresponding tree is calculated. For the example: S_AB = 67.7, S_BC = 81, S_CD = 76, S_DE = 70, plus six other possibilities.
4. Choose the pair with the lowest sum, in this case S_AB.
5. Once the choice is made, calculate the branch lengths a and b and the average distance from AB to CDE using the FM method:
   1. a = [d_AB + (d_AC + d_AD + d_AE)/3 - (d_BC + d_BD + d_BE)/3]/2 = (22 + 39.7 - 41.7)/2 = 10
   2. b = [d_AB + (d_BC + d_BD + d_BE)/3 - (d_AC + d_AD + d_AE)/3]/2 = (22 + 41.7 - 39.7)/2 = 12

12 6. As in the Fitch-Margoliash method, a new distance table with A and B forming a single composite sequence is produced, and the algorithm is iterated from the beginning to find the next sequence pair and the next branch lengths.
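Steps 1 to 5 can be sketched in Python as below. The slides do not state the formula for the pair sums, so this sketch assumes the standard Saitou-Nei expression; the function names are hypothetical, and the distance values are an assumption because the original matrix is not reproduced in this transcript. Under that assumed formula S_AB, S_BC and S_CD match the values quoted above, while S_DE comes out near 72.7 rather than 70, so treat the formula as a plausible reconstruction rather than the presenter's exact method.

```python
from itertools import combinations

taxa = ["A", "B", "C", "D", "E"]

# ASSUMED original distances (not shown in the transcript)
raw = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20, ("D", "E"): 10,
}
dist = {frozenset(pair): value for pair, value in raw.items()}


def d(i, j):
    return dist[frozenset((i, j))]


def star_tree_length():
    """Step 1: in a star tree each leaf branch is counted in n - 1 pairwise distances."""
    return sum(dist.values()) / (len(taxa) - 1)


def pair_sum(i, j):
    """Step 3: sum of branch lengths when i and j are joined (assumed Saitou-Nei form)."""
    n = len(taxa)
    rest = [k for k in taxa if k not in (i, j)]
    to_pair = sum(d(i, k) + d(j, k) for k in rest) / (2 * (n - 2))
    among_rest = sum(d(k, m) for k, m in combinations(rest, 2)) / (n - 2)
    return to_pair + d(i, j) / 2 + among_rest


def branch_lengths(i, j):
    """Step 5: branch lengths from the new node to i and to j (FM three-point formula)."""
    rest = [k for k in taxa if k not in (i, j)]
    avg_i = sum(d(i, k) for k in rest) / len(rest)
    avg_j = sum(d(j, k) for k in rest) / len(rest)
    return (d(i, j) + avg_i - avg_j) / 2, (d(i, j) + avg_j - avg_i) / 2


s0 = star_tree_length()
sums = {pair: pair_sum(*pair) for pair in combinations(taxa, 2)}
best = min(sums, key=sums.get)          # step 4: pair with the lowest sum
a, b = branch_lengths(*best)
print(s0, best, round(sums[best], 1), round(a, 2), round(b, 2))
```

With the assumed matrix the star-tree sum is 78.5, and (A, B) has the lowest pair sum, about 67.7, with a = 10 and b = 12.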

13 Sources of information
So far, all methods shown computed the distance matrix between species from a set of aligned sequences (DNA or protein). There are many more sources of information:
– Complete genomes
– Restriction sites
– Non-coding DNA regions

14 Tree of life
Tree of life constructed from all species whose complete genome has been sequenced.

15 Summary
There are several methods to compute phylogenetic trees, and several sources of information.
You need to be familiar with several of them to appreciate their differences.
There are various guiding mechanisms for choosing how to build the trees, based on likelihood functions and information theory.
Get familiar with the Phylip package, as it is a standard one. Other programs exist.

