# Fitch-Margoliash (FM) Algorithm

## FM Algorithm

- Similar to UPGMA, but removes the molecular clock assumption (i.e., the distances from an internal node to its leaves may differ).
- Produces unrooted trees.

Algorithm (similar to UPGMA):

1. Add a leaf to the tree for each taxon.
2. Initially make each taxon its own cluster.
3. Find the two closest clusters and connect them with a new node in the tree (place the new node at the distances given by the 3-point formula, not at equal distance from both clusters as in UPGMA).
4. Repeat the previous step until all clusters are connected.

## FM and 3-point formula

Given three taxa i, j, k with distances d(i, j), d(i, k), d(j, k), where should the interior node m be placed to connect the taxa and preserve the distances?

[Figure: taxa i, j, k, each joined by a branch to the interior node m]
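The placement follows from requiring that each path through m reproduces the corresponding given distance. Writing the three unknown branch lengths as d(i,m), d(j,m), d(k,m) (the same notation the worked example in this presentation uses for its interior nodes), a short derivation would be:

$$
\begin{aligned}
d(i,m) + d(j,m) &= d(i,j) \\
d(i,m) + d(k,m) &= d(i,k) \\
d(j,m) + d(k,m) &= d(j,k)
\end{aligned}
$$

Adding the first two equations and subtracting the third isolates d(i,m); the other two branches follow by symmetry:

$$
\begin{aligned}
d(i,m) &= \tfrac{1}{2}\bigl(d(i,j) + d(i,k) - d(j,k)\bigr) \\
d(j,m) &= \tfrac{1}{2}\bigl(d(j,i) + d(j,k) - d(i,k)\bigr) \\
d(k,m) &= \tfrac{1}{2}\bigl(d(k,i) + d(k,j) - d(i,j)\bigr)
\end{aligned}
$$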

## FM Algorithm

Algorithm (similar to UPGMA):

1. Add a leaf to the tree for each taxon.
2. Initially make each taxon its own cluster.
3. Find the two closest clusters and connect them with a new node in the tree (place the new node at the distance given by the 3-point formula, where the points are clusters of taxa and we use the distance between clusters).
4. Repeat the previous step until all clusters are connected.

[Figure: two tree diagrams over the taxa x1, x2, x3, x4, x5]
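The steps above can be sketched in Python. This is only a minimal illustration of the procedure as these slides present it, not a reference implementation. The names `avg_dist`, `three_point`, `fitch_margoliash`, and the `depth` bookkeeping are assumptions introduced here. In particular, `depth` generalizes the way the slides later solve for the internal branches: the 3-point distance of a cluster is treated as an average over its leaves, so the part already accounted for inside the cluster is subtracted.

```python
from itertools import combinations


def avg_dist(c1, c2, d):
    """Average pairwise distance between two clusters (tuples of taxon names)."""
    total = sum(d[frozenset((x, y))] for x in c1 for y in c2)
    return total / (len(c1) * len(c2))


def three_point(dij, dik, djk):
    """Distances from i, j and k to the interior node that joins them."""
    return ((dij + dik - djk) / 2,
            (dij + djk - dik) / 2,
            (dik + djk - dij) / 2)


def fitch_margoliash(taxa, d):
    """Return {cluster: length of the branch directly above that cluster}.

    `d` maps frozenset({x, y}) to the distance between taxa x and y.
    Needs at least three taxa. Negative lengths are returned unchanged.
    """
    clusters = [(t,) for t in taxa]
    # Average distance from a cluster's own node to its leaves (0 for a leaf).
    # Assumption of this sketch: size-weighted averaging for nested clusters.
    depth = {c: 0.0 for c in clusters}
    branch = {}

    def attach(cluster, dist_to_new_node):
        # Subtract the part of the distance already inside the cluster.
        branch[cluster] = dist_to_new_node - depth[cluster]

    while len(clusters) > 3:
        # Closest pair of clusters under the average distance.
        i, j = min(combinations(clusters, 2),
                   key=lambda p: avg_dist(p[0], p[1], d))
        # Temporarily group every other taxon into one cluster.
        rest = tuple(t for c in clusters if c not in (i, j) for t in c)
        di, dj, _ = three_point(avg_dist(i, j, d),
                                avg_dist(i, rest, d),
                                avg_dist(j, rest, d))
        attach(i, di)
        attach(j, dj)
        merged = i + j
        depth[merged] = (len(i) * di + len(j) * dj) / len(merged)
        clusters = [c for c in clusters if c not in (i, j)] + [merged]

    # Three clusters left: one last application of the 3-point formula.
    i, j, k = clusters
    last = three_point(avg_dist(i, j, d), avg_dist(i, k, d), avg_dist(j, k, d))
    for cluster, dist in zip((i, j, k), last):
        attach(cluster, dist)
    return branch
```

The input is a dictionary keyed by `frozenset` pairs so that lookups do not depend on the order of the two taxa. Ties between equally close pairs are broken by list order, which is an arbitrary choice of this sketch.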

Apply the FM algorithm to the following distance matrix:
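
|   | B   | C    | D   | E    |
|---|-----|------|-----|------|
| A | .31 | 1.01 | .75 | 1.03 |
| B | -   | 1.00 | .69 | .90  |
| C | -   | -    | .61 | .42  |
| D | -   | -    | -   | .37  |

A and B are closest; temporarily group C-D-E and compute d(A, B), d(A, C-D-E), d(B, C-D-E) to apply the 3-point formula:

- d(A, B) = .31
- d(A, C-D-E) = 1/3(1.01 + .75 + 1.03) = .93
- d(B, C-D-E) = 1/3(1.00 + .69 + .90) = .863

By the 3-point formula, with X the interior node that joins A, B, and C-D-E:

- d(A, X) = 1/2(d(A,B) + d(A,C-D-E) - d(B,C-D-E)) = .1885
- d(B, X) = 1/2(d(B,A) + d(B,C-D-E) - d(A,C-D-E)) = .1215
- d(C-D-E, X) = 1/2(d(C-D-E,A) + d(C-D-E,B) - d(A,B)) = .7415

[Figure: A, B, and C-D-E joined at X with branch lengths .1885, .1215, and .7415]

The temporary group C-D-E, and its distance .7415 to X, is only used to help us group A and B.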
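The arithmetic of this first step can be reproduced with a few lines of Python. The helper `mean_to_group` and its `digits` parameter are assumptions introduced for illustration. The averages are rounded to three decimals because the slide appears to carry .863 (not .8633...) into the 3-point formula; without that rounding d(A, X) comes out near .1883, not .1885.

```python
# Distances from A and B, copied from the matrix in this example.
d = {
    ("A", "B"): 0.31, ("A", "C"): 1.01, ("A", "D"): 0.75, ("A", "E"): 1.03,
    ("B", "C"): 1.00, ("B", "D"): 0.69, ("B", "E"): 0.90,
}


def mean_to_group(taxon, group, digits=3):
    """Average distance from one taxon to a temporary group.

    Rounded to `digits` decimals (an assumption, to mirror the slide).
    """
    return round(sum(d[(taxon, g)] for g in group) / len(group), digits)


d_ab = d[("A", "B")]
d_a_cde = mean_to_group("A", "CDE")
d_b_cde = mean_to_group("B", "CDE")

# 3-point formula for the interior node X joining A, B and C-D-E.
a_x = (d_ab + d_a_cde - d_b_cde) / 2
b_x = (d_ab + d_b_cde - d_a_cde) / 2
cde_x = (d_a_cde + d_b_cde - d_ab) / 2
```

Note that `a_x + b_x` equals d(A, B) regardless of rounding, which is exactly the constraint the 3-point formula is meant to preserve.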

A and B are combined in a cluster for the rest of the algorithm, so we need to recompute the distances from A-B to the other clusters:

- d(A-B, C) = 1/2(1.01 + 1.00) = 1.005
- d(A-B, D) = 1/2(.75 + .69) = .72
- d(A-B, E) = 1/2(1.03 + .90) = .965

The updated table is:

|     | C     | D   | E    |
|-----|-------|-----|------|
| A-B | 1.005 | .72 | .965 |
| C   | -     | .61 | .42  |
| D   | -     | -   | .37  |

The partial tree so far is:

[Figure: A and B joined at an interior node, with branch lengths .1885 (A) and .1215 (B)]
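The table update after a merge can be sketched as a small function. The name `merge` and the `sizes` dictionary are hypothetical. The size-weighted mean used here is this sketch's formulation; it gives the same numbers as averaging over all original taxon pairs, which is what the slides do.

```python
def merge(table, sizes, i, j):
    """Return a new (table, sizes) with clusters i and j replaced by 'i-j'.

    `table` maps frozenset({x, y}) to a distance between clusters x and y;
    `sizes` maps each cluster name to the number of taxa it contains.
    """
    new = f"{i}-{j}"
    others = [c for c in sizes if c not in (i, j)]
    # Keep every entry that does not involve the two merged clusters.
    out = {pair: v for pair, v in table.items()
           if i not in pair and j not in pair}
    for k in others:
        # Size-weighted mean, equal to the plain average over taxon pairs.
        out[frozenset((new, k))] = (
            sizes[i] * table[frozenset((i, k))]
            + sizes[j] * table[frozenset((j, k))]
        ) / (sizes[i] + sizes[j])
    new_sizes = {c: sizes[c] for c in others}
    new_sizes[new] = sizes[i] + sizes[j]
    return out, new_sizes
```

Starting from the original five-taxon matrix with every size set to 1, `merge(table, sizes, "A", "B")` should yield the three A-B entries computed above, and the function can then be applied again to the resulting table for the next merge.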

Based on the updated table, D and E are closest; temporarily group A-B-C and compute d(D, E), d(D, A-B-C), d(E, A-B-C) to apply the 3-point formula:

- d(D, E) = .37
- d(D, A-B-C) = 1/3(.75 + .69 + .61) = .683
- d(E, A-B-C) = 1/3(1.03 + .90 + .42) = .783

By the 3-point formula, with Y the interior node that joins D, E, and A-B-C:

- d(D, Y) = 1/2(d(D,E) + d(D,A-B-C) - d(E,A-B-C)) = .135
- d(E, Y) = 1/2(d(E,D) + d(E,A-B-C) - d(D,A-B-C)) = .235
- d(A-B-C, Y) = 1/2(d(A-B-C,D) + d(A-B-C,E) - d(D,E)) = .548

[Figure: D, E, and A-B-C joined at Y with branch lengths .135, .235, and .548]

The temporary group A-B-C, and its distance .548 to Y, is only used to help us group D and E.

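D and E are combined in a cluster for the rest of the algorithm, so we need to recompute the distances from D-E to the other clusters:

- d(A-B, D-E) = 1/4(.75 + 1.03 + .69 + .90) = .8425
- d(A-B, C) = 1/2(1.01 + 1.00) = 1.005
- d(C, D-E) = 1/2(.61 + .42) = .515

The updated table is now:

|     | C     | D-E   |
|-----|-------|-------|
| A-B | 1.005 | .8425 |
| C   | -     | .515  |

The partial tree so far is:

[Figure: two separate subtrees. A (.1885) and B (.1215) are joined at one interior node; D (.135) and E (.235) are joined at another.]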

Based on the updated table, there are only three clusters, so just apply the 3-point formula, with Z the interior node that joins A-B, D-E, and C:

- d(A-B, Z) = 1/2(d(A-B,D-E) + d(A-B,C) - d(D-E,C)) = .66625
- d(D-E, Z) = 1/2(d(D-E,A-B) + d(D-E,C) - d(A-B,C)) = .17625
- d(C, Z) = 1/2(d(C,A-B) + d(C,D-E) - d(A-B,D-E)) = .33875

[Figure: A-B, D-E, and C joined at Z with branch lengths .66625, .17625, and .33875]
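As a quick sanity check on this step, each pair of branch lengths at Z should add back up to the corresponding entry of the three-cluster table. Taking d(A-B, Z) = .66625, d(D-E, Z) = .17625, and d(C, Z) = .33875:

$$
\begin{aligned}
d(A\text{-}B, Z) + d(D\text{-}E, Z) &= .66625 + .17625 = .8425 = d(A\text{-}B, D\text{-}E) \\
d(A\text{-}B, Z) + d(C, Z) &= .66625 + .33875 = 1.005 = d(A\text{-}B, C) \\
d(D\text{-}E, Z) + d(C, Z) &= .17625 + .33875 = .515 = d(C, D\text{-}E)
\end{aligned}
$$

With exactly three clusters the 3-point formula always reproduces the three distances exactly, so any mismatch here would indicate an arithmetic slip.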

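Now we need to expand the clusters A-B and D-E:

[Figure: the full tree. A (.1885) and B (.1215) meet at a node joined to Z by a branch of length a; D (.135) and E (.235) meet at a node joined to Z by a branch of length b; C is joined to Z by a branch of length .33875.]

We also need to compute the values for a and b. The cluster distances to Z are d(A-B, Z) = .66625 and d(D-E, Z) = .17625, and each is the average over the cluster's two leaves:

- d(A-B, Z) = 1/2(d(A,Z) + d(B,Z)) = 1/2(.1885 + a + .1215 + a) = .155 + a = .66625, so a = .51125
- d(D-E, Z) = 1/2(d(D,Z) + d(E,Z)) = 1/2(.235 + b + .135 + b) = .185 + b = .17625, so b = -.00875

The negative value for b is a cause for concern about the quality of the data. If we are confident of our data, then since -.00875 is close to 0, b would be set to 0.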

## FM Summary

- Distance-based algorithm that produces unrooted trees.
- Removes the assumption of a molecular clock, but does not give information about the root (common ancestor).
- To detect the root, one could introduce an extra taxon (an outgroup) that is more distantly related to the given taxa.