Maximum Parsimony (MP) Algorithm


1 Maximum Parsimony (MP) Algorithm

2 MP Algorithm
- Character-based algorithm: does not use distances, but works directly with the character information in the sequences
- A criticism of distance-based methods is that they do not exploit the structure of the sequences (each pair of sequences is collapsed to a single number, the distance)
- Main philosophy is "economy of substitutions": find the tree that requires the fewest mutations (maximum parsimony)

3 MP Algorithm
- The strategy
  - explore a number of possible trees
  - report the tree with the smallest score (most parsimonious)
- Need to be able to solve two problems
  - small parsimony problem: given a candidate tree, compute its parsimony score
  - large parsimony problem: efficiently generate viable candidate trees (cannot generate all of them because the number of possible trees explodes)
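To give a sense of the "tree explosion", the number of distinct unrooted binary trees on n labeled leaves is (2n-5)!! = 1 · 3 · 5 · ... · (2n-5) for n ≥ 3. The short sketch below evaluates that product; the function name `num_unrooted_trees` is only an illustrative choice and does not come from the slides.

```python
def num_unrooted_trees(n):
    """Count unrooted binary trees on n labeled leaves: (2n-5)!! for n >= 3."""
    if n < 3:
        return 1  # with one or two leaves there is only one possible tree
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
```

Already at 10 sequences there are more than two million candidate trees, which is why an exhaustive search quickly becomes impractical.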

4 Small Parsimony Problem
- Given a candidate tree, compute its parsimony score
- Consider a candidate tree for one-site sequences
  s1 = A, s2 = T, s3 = T, s4 = G, s5 = A
[Figure: candidate tree over the leaves A, T, T, G, A; internal node labels {A,T}, {A,G}, {T}, {A,G,T}]
Final Score = 3

5 Solving Small Parsimony Problem
- explore the tree bottom-up (from leaves to interior)
- for each internal node one level up
  - if the "labels" at the two child nodes have no symbols in common, assign as label at this node the union of both labels and penalize the tree one unit
  - if the "labels" at the two child nodes do have symbols in common, label this node with the common portion; no penalty
[Figure: children {A,G} and {C} give parent {A,G,C} (penalty); children {A,G} and {G,T} give parent {G} (no penalty)]
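The bottom-up labeling rule can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the tree is written as nested tuples, the function name `fitch` is hypothetical, and the topology (((s1,s2),s3),(s4,s5)) is an assumption chosen because it reproduces the labels and the score of 3 in the one-site example.

```python
def fitch(tree, states):
    """Return (label set, penalty) for one site of a rooted binary tree.

    tree   -- a leaf name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping each leaf name to its character at this site
    """
    if isinstance(tree, str):            # leaf: label is its own character
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch(left, states)
    rset, rcost = fitch(right, states)
    common = lset & rset
    if common:                           # shared symbols: keep them, no penalty
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1  # disjoint: take the union, penalize 1


# One-site example; the topology is an assumption (see the lead-in).
states = {"s1": "A", "s2": "T", "s3": "T", "s4": "G", "s5": "A"}
tree = ((("s1", "s2"), "s3"), ("s4", "s5"))
root_label, score = fitch(tree, states)
print(sorted(root_label), score)
```

Each internal node is visited once, so the cost of scoring a site grows linearly with the number of leaves.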

6 Solving Small Parsimony Problem
- For n-site sequences, run the algorithm independently for each site and add up the parsimony scores over all sites
- Consider a candidate tree for the following sequences
  s1 = ATC, s2 = ACC, s3 = GTA, s4 = GCA
[Figure: candidate tree over the leaves ATC, ACC, GTA, GCA with per-site label sets at the internal nodes]
Final Score = 4
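Extending the single-site rule to whole sequences only requires a loop over the alignment columns. The sketch below is illustrative: the names `fitch_site` and `parsimony_score` are hypothetical, and the topology ((s1,s2),(s3,s4)) is an assumption, since the tree drawn on the slide cannot be recovered from the transcript. Under that assumption the three sites cost 1, 2 and 1, matching the final score of 4.

```python
def fitch_site(tree, column):
    """Return (label set, penalty) for one site; leaves are sequence names."""
    if isinstance(tree, str):
        return {column[tree]}, 0
    lset, lcost = fitch_site(tree[0], column)
    rset, rcost = fitch_site(tree[1], column)
    common = lset & rset
    if common:
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1


def parsimony_score(tree, seqs):
    """Sum the per-site penalties over all sites of equal-length sequences."""
    n_sites = len(next(iter(seqs.values())))
    total = 0
    for i in range(n_sites):
        column = {name: seq[i] for name, seq in seqs.items()}
        total += fitch_site(tree, column)[1]
    return total


seqs = {"s1": "ATC", "s2": "ACC", "s3": "GTA", "s4": "GCA"}
tree = (("s1", "s2"), ("s3", "s4"))   # assumed topology
total = parsimony_score(tree, seqs)
print(total)
```

Because the sites are scored independently, the per-site calls could be run in parallel, as the slide suggests.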

7 Solving Large Parsimony Problem
- Efficiently generate viable candidate trees (cannot try all)
- Branch-and-bound approach
  - create a possible tree by some method; calculate its score
  - start building a tree from scratch, discarding partial trees that already cost more than the current best
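One way to realize the branch-and-bound idea is sketched below. It is an illustration under stated assumptions rather than the method used in the presentation: sequences are addressed by integer index, sequence 0 is fixed next to the root so that each unrooted tree is generated once, taxa are added one at a time on every edge of the partial tree, and the bound starts at infinity until the first complete tree is found (instead of coming from a separately built starting tree). Pruning is safe because adding a sequence to a tree can never lower its parsimony score. All function names are hypothetical.

```python
def fitch_site(tree, column):
    """Return (label set, penalty) for one site; leaves are integer indices."""
    if isinstance(tree, int):
        return {column[tree]}, 0
    lset, lcost = fitch_site(tree[0], column)
    rset, rcost = fitch_site(tree[1], column)
    common = lset & rset
    if common:
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1


def score(tree, seqs):
    """Parsimony score summed over all alignment columns."""
    return sum(fitch_site(tree, column)[1] for column in zip(*seqs))


def insertions(tree, leaf):
    """Yield every tree obtained by attaching leaf to one edge of tree."""
    yield (tree, leaf)                      # edge above this node
    if isinstance(tree, tuple):
        left, right = tree
        for t in insertions(left, leaf):
            yield (t, right)
        for t in insertions(right, leaf):
            yield (left, t)


def branch_and_bound(seqs):
    """Return (best score, best tree) for a list of at least two sequences."""
    n = len(seqs)
    best = {"score": float("inf"), "tree": None}

    def extend(subtree, next_leaf):
        tree = (0, subtree)
        cost = score(tree, seqs)
        if cost >= best["score"]:           # bound: this branch cannot win
            return
        if next_leaf == n:                  # all sequences placed
            best["score"], best["tree"] = cost, tree
            return
        for candidate in insertions(subtree, next_leaf):
            extend(candidate, next_leaf + 1)

    extend(1, 2)
    return best["score"], best["tree"]


best_score, best_tree = branch_and_bound(["ATC", "ACC", "GTA", "GCA"])
print(best_score, best_tree)
```

For the four example sequences, the search visits the three possible unrooted trees and keeps the one that groups s1 with s2 and s3 with s4.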

8 Solving Large Parsimony Problem
- Branch-and-bound approach
  http://artedi.ebc.uu.se/course/X3-2004/Phylogeny/Phylogeny-TreeSearch/Phylogeny-Search.html

9 MP Summary
- Character-based algorithm: uses the sequence data
- Produces unrooted trees
- Economy of substitution: the best tree is the one that requires the fewest substitutions
- Examines a number of possible trees in search of the best one

