
Genome Rearrangements

Basic Biology: DNA Genetic information is stored in deoxyribonucleic acid (DNA) molecules. A single DNA molecule is a sequence of nucleotides: – adenine (A) – cytosine (C) – guanine (G) – thymine (T) [Figure: a nucleotide, consisting of a nitrogenous base, a pentose sugar, and a phosphate, next to a DNA molecule]

Basic Biology: DNA Paired DNA strands are in reverse complementary orientation. – One in forward, 5’ to 3’ direction – The other in reverse, 3’ to 5’ direction Both strands are complementary. – A pairs with T – G pairs with C [Figure: forward strand running 5’ to 3’ paired with the reverse strand running 3’ to 5’] Image modified with the permission of the National Human Genome Research Institute (NHGRI), artist Darryl Leja.

Basic Biology: Genome The genome is the entire hereditary information of an organism. Genomes are partitioned into chromosomes. A chromosome can be linear (eukaryotes), or circular (prokaryotes). Image modified with the permission of the National Human Genome Research Institute (NHGRI), artist Darryl Leja.

The Human Karyogram Karyotype of a human male. Courtesy: National Human Genome Research Institute

Changes in Genomic Sequences Genomes of different species (even of closely related individuals) differ from one another. These differences are caused by – point mutations, in which only one nucleotide is changed, and – genome rearrangements, where multiple nucleotides are modified.

Point Mutations
Insertion: …ATGGCG… → …ATGTGCG…
Deletion: …ATGTGCG… → …ATGGCG…
Substitution: …ATGTGCG… → …ATGCGCG…
DNA sequence alignment showing matches, mismatches, and insertions/deletions:
…ATG-GCATGTGCGATGTGCG…
…ATGTGCATG-GCGATGCGCG…

Genome Rearrangements Reversal Translocation Fission Fusion

Levenshtein’s Edit Distance Let A and B be two sequences (genomes). The minimum number of edit operations that transforms A into B defines the edit distance, d_edit(A, B), between A and B. Possible edit operations: – point mutations – genome rearrangements

A Word Puzzle To transform a start word into a target word, change, add, or delete characters until the target is reached. Example: start “spices”, target “lice”:
spices → slices → slice → lice
spices → spice → slice → lice
How many steps do you need to transform – a republican into a democrat? – Google into Yahoo?

Edit Distance Using Point Mutations S1 = AGCTT, S2 = AGCCTG, S3 = ACAG
S1 to S2: AGCTT → AGCTG (T→G) → AGCCTG (insert C), so d_edit(S1,S2) = 2
S1 to S3: AGCTT → AGCTG (T→G) → AGCAG (T→A) → ACAG (delete G), so d_edit(S1,S3) = 3
S2 to S3: AGCCTG → AGCTG (delete C) → AGCAG (T→A) → ACAG (delete G), so d_edit(S2,S3) = 3
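The point-mutation edit distance on this slide can be checked with the standard dynamic-programming recurrence. The following is a minimal Python sketch, not part of the original slides; the function name `edit_distance` is an illustrative choice, and only insertions, deletions, and substitutions are counted (no rearrangements).

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    s1, s2, s3 = "AGCTT", "AGCCTG", "ACAG"
    for x, y in [(s1, s2), (s1, s3), (s2, s3)]:
        print(x, y, edit_distance(x, y))
```

Keeping only two rows of the table is enough when just the distance is needed; recovering the actual sequence of mutations would require the full table and a traceback.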

Edit Distance and Evolution The edit distance is often used to infer evolutionary relationships. Parsimony assumption: the minimum number of changes reflects the true evolutionary distance. [Figure: parsimonious phylogeny inferred from edit distances]

Rearrangements and Anagrams An anagram is a rearrangement of a word or phrase into another word or phrase. eleven plus two → twelve plus one forty five → over fifty Please visit the Internet Anagram web server at

Rearrangements and Anagrams Dot plot: “spendit” vs. “stipend” Dot plot: Mouse genome vs. Human genome

Genome Comparison: Human - Mouse Humans and mice have similar genomes, but their genes are in a different order. How many edits (rearrangements) are needed to transform human into mouse? Taken and modified from An Introduction to Bioinformatics Algorithms by Neil Jones and Pavel Pevzner

Transforming Mice into Humans a) Mouse and human share a common ancestor b) They share the same genes, but in a different order c) A series of rearrangements transforms one genome into the other

History of Chromosome X Rat Consortium, Nature, 2004

Dobzhansky’s Experiment Drosophila melanogaster life cycle taken from FlyMove Giant polytene chromosomes Modified from T.S. Painter, J. Hered. 25:465–476, Harvesting polytene chromosomes taken from BioPix4U

Dobzhansky’s Experiment Standard and Arrowhead arrangements differ by an inversion from segments 70 to 76 Figures taken from Dobzhansky T, Sturtevant AH. Genetics (1938), 23(1): Chromosome 3 of Drosophila pseudoobscura

Dobzhansky’s Experiment Figures taken from Dobzhansky T, Sturtevant AH. Genetics (1938), 23(1): Configurations observed in various inversion heterozygotes

Dobzhansky’s Experiment Figures taken from Dobzhansky T, Sturtevant AH. Genetics (1938), 23(1): Single and Double Inversions Phylogeny for 3rd chromosome of D. pseudoobscura

Unsigned Reversals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Unsigned Reversals 1, 2, 3, 8, 7, 6, 5, 4, 9, 10

Unsigned Reversals and Gene Orders [Figure: a gene order transformed step by step by the reversals r(1,2) and r(2,5)]
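An unsigned reversal r(i, j) simply reverses the block of elements between two positions. The sketch below is an illustration rather than code from the slides; it assumes positions are 1-based and inclusive, which matches how r(i, j) is used in the greedy algorithm later in the deck.

```python
def reversal(pi, i, j):
    """Apply r(i, j): reverse the elements at 1-based positions i..j (inclusive)."""
    if not 1 <= i <= j <= len(pi):
        raise ValueError("need 1 <= i <= j <= len(pi)")
    # Python slices are 0-based and end-exclusive, hence the i - 1.
    return pi[:i - 1] + pi[i - 1:j][::-1] + pi[j:]


if __name__ == "__main__":
    print(reversal([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 4, 8))
```

Applying the same reversal twice restores the original order, so every reversal is its own inverse.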

Reversal Edit Distance
Goal: Given two permutations, find the shortest series of reversals that transforms one into the other.
Input: Permutations π and σ
Output: A series of reversals r_1, …, r_t transforming π into σ such that t is minimum
t: reversal distance between π and σ
d_rev(π, σ): smallest possible value of t, given π and σ

Sorting by Reversals Problem
Goal: Given a permutation, find a shortest series of reversals that transforms it into the identity permutation (1 2 … n).
Input: Permutation π
Output: A series of reversals r_1, …, r_t transforming π into the identity permutation such that t is minimum
The Reversal Distance Problem and the Sorting by Reversals Problem are equivalent. Why?

Algorithm 1: GreedyReversalSort(π)
  for i ← 1 to n – 1
    j ← position of element i in π (i.e. π[j] = i)
    if j ≠ i
      π ← π · r(i, j)
      output π
    if π is the identity permutation
      return

GreedyReversalSort is Not Optimal
For π = 6 1 2 3 4 5 the algorithm needs 5 steps:
Step 0: 6 1 2 3 4 5
Step 1: i=1; j=2; r(1,2): 1 6 2 3 4 5
Step 2: i=2; j=3; r(2,3): 1 2 6 3 4 5
Step 3: i=3; j=4; r(3,4): 1 2 3 6 4 5
Step 4: i=4; j=5; r(4,5): 1 2 3 4 6 5
Step 5: i=5; j=6; r(5,6): 1 2 3 4 5 6
However, two reversals are enough:
Step 0: 6 1 2 3 4 5
Step 1: 5 4 3 2 1 6
Step 2: 1 2 3 4 5 6
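The greedy strategy can be traced in a few lines of Python. This is a sketch under stated assumptions, not the textbook's code: the function name is illustrative, and the input 6 1 2 3 4 5 is chosen because it is consistent with the five steps r(1,2) through r(5,6) listed on the slide.

```python
def greedy_reversal_sort(pi):
    """Move element i to position i for i = 1 .. n-1; return the reversals used."""
    pi = list(pi)
    steps = []
    for i in range(1, len(pi)):
        j = pi.index(i) + 1  # 1-based position of element i
        if j != i:
            pi[i - 1:j] = pi[i - 1:j][::-1]  # apply r(i, j)
            steps.append(((i, j), list(pi)))
        if pi == sorted(pi):
            break
    return steps


if __name__ == "__main__":
    for (i, j), state in greedy_reversal_sort([6, 1, 2, 3, 4, 5]):
        print(f"r({i},{j}):", *state)
```

Each iteration fixes one more prefix element, so the algorithm never needs more than n – 1 reversals, but as the slide shows it can use far more than the optimum.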

Adjacencies & Breakpoints An adjacency is a pair of adjacent elements that are consecutive (they differ by 1). A breakpoint is a pair of adjacent elements that are not consecutive. b(π) is the number of breakpoints in π. To count breakpoints at both ends as well, π is extended with π_0 = 0 and π_{n+1} = n + 1; the six-element example on the slide is extended with π_0 = 0 and π_7 = 7 and has b(π) = 4 breakpoints.
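Counting breakpoints is a single pass over the extended permutation. The sketch below is illustrative; the permutation 5 6 2 1 3 4 is an assumed example chosen because it has six elements and four breakpoints, not necessarily the one shown on the slide.

```python
def count_breakpoints(pi):
    """Number of adjacent pairs in 0, pi, n+1 whose values do not differ by 1."""
    ext = [0] + list(pi) + [len(pi) + 1]  # extend with pi_0 = 0 and pi_{n+1} = n + 1
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


if __name__ == "__main__":
    # Breakpoints here are (0,5), (6,2), (1,3), and (4,7).
    print(count_breakpoints([5, 6, 2, 1, 3, 4]))
```

Only the identity permutation has zero breakpoints, which is why b(π) is a natural progress measure for sorting by reversals.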

Reversal Distance and Breakpoints One reversal eliminates at most 2 breakpoints. In the slide's example, three successive reversals reduce the number of breakpoints from b(π) = 5 to 4, then to 2, then to 0. This implies: reversal distance ≥ b(π) / 2

Strips An interval between two consecutive breakpoints in a permutation is called a strip. – A strip is increasing if its elements increase. – Otherwise, the strip is decreasing. – A single-element strip is considered decreasing, with the exception of the strips [0] and [n+1].

Strips and Breakpoints Observation 1: If a permutation contains a decreasing strip, then there exists a reversal that decreases the number of breakpoints. Observation 2: Otherwise, create a decreasing strip by reversing an increasing strip; the number of breakpoints can then be reduced in the next step. [Figure: example reversals r(3,8) and r(6,8)]

Algorithm 2: BreakpointReversalSort(π)
  while b(π) > 0
    if π has a decreasing strip
      choose a reversal r that minimizes b(π · r)
    else
      choose a reversal r that flips an increasing strip in π
    π ← π · r
    output π
  return
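A direct, unoptimized rendering of BreakpointReversalSort in Python might look as follows. This is a sketch under assumptions rather than a reference implementation: all helper names are hypothetical, the best reversal is found by brute force over all position pairs, and in the case without decreasing strips the first increasing strip that excludes 0 and n+1 is flipped.

```python
def breakpoints(ext):
    """Count adjacent pairs of the extended permutation that are not consecutive."""
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def strips(ext):
    """Return (start, end) index pairs of the maximal runs between breakpoints."""
    runs, start = [], 0
    for k in range(1, len(ext)):
        if abs(ext[k] - ext[k - 1]) != 1:
            runs.append((start, k - 1))
            start = k
    runs.append((start, len(ext) - 1))
    return runs


def is_decreasing(ext, start, end):
    if start == end:
        # Single elements count as decreasing, except the framing 0 and n+1.
        return 0 < start < len(ext) - 1
    return ext[start] > ext[end]


def flip(ext, i, j):
    """Reverse positions i..j (inclusive) of the extended permutation."""
    return ext[:i] + ext[i:j + 1][::-1] + ext[j + 1:]


def breakpoint_reversal_sort(pi):
    """Return (list of reversals (i, j), sorted permutation)."""
    n = len(pi)
    ext = [0] + list(pi) + [n + 1]
    steps = []
    while breakpoints(ext) > 0:
        runs = strips(ext)
        if any(is_decreasing(ext, s, e) for s, e in runs):
            # Brute force over all reversals: fine for a sketch, cubic in n.
            candidates = [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)]
            i, j = min(candidates, key=lambda r: breakpoints(flip(ext, *r)))
        else:
            # All strips increase: flip the first one that excludes 0 and n+1.
            i, j = next((s, e) for s, e in runs if s > 0 and e < n + 1)
        ext = flip(ext, i, j)
        steps.append((i, j))
    return steps, ext[1:-1]


if __name__ == "__main__":
    print(breakpoint_reversal_sort([6, 1, 2, 3, 4, 5]))
```

Because position k in the extended list corresponds to the 1-based position k in π, the reported pairs can be read directly as reversals r(i, j).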

Performance Guarantee BreakpointReversalSort (BRS) is an approximation algorithm that will not use more than four times the minimum number of reversals. – BRS eliminates at least one breakpoint every two steps: d_BRS ≤ 2b(π) steps – An optimal algorithm eliminates at most two breakpoints every step: d_OPT ≥ b(π) / 2 steps Performance guarantee: d_BRS / d_OPT ≤ 2b(π) / (b(π) / 2) = 4

Gene Orientation & Genome Representation modified from

Genome Rearrangements

Signed Reversals
Before:
5’ ATGCCTGTACTA 3’
3’ TACGGACATGAT 5’
After break and invert:
5’ ATGTACAGGCTA 3’
3’ TACATGTCCGAT 5’
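On double-stranded DNA, inverting a segment replaces it on the forward strand by its reverse complement, which is why reversals change gene orientation. The following Python sketch reproduces the slide's example; the function names and the 0-based, end-exclusive coordinates are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Read the complementary strand in its own 5' to 3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(forward, start, end):
    """Invert forward[start:end] (0-based, end exclusive) on a double-stranded molecule."""
    return forward[:start] + reverse_complement(forward[start:end]) + forward[end:]


if __name__ == "__main__":
    before = "ATGCCTGTACTA"
    after = invert_segment(before, 3, 9)  # inverts the segment CCTGTA
    print("5'", after, "3'")
    print("3'", after.translate(COMPLEMENT), "5'")
```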

Signed Reversals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Signed Reversals 1, 2, 3, -8, -7, -6, -5, -4, 9, 10

Signed Reversals and Breakpoints 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 The reversal introduced two breakpoints.
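For signed permutations, a reversal both reverses the block and flips every sign in it. The sketch below is illustrative and makes two assumptions that are not spelled out on the slide: positions are 1-based and inclusive, and a pair (a, b) of neighbors in the extended permutation counts as an adjacency exactly when b − a = 1.

```python
def signed_reversal(pi, i, j):
    """Reverse positions i..j (1-based, inclusive) and negate every element in the block."""
    return pi[:i - 1] + [-x for x in reversed(pi[i - 1:j])] + pi[j:]


def signed_breakpoints(pi):
    """Count neighbors (a, b) in 0, pi, n+1 with b - a != 1."""
    ext = [0] + list(pi) + [len(pi) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if b - a != 1)


if __name__ == "__main__":
    after = signed_reversal(list(range(1, 11)), 4, 8)
    print(after, signed_breakpoints(after))
```

Under this convention the pair (-8, -7) is still an adjacency, so only the two cut points (3, -8) and (-4, 9) count as breakpoints.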

Summary: Complexity Results Sorting by unsigned reversals: – NP-hard – can be approximated within a constant factor Sorting by signed reversals: – can be solved in polynomial time

Web Tools GRIMM Web Server – computes signed and unsigned reversal distances between permutations. Cinteny – a web server for synteny identification and the analysis of genome rearrangement

DCJ Genome Rearrangements The DCJ model uses Double-Cut-and-Join genome rearrangement operations. DCJ operations break and rejoin one or two intergenic regions (possibly on different chromosomes).

Genome Representation In the DCJ model, a genome is grouped into chromosomes (linear/circular). A gene g on the forward strand is represented by [-g,+g]. A gene g on the reverse strand is represented by [+g,-g]. Telomeres are represented by the special symbol ‘o’. An adjacency (intergenic region) is encoded by the unordered pair of neighboring gene/telomere ends. Example: linear c1 = (o 1 -2 3 4 o), circular c2 = (5 6 7)
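The adjacency encoding can be derived mechanically from a list of signed genes. In the Python sketch below, the function name, the use of 0 for the telomere symbol ‘o’, and the example genome (o 1 -2 3 4 o) (5 6 7) are illustrative assumptions. It relies on the slide's convention that a signed gene x has left end -x and right end +x, whichever strand it lies on.

```python
TELOMERE = 0  # stands in for the slides' symbol 'o'


def adjacencies(genome):
    """genome: list of (genes, is_circular) pairs; returns (left_end, right_end) pairs."""
    result = []
    for genes, circular in genome:
        if circular:
            # Each gene's right end meets the next gene's left end, wrapping around.
            for x, y in zip(genes, genes[1:] + genes[:1]):
                result.append((x, -y))
        else:
            result.append((TELOMERE, -genes[0]))
            for x, y in zip(genes, genes[1:]):
                result.append((x, -y))
            result.append((genes[-1], TELOMERE))
    return result


if __name__ == "__main__":
    genome = [([1, -2, 3, 4], False), ([5, 6, 7], True)]
    print(adjacencies(genome))
```

The pairs are returned in chromosome order for readability; since adjacencies are unordered, (1, 2) and (2, 1) denote the same intergenic region.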

DCJ Operations The double-cut-and-join operation “breaks” two adjacencies and rejoins the fragments: {a, b} {c, d} → {a,d} {c,b}, or {a,c} {b,d}. a, b, c, and d represent different (signed) gene ends or telomeres (with ‘+o’ = ‘-o’). A special case occurs for c=d=o: {a,b} {o,o} ↔ {a,o} {b,o}.
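A single DCJ operation can be expressed as a small update on a set of adjacencies. The following Python sketch is an illustration, not the authors' code: the names `norm` and `dcj`, the `variant` parameter, and the use of 0 for ‘o’ are assumptions. Adjacencies are stored as sorted tuples so that {a, b} and {b, a} compare equal, and the pair {o, o} is treated as virtual, never stored.

```python
TELOMERE = 0  # stands in for the slides' symbol 'o'


def norm(x, y):
    """Canonical form of the unordered pair {x, y}."""
    return tuple(sorted((x, y)))


def dcj(adjs, first, second, variant=1):
    """Cut {a,b} and {c,d}; rejoin as {a,d},{c,b} (variant 1) or {a,c},{b,d} (variant 2)."""
    a, b = first
    c, d = second
    result = set(adjs) - {norm(a, b), norm(c, d)}
    joined = [(a, d), (c, b)] if variant == 1 else [(a, c), (b, d)]
    for x, y in joined:
        if (x, y) != (TELOMERE, TELOMERE):  # {o,o} is not a real adjacency
            result.add(norm(x, y))
    return result


if __name__ == "__main__":
    adjs = {norm(1, 2), norm(-2, -3)}
    # {1,2} {-2,-3} -> {1,-2} {2,-3}: a signed reversal of gene 2
    print(sorted(dcj(adjs, (1, 2), (-3, -2))))
```

Passing `(TELOMERE, TELOMERE)` as the second adjacency covers the special case {a,b} {o,o} → {a,o} {b,o}, which splits a chromosome at one intergenic region.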

Signed reversal of genes 2 and 3

Chromosome Linearization

Weird genome transformation

Using Graphs to Sort Genomes The adjacency graph AG(A,B) = (V,E) is a bipartite graph. V contains one vertex for each adjacency of genome A and of genome B. Each gene g defines two edges: e_1, connecting the adjacencies of A and B that contain +g, and e_2, connecting the adjacencies of A and B that contain -g. Example: genome A: (o 1 -2 3 4 o) (5 6 7), genome B: (o 1 2 3 o) (o 4 5 6 7 o)

Using Graphs to Sort Genomes
Algorithm 3: DCJSORT(A, B)
  generate the adjacency graph AG(A, B) of A and B
  for each adjacency {p, q} with p, q ≠ o in genome B do
    let u = {p, l} be the vertex of A that contains p
    let v = {q, m} be the vertex of A that contains q
    if u ≠ v then
      replace vertices u and v in A by {p, q} and {l, m}
      update edge set
    end if
  end for
  for each telomere {p, o} in B do
    let u = {p, l} be the vertex of A that contains p
    if l ≠ o then
      replace vertex u in A by {p, o} and {o, l}
      update edge set
    end if
  end for
Example:
genome A: (o 1 -2 3 4 o) (5 6 7)
genome B: (o 1 2 3 o) (o 4 5 6 7 o)
DCJ1: {1,2} {-2,-3} → {1,-2} {2,-3}
DCJ2: {4,o} {7,-5} → {4,-5} {7,o}
DCJ3: {3,-4} {o,o} → {3,o} {o,-4}
A → DCJ1 → DCJ2 → DCJ3 → B
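The two loops of DCJSORT can be sketched in Python without building the graph explicitly, by storing for every gene end of A the end it is currently joined to. Everything in this block is an illustrative assumption rather than the authors' implementation: the helper names, the use of 0 for ‘o’, the partner-map representation, and the example genomes A = (o 1 -2 3 4 o) (5 6 7) and B = (o 1 2 3 o) (o 4 5 6 7 o), which are consistent with the three DCJ operations listed on the slide.

```python
TELOMERE = 0  # stands in for the slides' symbol 'o'


def adjacencies(genome):
    """genome: list of (genes, is_circular); returns (left_end, right_end) pairs."""
    result = []
    for genes, circular in genome:
        if circular:
            result += [(x, -y) for x, y in zip(genes, genes[1:] + genes[:1])]
        else:
            result.append((TELOMERE, -genes[0]))
            result += [(x, -y) for x, y in zip(genes, genes[1:])]
            result.append((genes[-1], TELOMERE))
    return result


def partner_map(genome):
    """Map every gene end to the end it is adjacent to (TELOMERE if none)."""
    partner = {}
    for a, b in adjacencies(genome):
        if a != TELOMERE:
            partner[a] = b
        if b != TELOMERE:
            partner[b] = a
    return partner


def dcj_sort(genome_a, genome_b):
    """Return (operations, final partner map); each operation is (cut1, cut2, join1, join2)."""
    partner = partner_map(genome_a)
    target = adjacencies(genome_b)
    ops = []
    for p, q in target:  # first loop: adjacencies {p, q} of B without telomeres
        if TELOMERE in (p, q):
            continue
        l, m = partner[p], partner[q]
        if l != q:  # u != v: p and q are not yet adjacent in A
            ops.append(((p, l), (q, m), (p, q), (l, m)))
            partner[p], partner[q] = q, p
            if l != TELOMERE:
                partner[l] = m
            if m != TELOMERE:
                partner[m] = l
    for a, b in target:  # second loop: telomeres {p, o} of B
        if TELOMERE not in (a, b):
            continue
        p = a if a != TELOMERE else b
        l = partner[p]
        if l != TELOMERE:
            ops.append(((p, l), (TELOMERE, TELOMERE), (p, TELOMERE), (TELOMERE, l)))
            partner[p] = partner[l] = TELOMERE
    return ops, partner


if __name__ == "__main__":
    A = [([1, -2, 3, 4], False), ([5, 6, 7], True)]
    B = [([1, 2, 3], False), ([4, 5, 6, 7], False)]
    for op in dcj_sort(A, B)[0]:
        print(op)
```

Visiting B's adjacencies in chromosome order happens to reproduce the slide's order DCJ1, DCJ2, DCJ3 on this example; a different visiting order would give a different scenario with the same number of operations here.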

Summary: Complexity Results Sorting by unsigned reversals: – NP-hard – can be approximated within a constant factor Sorting by signed reversals: – can be solved in polynomial time Sorting by DCJ rearrangements: – can be solved in polynomial time

The End

Disclaimer Our presentation is in many parts inspired by the textbook An Introduction to Bioinformatics Algorithms by Neil Jones and Pavel Pevzner, by lectures from Anne Bergeron and Julia Mixtacki, as well as many review articles from multiple colleagues.