DNA computing Solving Optimization problems on a DNA computer Ka-Lok Ng Dept. of Bioinformatics Taichung Healthcare and Management University.

Content
1. Why consider DNA computing?
2. Basic molecular biology & basic DNA operations
3. Molecular computation - solved problems
   3.1 Hamiltonian Path Problem
   3.2 Boolean formula
   3.3 Integer knapsack problem
4. Limitations and errors
5. Prospective

Why consider DNA computing?
Essences of computation
1. Massive parallelism: a test tube can contain many DNA strands, and each reaction takes place independently.
2. Number of operations per second: a silicon-based computer is much better (~ 10 operations/sec), and DNA computing needs human intervention.
3. Extremely large associative memory:
   Memory density ~ 1 bit/nm^3, far higher than video tape at ~ 1 bit per 10^12 nm^3.
   Human synapses ~ 10^14, each storing a few bits.
   Associative memory retrieves a strand by matching a sub-sequence: given stored strands such as 000000… and an input pattern *1*1…, the matching strand (the 2nd one in the slide's figure) is retrieved and read.

Basic Molecular Biology
The DNA double helix
  5'- A G T C C … -3'
  3'- T C A G G … -5'
Purines (double-ring structure): adenine (A), guanine (G)
Pyrimidines (single-ring structure): thymine (T), cytosine (C)
Watson-Crick complements: (A,T) and (C,G), held together pairwise by hydrogen bonding
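The pairing rule can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and strands are modelled as plain strings read 5' to 3'.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Base-by-base complement. If the input is read 5'->3',
    the output is the partner strand read 3'->5'."""
    return "".join(PAIR[base] for base in strand)

def reverse_complement(strand):
    """The partner strand rewritten in the usual 5'->3' direction."""
    return complement(strand)[::-1]

def anneals(strand_a, strand_b):
    """True if two 5'->3' strands are perfect Watson-Crick partners."""
    return strand_b == reverse_complement(strand_a)

print(complement("AGTCC"))          # partner of the strand in the figure
print(anneals("AGTCC", "GGACT"))
```

Perfect matching is an idealization; real hybridization also tolerates some mismatches.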

Basic DNA Operations
1. DNA synthesiser: makes arbitrary DNA strands, time ~ hours.
   Notation: ATGC = 5'-ATGC-3', (ATGC)^C = 3'-TACG-5'
2. Hybridization (annealing): complementary strands pair by hydrogen bonding, time ~ 30 sec; it is a second-order kinetic reaction.
3. Denaturation: heat until even the longest strands become unstable, dsDNA → ssDNA.
4. Ligation: the ligase enzyme joins strands x and y end to end while they are held adjacent on the complementary strand x^C y^C.

Basic DNA Operations
5. Polymerase extension: the polymerase enzyme attaches to the 3' end of a primer annealed to a longer strand x and extends the primer, constructing the complement x^C of the longer strand.

Basic DNA Operations
6. Cut: Type II restriction enzymes (endonucleases) cut an ssDNA or dsDNA strand at a specific sub-sequence, usually 4 to 8 nucleotides long.

Nomenclature   Specific site     Expected freq. (one site per this many bases)
EcoRI          5'-GAATTC-3'      4^6 = 4096
HaeIII         5'-GGCC-3'        4^4 = 256
PstI           5'-CTGCAG-3'      4^6 = 4096

Enzyme   Site (top / bottom strand)   Cut end
HaeIII   GGCC / CCGG                  blunt
PstI     CTGCAG / GACGTC              3'-protruding
EcoRI    GAATTC / CTTAAG              5'-protruding

Basic DNA Operations
7. Merge: combine two or more test tubes of DNA solution.
8. Separation by length: gel electrophoresis.
   Agarose gel electrophoresis (AGE): long sequences, 300 ~ 50,000 bp, t ~ 5 hours
   Polyacrylamide gel electrophoresis (PAGE): short sequences, 1 ~ 1000 bp, t ~ 1 hour
   DNA migrates through the gel from the - electrode toward the + electrode; the shortest strands travel farthest.
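The cut operation and its expected site frequency can be illustrated with a small sketch. It assumes a random sequence in which the four bases are equally likely, so a site of length k is expected about once every 4^k bases. The helper names are hypothetical, and the test sequence is simply the three recognition sites from the slide joined together.

```python
# Recognition sites from the slide, written 5'->3'.
SITES = {"EcoRI": "GAATTC", "HaeIII": "GGCC", "PstI": "CTGCAG"}

def expected_spacing(site):
    """Average number of bases between occurrences of `site` in a
    uniformly random sequence: 4 ** len(site)."""
    return 4 ** len(site)

def find_sites(sequence, site):
    """0-based start positions of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions

sequence = "GGCCCTGCAGGAATTC"  # HaeIII + PstI + EcoRI sites back to back
for enzyme, site in SITES.items():
    print(enzyme, expected_spacing(site), find_sites(sequence, site))
```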

High Level Manipulations
1. Polymerase Chain Reaction (PCR): amplification, developed by Kary Mullis (Nobel Prize in Chemistry, 1993).
   Prepare primers that anneal to the ends of the template strands x y z and x^C y^C z^C.

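High Level Manipulations
1. Polymerase Chain Reaction (PCR), continued
   Cycle: polymerase extension → dsDNA → melting → polymerase extension → …
   After melting, each template strand (x y z and its complement) is copied by polymerase starting from its primer. Repeating the two processes (extension and melting) amplifies the template.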

High Level Manipulations
2. Separation by sub-sequence: affinity purification with magnetic beads. A primer s^C with an attached bead anneals to strands containing the short sequence s, and a magnet pulls those strands out.
3. Append
   3.1 By polymerase
   3.2 By ligation

High Level Manipulations
4. Mark: tag strands so that they can be separated or operated on selectively.
   4.1 Appending a tag sequence to the ssDNA
   4.2 Methylation or (de)phosphorylation of the 5' end, carried out by specific enzymes; it can stop some restriction enzymes from cutting at the site
   4.3 Forming dsDNA through hybridization or the action of polymerase
5. Unmark: remove the mark from the strand.
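Two of the high-level manipulations can be modelled very roughly in code. In this sketch a test tube is a Python list of strings, separation by sub-sequence is a substring filter, and PCR is idealized as exact doubling per cycle. The doubling rate is an assumption of the sketch, not a figure from the slides, and the function names are invented.

```python
def separate_by_subsequence(tube, s):
    """Idealized affinity purification: strands containing `s` stick to
    the bead-attached probe, everything else stays behind."""
    kept = [strand for strand in tube if s in strand]
    rest = [strand for strand in tube if s not in strand]
    return kept, rest

def ideal_pcr_copies(initial_copies, cycles):
    """Copy count after `cycles` rounds, assuming every template is
    duplicated in every cycle (real efficiency is lower)."""
    return initial_copies * 2 ** cycles

tube = ["AAGGCC", "TTTT", "GGCCAA"]
kept, rest = separate_by_subsequence(tube, "GGCC")
print(kept, rest)
print(ideal_pcr_copies(1, 10))
```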

Molecular Computation - Solved problems
Hamiltonian Path Problem
3.1 Directed Hamiltonian Path Problem (DHPP)
Adleman, Science, 266, 1024 (1994)
Problem: given 7 cities, is there a path that visits every city exactly once?
A possible solution: 0 → 1 → 2 → 3 → 4 → 5 → 6
Algorithm:
1. For each vertex V and edge E, create a 20-mer DNA strand.
   V: x_0 y_0, x_1 y_1, …, x_6 y_6, plus x_0^c and y_6^c
   E: y_0^c x_1^c, y_1^c x_2^c, …, y_5^c x_6^c, y_0^c x_3^c, y_0^c x_6^c, …
   In each vertex strand, x_i is the "in" half and y_i the "out" half, so the edge strand y_i^c x_j^c links the out half of vertex i to the in half of vertex j.

Molecular Computation - Solved problems
Hamiltonian Path Problem
Algorithm (DHPP)
2. Hybridization: the following kinds of DNA strands can form.
   Does not begin with x_0, ends with y_6: (1 → 2 → 3 → 4 → 5 → 6)
   Begins with x_0, does not end with y_6: (0 → 1 → 2 → 3 → 4 → 5)
   Begins with x_0 and ends with y_6, but does not visit every city once: (0 → 3 → 2 → 3 → 4 → 5 → 6); this is considered noise
   Begins with x_0, ends with y_6, and visits every city once: (0 → 1 → 2 → 3 → 4 → 5 → 6)

Molecular Computation - Solved problems
Hamiltonian Path Problem
Algorithm (DHPP)
3. Separation: select the dsDNA that starts with x_0 and ends with y_6, and use PCR to amplify this type of strand.
4. Separate out all dsDNA that goes through exactly 7 vertices (140-mer) by PAGE, then amplify by PCR. For N < 150, the migration distance d of an N-mer follows d = a - b ln N, where a and b are constants.

Molecular Computation - Solved problems
Hamiltonian Path Problem
Algorithm (DHPP)
5. Separate out all DNA strands that go through all 7 cities by affinity purification: melt the dsDNA strands from the previous step, then extract with each bead-attached probe in turn (y_0^c x_1^c with attached bead, …, y_5^c x_6^c with attached bead).
6. Detect whether any DNA strands remain: yes → the DHPP has a solution; no → the DHPP has no solution.
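The filtering logic of steps 2 to 6 can be simulated in software. The sketch below is an illustration under stated assumptions, not Adleman's protocol: hybridization is replaced by exhaustive enumeration of walks up to a length cap, and the edge set is hypothetical, built from the edges named on the slides (the chain 0 → 1 → … → 6, plus 0 → 3 and 0 → 6) and the 3 → 2 edge implied by the noise example.

```python
# Hypothetical edge set for the 7-city example (see assumptions above).
EDGES = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6),
         (0, 3), (0, 6), (3, 2)}
N_CITIES = 7

def all_walks(edges, n, max_vertices):
    """Stand-in for hybridization: every walk with at most
    `max_vertices` vertices, each represented as a tuple of cities."""
    successors = {v: sorted(w for (u, w) in edges if u == v) for v in range(n)}
    walks, stack = [], [(v,) for v in range(n)]
    while stack:
        walk = stack.pop()
        walks.append(walk)
        if len(walk) < max_vertices:
            stack.extend(walk + (w,) for w in successors[walk[-1]])
    return walks

def dhpp(edges, n):
    tube = all_walks(edges, n, max_vertices=n + 2)            # step 2
    tube = [w for w in tube if w[0] == 0 and w[-1] == n - 1]  # step 3
    tube = [w for w in tube if len(w) == n]                   # step 4
    for city in range(n):                                     # step 5
        tube = [w for w in tube if city in w]
    return tube                                               # step 6

print(dhpp(EDGES, N_CITIES))
```

With this edge set the noise walk 0 → 3 → 2 → 3 → 4 → 5 → 6 survives the length filter of step 4 and is only removed in step 5, because it never visits city 1.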

Molecular Computation - Solved problems
Hamiltonian Path Problem

Step                   Time
Create DNA strands
Hybridization          ~ 30 sec
PCR                    ~ 2 hrs
Gel electrophoresis    ~ 5 hrs (AGE), ~ 1.2 hrs (PAGE)
Affinity purification  7 times ~ 1 hr
Detect                 ~ sec
Total                  ~ 7 days !!

Molecular Computation - Solved problems
Boolean formula
Boolean formula, B
In particular, consider the Conjunctive Normal Form (CNF)
  B = C_1 ∧ C_2 ∧ C_3 … ∧ C_m, where C_k = x_1 ∨ x_2 ∨ x_3' …
C is called a clause, x is called a literal, ∧ is the logical AND, ∨ is the logical OR, and x' is the negation of x.
  B = (x_1 ∨ x_2) ∧ (x_1 ∨ x_2 ∨ x_3') ∧ … ∧ C_m
Satisfiability problem (B → True): determine an assignment of the logical variables (x_1, x_2, x_3, …) such that B = T.

Molecular Computation - Solved problems
Boolean formula
Algorithm (Boolean Formula)
Example: B = (x_1 ∨ x_2) ∧ (x_1' ∨ x_2')

x_1  x_2  x_1 ∨ x_2  x_1'  x_2'  x_1' ∨ x_2'  B
0    0    0          1     1     1            0
0    1    1          1     0     1            1
1    0    1          0     1     1            1
1    1    1          0     0     0            0

Molecular Computation - Solved problems
Boolean formula
3.2 Boolean Formula
Lipton, Science, 268, 542 (1995)
Encode an n-bit binary number by a graph G_n:

         x_1        x_2              x_n
  a_1         a_2         a_3  …  a_n      a_n+1
         x_1'       x_2'             x_n'

Notation: x = 1 → True, x' = 0 → False; a path through x_i sets bit i to 1 and a path through x_i' sets it to 0.
Vertices: a_i, x_i, x_i'. Edges: E_{a_i x_i}, E_{a_i x_i'}, E_{x_i a_i+1}, E_{x_i' a_i+1}.
Example: the path a_1 x_1 a_2 x_2' a_3 encodes the binary number 10.
In general, the paths of graph G_n from a_1 to a_n+1 represent {0,1}^n.
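The graph encoding can be made concrete with a short sketch. The vertex names follow the slide, while the function names and the string representation are assumptions made for illustration. For n bits the graph has 3n + 1 vertices and 4n edges, and each path from a_1 to a_n+1 spells out one n-bit number.

```python
def vertices(n):
    """Vertices a1..a(n+1), plus xi and xi' for each bit i."""
    vs = [f"a{i}" for i in range(1, n + 2)]
    for i in range(1, n + 1):
        vs += [f"x{i}", f"x{i}'"]
    return vs

def edges(n):
    """Edges ai -> xi, ai -> xi', xi -> a(i+1), xi' -> a(i+1)."""
    es = []
    for i in range(1, n + 1):
        for x in (f"x{i}", f"x{i}'"):
            es.append((f"a{i}", x))
            es.append((x, f"a{i + 1}"))
    return es

def path_for(bits):
    """Path through G_n that encodes the bit string `bits`."""
    path = []
    for i, bit in enumerate(bits, start=1):
        path += [f"a{i}", f"x{i}" if bit == "1" else f"x{i}'"]
    path.append(f"a{len(bits) + 1}")
    return path

print(path_for("10"))
print(len(vertices(2)), len(edges(2)))
```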

Molecular Computation - Solved problems
Boolean formula
Algorithm (Boolean Formula)
1. Create DNA strands to encode the vertices and edges of G_n.

Vertex (3n+1 strands)        Edge (4n strands)
a_1:    p_a1 q_a1            E_{a1 x1}:   q_a1^c p_x1^c
…                            E_{a1 x1'}:  q_a1^c p_x1'^c
a_n+1:  p_an+1 q_an+1        E_{x1 a2}:   q_x1^c p_a2^c
x_1:    p_x1 q_x1            E_{x1' a2}:  q_x1'^c p_a2^c
x_1':   p_x1' q_x1'          …

Each strand is drawn with its 5' and 3' ends marked. A vertex strand consists of two halves p and q; an edge strand is the complement of the q half of its source vertex followed by the complement of the p half of its target vertex.

Molecular Computation - Solved problems
Boolean formula
Algorithm (Boolean Formula)
2. Hybridization
A path V-E-V-E-V…V denotes an n-bit binary number.
Example: the path a_1 x_1 a_2 x_2' a_3 denotes the 2-bit binary number 10.

Vertex strands:  p_a1 q_a1 | p_x1 q_x1 | p_a2 q_a2 | p_x2' q_x2' | p_a3 q_a3
Edge strands:         q_a1^c p_x1^c | q_x1^c p_a2^c | q_a2^c p_x2'^c | q_x2'^c p_a3^c

Each edge strand bridges the q half of one vertex strand and the p half of the next.

Molecular Computation - Solved problems
Boolean formula
Algorithm (Boolean Formula)
3. Extraction
Define E(t,i,a) as the operation that extracts from test tube t the strands whose i-th bit has the Boolean value a (a = 0 or 1).
OR is done by using multiple tubes and merging them.
AND is done by repeated extraction.

Molecular Computation - Solved problems
Boolean formula
Algorithm (Boolean Formula)
3. Extraction, for B = (x_1 ∨ x_2) ∧ (x_1' ∨ x_2')

Test tube                  Operation             Values present
t_0                        Create DNA strands    00, 01, 10, 11
t_1                        E(t_0,1,1)            10, 11
t_1'                       Remainder of t_1      00, 01
t_2                        E(t_1',2,1)           01
t_3                        Merge, t_1 ∪ t_2      10, 11, 01 → T
t_4 (need to remove 11)    E(t_3,1,0)            01
t_4'                       Remainder of t_4      10, 11
t_5                        E(t_4',2,0)           10
t_6                        Merge, t_4 ∪ t_5      01, 10
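The extraction table can be replayed with Python sets standing in for test tubes. This is a sketch under the assumption that extraction is error-free; the function name extract and the bit-string representation are choices made here.

```python
def extract(tube, i, a):
    """E(t, i, a): split `tube` into the strands whose i-th bit
    (1-based) equals a, and the remainder."""
    hit = {s for s in tube if s[i - 1] == str(a)}
    return hit, tube - hit

t0 = {"00", "01", "10", "11"}      # all 2-bit numbers
t1, t1_rest = extract(t0, 1, 1)    # x1 = 1
t2, _ = extract(t1_rest, 2, 1)     # x1 = 0 and x2 = 1
t3 = t1 | t2                       # satisfies x1 OR x2
t4, t4_rest = extract(t3, 1, 0)    # x1' holds
t5, _ = extract(t4_rest, 2, 0)     # x1 = 1 and x2' holds
t6 = t4 | t5                       # satisfies both clauses

print(sorted(t3), sorted(t6))
```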

Molecular Computation - Solved problems
Boolean formula
Algorithm (Boolean CNF Formula)
1. Create DNA strands to encode all n-bit binary numbers
2. Hybridization
3. Extraction
Let t_k be the test tube whose strands satisfy C_1 ∧ C_2 ∧ C_3 … ∧ C_k, and let C_k+1 = x_a ∨ x_a+1 ∨ … ∨ x_m (where each x is either 0 or 1). For simplicity, consider C_k+1 = x_a ∨ x_a+1.
Either: E(t_k,a,1) gives T_1a (x_a = 1) and the remainder T_1a^R; E(T_1a^R,a+1,1) gives T_a+1 (x_a+1 = 1); then T_1a ∪ T_a+1 satisfies C_k+1.
Or: E(t_k,a,0) gives T_0a (x_a = 0) and the remainder T_0a^R; E(T_0a,a+1,1) gives T_a+1 (x_a+1 = 1); then T_0a^R ∪ T_a+1 satisfies C_k+1.
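The clause-by-clause procedure generalizes to any CNF formula. The sketch below follows the first of the two routes, assuming error-free extraction: for each clause it extracts on one literal, then keeps extracting from the remainder for the next literal, and merges the hits. A clause is written as a list of (position, value) pairs, a representation chosen here for illustration, so x_1 ∨ x_2' becomes [(1, 1), (2, 0)].

```python
from itertools import product

def extract(tube, i, a):
    """E(t, i, a): strands whose i-th bit (1-based) equals a, plus remainder."""
    hit = {s for s in tube if s[i - 1] == str(a)}
    return hit, tube - hit

def satisfy_clause(tube, clause):
    """Keep the strands of `tube` that satisfy at least one literal (OR)."""
    kept, rest = set(), set(tube)
    for position, value in clause:
        hit, rest = extract(rest, position, value)
        kept |= hit          # merging tubes implements OR
    return kept

def solve_cnf(n, clauses):
    """Start from all n-bit strands and filter clause by clause (AND)."""
    tube = {"".join(bits) for bits in product("01", repeat=n)}
    for clause in clauses:
        tube = satisfy_clause(tube, clause)
    return tube

# B = (x1 OR x2) AND (x1' OR x2')
print(sorted(solve_cnf(2, [[(1, 1), (2, 1)], [(1, 0), (2, 0)]])))
```

The number of extraction steps grows with the number of literals in the formula, while the initial tube still holds all 2^n candidate strands.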

Molecular Computation - Solved problems
Integer knapsack problem
Given a set of integers a_i and an integer A, does there exist a subset S ⊆ {1, …, n} such that Σ_{i∈S} a_i ≤ A?
1. To solve this problem, make use of the synthesis, annealing and merging operations.
2. Prepare a starter, S: one end of the strand is blunt and blocked by 5'-biotinylation (B), and the other end is sticky.
3. Use DNA double strands to encode the integers a_1 … a_n, with length proportional to the magnitude and with both ends sticky.

Molecular Computation - Solved problems
Integer knapsack problem
4. Generate all 2^n possible combinations by concatenation of the DNA strands.
5. The final DNA solution consists of 2^n different DNA double strands. The answer is obtained by checking, with agarose gel electrophoresis, whether the solution contains strands with length equal to A.
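The brute-force scheme amounts to enumerating every subset and reading off the total length. A minimal sketch, assuming strand length equals the encoded integer exactly and ignoring the starter strand; the function names and sample integers are illustrative.

```python
from itertools import combinations

def all_concatenations(a):
    """Map every subset of indices (0-based) to its total strand length.
    There are 2 ** len(a) subsets, the empty one included."""
    lengths = {}
    for size in range(len(a) + 1):
        for subset in combinations(range(len(a)), size):
            lengths[subset] = sum(a[i] for i in subset)
    return lengths

def subsets_with_length(a, target):
    """Stand-in for reading the gel: which subsets have length `target`?"""
    return sorted(s for s, length in all_concatenations(a).items()
                  if length == target)

a = [2, 3, 4]
print(len(all_concatenations(a)))
print(subsets_with_length(a, 6))
```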

Molecular Computation - Solved problems
Integer knapsack problem
Limitation: this brute-force algorithm has exponential time complexity, O(2^n). Encoding all possible solutions as DNA strands suffers from the exponential growth of the solution space; for instance, a 70-city instance of the DHPP will fit in a milliliter of solution (10^20 DNA strands). Hence, researchers are considering parallel computation models.
Dynamic programming approach
1. Parallelism: dynamic programming yields a parallel algorithm because the principle of optimality applies; hence a DNA computer might be useful for solving large instances of such problems.
2. For the integer knapsack problem, the worst-case time complexity is O(min(2^n, nA)) [Ref. 1].

Molecular Computation - Solved problems
Integer knapsack problem
Given integer weights w_1, w_2, …, w_n, a capacity W_max, and the corresponding integer profits p_1, p_2, …, p_n, find a subset S ⊆ {1, 2, …, n} that satisfies Σ_{i∈S} w_i ≤ W_max and maximizes Σ_{i∈S} p_i.
Dynamic programming solution
Let f_i(x) be the optimal profit using only the first i items, where x is the capacity remaining:
  f_i(x) = max { f_i-1(x), p_i + f_i-1(x - w_i) }
with f_0(x) = 0 for x ≥ 0 and f_0(x) = -∞ for x < 0.
Notice that f_i(x) is an ascending step function: there are points 0 = x_1 < x_2 < … < x_k with f_i(x_1) < f_i(x_2) < … < f_i(x_k), such that
  f_i(x) = -∞ for x < x_1;
  f_i(x) = f_i(x_k) for x ≥ x_k;
  f_i(x) = f_i(x_j) for x_j ≤ x < x_j+1.
To solve this problem, we make use of the method suggested by Horowitz et al. [Ref. 2] to compute f_i(x_j) for 1 ≤ j ≤ k. Let the ordered set S^i of pairs (P, W), with P = f_i(x_j) and W = x_j, represent f_i(x), starting from S^0 = {(0,0)}.
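The recurrence can be checked directly with a memoized recursion. This is a conventional software sketch rather than anything DNA-specific; it assumes the boundary conditions f_0(x) = 0 for x ≥ 0 and -∞ for x < 0, and the function name is hypothetical.

```python
from functools import lru_cache

def knapsack_value(profits, weights, capacity):
    """Best total profit with total weight <= capacity, via
    f_i(x) = max(f_{i-1}(x), p_i + f_{i-1}(x - w_i))."""
    @lru_cache(maxsize=None)
    def f(i, x):
        if x < 0:
            return float("-inf")   # capacity exceeded: infeasible
        if i == 0:
            return 0               # no items left to consider
        skip = f(i - 1, x)
        take = profits[i - 1] + f(i - 1, x - weights[i - 1])
        return max(skip, take)
    return f(len(profits), capacity)

# Weights (2, 3, 4), profits (1, 2, 5), capacity 6
print(knapsack_value((1, 2, 5), (2, 3, 4), 6))
```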

Molecular Computation - Solved problems
Integer knapsack problem
S^i+1 can be computed from S^i by first computing
  S_1^i = { (P,W) | (P - p_i+1, W - w_i+1) ∈ S^i }
and then merging, S^i+1 = S^i ∪ S_1^i.
If S^i+1 contains two pairs (P_j,W_j) and (P_k,W_k) with P_j ≤ P_k and W_j ≥ W_k, then the pair (P_j,W_j) can be discarded; this condition is known as the dominance rule.
For example, consider the case n = 3, (w_1, w_2, w_3) = (2,3,4), (p_1, p_2, p_3) = (1,2,5) and W_max = 6. For this case we have
  S^0 = {(0,0)};                      S_1^0 = {(1,2)}
  S^1 = {(0,0),(1,2)};                S_1^1 = {(2,3),(3,5)}
  S^2 = {(0,0),(1,2),(2,3),(3,5)};    S_1^2 = {(5,4),(6,6),(7,7),(8,9)}
  S^3 = {(0,0),(1,2),(2,3),(3,5),(5,4),(6,6),(7,7),(8,9)}
The pair (3,5) is discarded because of the dominance rule: (5,4) has a higher profit at a lower weight.
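The set construction and the dominance rule can be sketched as follows. Pairs are (profit, weight) tuples; the function names are invented for illustration, and pruning is applied after every merge, which for this example gives the same final set as pruning once at the end.

```python
def shift(pairs, p, w):
    """S_1^i: add item (p, w) to every pair already in S^i."""
    return {(P + p, W + w) for (P, W) in pairs}

def prune(pairs):
    """Dominance rule: drop (Pj, Wj) if another pair has
    profit >= Pj and weight <= Wj."""
    return {
        (Pj, Wj) for (Pj, Wj) in pairs
        if not any((Pk, Wk) != (Pj, Wj) and Pk >= Pj and Wk <= Wj
                   for (Pk, Wk) in pairs)
    }

def knapsack_pairs(profits, weights, w_max):
    s = {(0, 0)}                              # S^0
    for p, w in zip(profits, weights):
        s = prune(s | shift(s, p, w))         # S^(i+1) = S^i U S_1^i
    feasible = [pair for pair in s if pair[1] <= w_max]
    best = max(feasible, key=lambda pair: pair[0])
    return best, s

best, final_set = knapsack_pairs((1, 2, 5), (2, 3, 4), 6)
print(best, sorted(final_set))
```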

Molecular Computation - Solved problems
Integer knapsack problem
Implementation of dynamic programming
Consider the case n = 3, (w_1, w_2, w_3) = (2,3,4), (p_1, p_2, p_3) = (1,2,5) and W_max = 6.

DNA operation               Test tubes, T_P and T_W
Start                       S^0 = {(0,0)}
Copy                        S^0 = {(0,0)}
Addition: (p,w) = (1,2)     S_1^0 = {(1,2)}
Merge                       S^1 = S^0 ∪ S_1^0 = {(0,0), (1,2)}
Copy                        S^1 = {(0,0), (1,2)}
Addition: (p,w) = (2,3)     S_1^1 = {(2,3), (3,5)}
Merge                       S^2 = S^1 ∪ S_1^1 = {(0,0), (1,2), (2,3), (3,5)}
Copy                        S^2 = {(0,0), (1,2), (2,3), (3,5)}
Addition: (p,w) = (5,4)     S_1^2 = {(5,4), (6,6), (7,7), (8,9)}
Merge                       S^3 = S^2 ∪ S_1^2 = {(0,0), (1,2), (2,3), (3,5), (5,4), (6,6), (7,7), (8,9)}

Molecular Computation - Solved problems
Integer knapsack problem
Implementation of dynamic programming: difficulties
1. It is not known how to communicate between DNA strands. This operation is required in order to match P_k and W_k.
2. It is not known how to compare numbers between DNA strands. This operation is required in order to test the dominance rule.

Limitations and Errors
1. DNA synthesis has ~ 90% efficiency.
2. Long strands of DNA decay quickly; there is a maximum base length that can be kept in vitro without significant breakage.
3. Extraction errors: a good path can be lost during extraction, and a bad path can be taken as if it were a good one.
4. Undesirable hybridization.
5. A sequence s can anneal with a sequence that is only similar to its complement s^c.

Prospective
1. There appears to be little theoretical difficulty in creating a functional DNA computer.
2. Progress depends on finding killer applications uniquely suited to computation by DNA.
3. Improvements are needed in reducing errors and operation costs.