Sequencing shRNA libraries with DNA Sudoku. Yaniv Erlich, Hannon Lab.

Presentation transcript:

1 3/23/09, erlich@cshl.edu. Sequencing shRNA libraries with DNA Sudoku. Yaniv Erlich, Hannon Lab.

2 Preparing DNA libraries: programmable microarray; cloning into plasmids; transformation; array single colonies. Outline: Introduction, Naïve Solutions, Chinese Pooling, Analysis, Results.

3 The problem. Input: 40,000 bacterial colonies. Output: the sequence of the shRNA inserts. [Figure: insert type.]

4 Motivation: filtering the correct fragments; balanced representation; subset selection.

5 Clone-by-clone sequencing: sequence each clone on a capillary platform. Caveat: cost of ~$40,000. Conclusion: use next-generation sequencing.

6 Naïve next-gen: pool all clones and sequence on Solexa, but which clone did each read come from? Conclusion: we need to add a source clone identifier (barcode).

7 Naïve barcoding: barcoding, pooling, Solexa.
Barcode  Sequence
214      AGTGC..
8106     CTCAA..
30010    TTTCG..
88       TTGAA..
Caveats: order 40,000 barcodes, each ~95 nt long; 40,000 PCR reactions. Conclusion: we need fewer barcodes.

8 Naïve Pooling (1). [Figure: plate with columns 1 to 8 and rows A to F; each row and each column is a barcode.] Case #1: genotype ACACA is read with barcodes 5 and B. Which specimen appears in both barcode #5 and barcode #B? Specimen #13!

9 Naïve Pooling (2). [Figure: plate with columns 1 to 8 and rows A to F; each row and each column is a barcode.] Case #2: genotype ACGTT is read with barcodes 1, 2, D and E. Is ACGTT associated with specimens #25 (D,1) and #34 (E,2)? Or maybe with specimens #26 (D,2) and #33 (E,1)? Ambiguity. Conclusion: we should deal with shRNA 'duplicates'.
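The two cases above can be reproduced with a small sketch. It assumes wells are numbered row by row on the 6 x 8 plate, which is inferred from the specimen numbers on the slides ((B,5) is #13, (D,1) is #25); the function names are illustrative only.

```python
from itertools import product

ROWS = "ABCDEF"   # row barcodes, as on the slide's plate
N_COLS = 8        # column barcodes 1..8

def specimen_id(row, col):
    """Number wells row by row (assumption): (A,1) is #1, (B,5) is #13."""
    return ROWS.index(row) * N_COLS + col

def candidates(row_barcodes, col_barcodes):
    """All wells consistent with the observed row and column barcodes."""
    return sorted(specimen_id(r, c) for r, c in product(row_barcodes, col_barcodes))

case1 = candidates(["B"], [5])          # one clone carries the genotype
case2 = candidates(["D", "E"], [1, 2])  # two clones carry the same genotype

print(case1)  # [13]
print(case2)  # [25, 26, 33, 34]
```

With a single clone the row and column barcodes pin down one well. With two clones, four wells are consistent with the reads, and row/column pooling alone cannot tell which two are real.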

10 Lessons learned for the desired scheme. Features of the required encoding scheme:
- Compactness: using a small set of barcodes.
- Dealing with duplicates: every specimen should be resolved without ambiguity.
- Experimental overhead: while reducing the number of barcodes, we should also pay attention to the resources allocated to the pooling itself.
- Simple: this is not a computer program. Encoding is done by a robot and chemistry, so keep it simple.

11 Overview of our solution: 'Chinese' pooling, barcoding, PE sequencing, decoding.

12 The pooling design: combinatorial pooling using the Chinese Remainder Theorem (CRT). "I have never done anything 'useful'. No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world." (G. H. Hardy, A Mathematician's Apology, 1940)

13 Chinese remainder riddle. “An old woman goes to market and a horse steps on her basket and crushes the eggs. The rider offers to pay for the damages and asks her how many eggs she had brought. She does not remember the exact number, but when she had taken them out 3 at a time, there was one egg left. The same happened when she picked them out 4, and 5 at a time, but when she took them 7 at a time they came out even. What is the smallest number of eggs she could have had?” Answer: 301 eggs.
The Chinese Remainder Theorem says:
- There is a one-to-one correspondence between n (0 ≤ n < 3·4·5·7) and the residues.
- There is an easy algorithm to solve the equation system.
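As a check, the riddle can be solved mechanically. The sketch below takes the riddle exactly as worded (remainder 1 when counting by 3, 4 and 5, remainder 0 when counting by 7) and uses the standard CRT construction; the function name `crt` is illustrative, and a brute-force search is included for comparison.

```python
from math import prod

def crt(residues, moduli):
    """Solve n = r_i (mod m_i) for pairwise coprime moduli; result is in [0, prod(moduli))."""
    M = prod(moduli)
    n = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m (Python 3.8+)
        n += r * Mi * pow(Mi, -1, m)
    return n % M

moduli = [3, 4, 5, 7]
residues = [1, 1, 1, 0]

eggs = crt(residues, moduli)
# Brute force over one full period, to confirm the result is the smallest solution
brute = next(n for n in range(prod(moduli))
             if [n % m for m in moduli] == residues)

print(eggs, brute)  # 301 301
```

Because the moduli are pairwise coprime, exactly one n in the range 0 to 419 matches the four residues, which is the one-to-one correspondence the pooling design relies on.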

14 Pooling construction with modular equations. [Figure: specimen, pooling window, destination well (different plates).] One-to-one correspondence…

15 Example of Chinese pooling. [Figure: source array.]

16 Chinese Remainder Pooling Design.
Inputs: N (number of specimens in the experiment); W (weight, the pooling effort).
Algorithm:
1. Find W numbers {x_1, x_2, …, x_W} such that they are (a) bigger than [bound not preserved in the transcript] and (b) pairwise coprime. For instance: {5,8,9} but not {5,6,9}.
2. Generate W modular equations: [equations not preserved in the transcript].
3. Construct the pooling design upon the modular equations.
Output: pooling design.
The Chinese Remainder Theorem asserts: (1) two specimens will meet in no more than one pool; (2) the number of pools: [formula not preserved in the transcript]. Number of barcodes: [formula not preserved in the transcript].
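A minimal sketch of this construction follows. Two details are missing from the transcript and are filled in here as assumptions: the windows are taken to be larger than the square root of N (which is enough to guarantee that the product of any two windows exceeds N), and specimen n is placed in pool n mod x in window x. Function names are illustrative, not the author's.

```python
from math import gcd, isqrt
from itertools import combinations

def choose_windows(n_specimens, weight):
    """Greedily pick `weight` pairwise coprime integers, each > sqrt(n_specimens) (assumed bound)."""
    windows = []
    x = isqrt(n_specimens) + 1
    while len(windows) < weight:
        if all(gcd(x, w) == 1 for w in windows):
            windows.append(x)
        x += 1
    return windows

def pooling_design(n_specimens, windows):
    """Map each specimen n to its pool (n mod x) in every window x (assumed rule)."""
    return {n: tuple(n % x for x in windows) for n in range(n_specimens)}

def max_shared_pools(design):
    """Largest number of pools that any two distinct specimens have in common."""
    return max(sum(a == b for a, b in zip(p, q))
               for p, q in combinations(design.values(), 2))

N = 24
good = pooling_design(N, [5, 8, 9])   # pairwise coprime, as in the slide's example
bad = pooling_design(N, [5, 6, 9])    # 6 and 9 share the factor 3

print(max_shared_pools(good))   # 1
print(max_shared_pools(bad))    # 2 (specimens 0 and 18 collide in two windows)
print(choose_windows(N, 3))     # [5, 6, 7]
print(sum([5, 8, 9]))           # 22 pools in this sketch, one per residue per window
```

With the coprime windows no two specimens meet in more than one pool, while the non-coprime set lets specimens 0 and 18 meet twice, which is the failure the coprimality condition is there to prevent.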

17 How good is our method? Features of the required encoding scheme:
- Compactness: using a small set of barcodes.
- Dealing with duplicates: every specimen should be resolved without ambiguity.
- Experimental overhead: while reducing the number of barcodes, we should also pay attention to the resources allocated to the pooling itself.
- Simple: this is not a computer program. Encoding is done by a robot and chemistry, so keep it simple.

18 Barcode reduction. A 1964 paper in IEEE Transactions on Information Theory proved, from purely combinatorial constraints, that the theoretical lower bound on the number of barcodes is [formula not preserved in the transcript]. Our method is very close to the theoretical lower bound.

20 Dealing with duplicates: simulation. [Plot: probability of correct decoding versus duplicate size, for 40,000 specimens with only 384 barcodes; the 0.99 level is marked.]
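To illustrate why extra pooling windows help with duplicates, here is a toy decoder. It is a hypothetical sketch, not the decoder used for the simulation above: it assumes specimen n sits in pool n mod x of window x, and it reports every specimen whose pool is among the observed pools in all windows. The window sizes and specimen numbers are made up for the example.

```python
def observe(specimens, windows):
    """Pools (residues) in which one sequence shows up, one set per window."""
    return [{n % x for n in specimens} for x in windows]

def decode(observed, windows, n_specimens):
    """Specimens whose pool in every window is among the observed pools."""
    return [n for n in range(n_specimens)
            if all(n % x in pools for x, pools in zip(windows, observed))]

N = 48
duplicates = [3, 20]   # two clones carrying the same shRNA (toy example)

two_windows = [7, 8]
three_windows = [7, 8, 9]

with_two = decode(observe(duplicates, two_windows), two_windows, N)
with_three = decode(observe(duplicates, three_windows), three_windows, N)

print(with_two)    # [3, 20, 27]
print(with_three)  # [3, 20]
```

With two windows, specimen 27 is a false candidate, just as in the row and column case. The third window rules it out, since 27 mod 9 is not among the observed pools.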

21 How good is our method? Experimental overhead with W=5: 5 lanes of Solexa and one and a half weeks of robotics.

23 Real results…
- Arabidopsis shRNA library with 17,000 shRNA fragments.
- Picked 40,320 bacterial colonies.
- Sequenced 3,000 colonies with capillary sequencing for comparison.
- Decoded ~20,500 bacterial colonies with correct inserts.
- 96% of the assignments were correct.
- ~8,000 unique fragments of the library.

24 Future directions: developing a more advanced decoder using a machine learning approach; 2-stage algorithm.

25 DNA Sudoku. Acknowledgements: Greg Hannon, Ken Chang, Michelle Rooks, Assaf Gordon, Oron Navon and Roy Ronen.

