INDIAN INITIATIVE FOR TOMATO GENOME SEQUENCING Tomato Finishing Workshop T. R. Sharma National Research Centre on Plant Biotechnology Indian Agricultural.


1 INDIAN INITIATIVE FOR TOMATO GENOME SEQUENCING. Tomato Finishing Workshop. T. R. Sharma, National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi - 110012. trsharma@nrcpb.org

2 Tomato Genome Sequencing Project: Italy, USA, Spain, Japan, France, Netherlands, India, China, UK, Korea

3 Capillary Sequencers: ABI-3700, MegaBace-1000/4000. Sequence Type. Collection of DNA Seq. data.

4 Data Flow at NRCPB

5 Software Developed for Performing HTGS Analysis
(i) rename - renames any number of files from ABI or MegaBACE generated format to the St. Louis naming convention
(ii) fsplit - splits a file containing multiple sequences in fasta format
(iii) fmerge - converts multiple fasta files into a single fasta file
(iv) coverage - calculates the depth of coverage of an assembly by the most stringent method
(v) extract_reads - extracts all the reads from a particular contig or contigs in an assembly
(vi) comhits - compares two BLAST outputs stored as text for common hits
(vii) confasta - converts a file of nucleotide sequences containing numbers and/or blank spaces into a sequence fasta file for doing a BLAST search
(viii) format2xls - converts sequence fasta files to a tab-delimited format
(ix) format2fasta - converts a database-stored file into fasta format for further analysis
(x) prefinish96 - an Excel macro program which arranges templates in alphabetical order along with their custom primers in a 96-well format
(xi) prefinish384 - a similar Excel macro program for template arrangement in 384-well format
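
The slides name these tools but do not show their code. As a rough illustration only, the behaviour described for confasta, fsplit, and fmerge might look like the Python sketch below. The function signatures, the 60-column line width, and the in-memory (rather than file-based) interface are assumptions, not the NRCPB implementation.

```python
import re


def confasta(raw_text, name="sequence", width=60):
    """Drop digits, blanks, and any other non-letter characters, then
    return the remaining bases as a single FASTA record."""
    seq = re.sub(r"[^A-Za-z]", "", raw_text).upper()
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def fsplit(fasta_text):
    """Split multi-FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fmerge(records, width=60):
    """Join (header, sequence) pairs back into one multi-FASTA string."""
    parts = []
    for header, seq in records:
        parts.append(">" + header)
        parts.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(parts) + "\n"
```

For example, `confasta("  1 acgt acgt\n 11 ttga", name="x")` yields a record whose sequence line is `ACGTACGTTTGA`, which is the kind of clean-up needed before pasting a numbered GenBank-style sequence into a BLAST search.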

6 Sequence gap closer strategies for use

7 Genome Sequences Types Submitted to GenBank
[Figure: schematic of Phase I, Phase II, and Phase III sequences. Legend: gap; single clone area; single strand area; multiple clone coverage on both strands; custom primers.]

8 DNA Sequence Finishing

9 Finishing DNA Sequences
Finishing is the process of polishing raw sequences, transforming the fragmented rough draft into a long, continuous final product without breaks or errors.
GOALS
- Resolve sequence ambiguities and discrepancies, such that the error rate is less than one in 10,000 bases.
- Provide “double-stranded” coverage for every base:
  - minimum of two different clones
  - two different directions
  - two different chemistries
- Achieve contiguity.
- Delineate vector/insert junctions.
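
The target of fewer than one error in 10,000 bases can be related to Phred-style quality values, where Q = -10 log10(P) and P is the estimated probability that a base call is wrong, so that Q40 corresponds to exactly one error in 10,000. The sketch below assumes per-base consensus qualities on the Phred scale are available; the slides do not say how the error rate was actually estimated, and the function names are illustrative.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred quality value."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality value implied by an error probability."""
    return -10 * math.log10(p)


def expected_errors(quals):
    """Expected number of wrong bases, summed over all positions."""
    return sum(phred_to_error_prob(q) for q in quals)


def meets_finishing_target(quals, max_rate=1e-4):
    """True if the mean estimated error rate is below max_rate
    (default: one error in 10,000 bases)."""
    if not quals:
        return False
    return expected_errors(quals) / len(quals) < max_rate
```

Under this reading, a consensus that averages Q50 passes the check, while one that averages Q30 (one error in 1,000) does not.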

10 Finishing DNA Sequences - How
Scan assembly to pick:
- linker clones for Tn Seq
- custom oligo dye terminator
- reverse dye terminator
- special chem (dGTP) reactions
- custom oligo for BAC DNA sequencing
- PCR amplification of problem areas
Software used: Consed, a graphical tool for viewing and editing sequence assembly data (chromat_dir, phd_dir, edit_dir)

11 Methods to resolve Seq. Gaps
1. Transposon method
- Identify linker clones
- Perform transposon insertions
- Transform DH10B cells
- Pick up at least 24 white colonies
- Prepare template
- Seq. all the templates
- Add new Seq. data
Linker clones (New England BioLabs)

12 Methods to resolve Seq. problems
2. Custom primer method
- Design primers
- Seq. at least 3 shotgun clones spanning the region, with same/different chemistry
- Add new seq. data - Editing
- Identify problem areas
[Figure labels: poor quality region; custom primer]
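
Identifying problem areas is normally done by eye in Consed, but the idea can be sketched in code: scan the per-base consensus qualities and report stretches that stay below a cutoff, which are then candidates for custom primers. The threshold of 30 and the minimum run length of 5 are assumptions chosen for illustration, not values taken from the slides.

```python
def low_quality_regions(quals, threshold=30, min_len=5):
    """Return half-open (start, end) index ranges in which every base
    has quality below `threshold` and the run is at least `min_len` long."""
    regions = []
    start = None
    for i, q in enumerate(quals):
        if q < threshold:
            if start is None:
                start = i  # a low-quality run begins here
        elif start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    # close a run that extends to the end of the consensus
    if start is not None and len(quals) - start >= min_len:
        regions.append((start, len(quals)))
    return regions
```

Each reported range marks a stretch where a primer placed upstream, on a shotgun clone spanning the region, would be expected to produce a read across the poor-quality bases.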

13 Methods to resolve Seq. problems
3. PCR method: joining 2 contigs by PCR
- PCR amplification
- Seq. of PCR products
- Cleaning of PCR products
- New reads
[Figure: Contig 1 and Contig 2 with primers flanking the gap; gel image with marker lane M and lanes 1-8, 1 kb band indicated.]
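
To join two contigs by PCR, one primer is taken from near the end of the first contig and the other is the reverse complement of a stretch near the start of the second, so that the two primers point toward each other across the gap. The sketch below shows only that geometry. It assumes the contigs are already ordered and on the same strand, and the primer length and offset are hypothetical defaults; real primer design would also check melting temperature, repeats, and specificity.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def gap_spanning_primers(contig1, contig2, primer_len=20, offset=50):
    """Pick a forward primer `offset` bases in from the right end of
    contig1 and a reverse primer `offset` bases in from the left end of
    contig2, both oriented toward the gap between the contigs."""
    needed = primer_len + offset
    if len(contig1) < needed or len(contig2) < needed:
        raise ValueError("contigs are too short for the requested primers")
    left_end = len(contig1) - offset
    forward = contig1[left_end - primer_len:left_end]
    reverse = reverse_complement(contig2[offset:offset + primer_len])
    return forward, reverse
```

Keeping the primers some distance inside each contig means the resulting reads overlap known sequence on both sides, which helps anchor the new reads when they are added to the assembly.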

14 Sequencing Status, IITGS: Phase III = 24, Phase II = 25, Phase I = 10, Library = 9. Total BACs Seq. = 68

15 BAC clones in Phase III (IITGS)
S. No.  Map Position (cM)  Acc.#     BAC            Marker        Size (kb)
1       0                  AC187148  C05HBa0191B01  CT101         76
2       -                  AC188781  C05SLm0005B15  -             96
3       -                  AC204082  C05SLe0086I08  -             130
4       7                  AC187538  C05HBa0261K11  C2-At1g60200  155
5       10                 AC188778  C05HBa0042B19  cLET-8-B23    117
6       -                  AC188782  C05SLm0037H06  -             108
7       -                  AC212301  C05HBa0060G21  CT242         131
8       16                 AC194694  C05HBa0058L13  T1592         105
9       -                  AC212306  C05SLe0066O01  TG441         100
10      -                  AC209589  C05HBa0145P19  TG432         150
Total Seq. = 1.168 MB

16 BAC clones in Phase III (IITGS)
S. No.  Map Position (cM)  Acc.#     BAC            Marker        Size (kb)
11      -                  AC212299  C05HBa0003C20  BS4           19
12      -                  AC209178  C05HBa0168M18  CT167         31
13      -                  AC225119  C05HBa0207N03  -             50
14      -                  AC225041  C05SLm0118J18  -             106
15      -                  AC212305  C05SLe0028N03  -             135
16      -                  AC212309  C05SLe0122H05  -             90
17      37                 AC212304  C05HBa0309L13  C2_At2g01110  130
18      -                  AC225118  C05HBa0161A14  -             96
19      -                  AC212312  C05SLm0115G01  C2_At1g24830  142
20      -                  AC225040  C05SLm0079C22  T1640         125
21      -                  AC225117  C05HBa0042L17  -             72
22      119                AC186292  C05HBa0251J13  TG185         98
23      115                AC196190  C05HBa0141A12  -             89
24      76                 AC212274  C05HBa0135A02  C2_At1g10500  100
Total Seq. = 1.283 MB

17 BAC clones on other Chromosomes / Redundant BAC Clones
S. No.  Map Position (cM)  Acc.#     BAC            Marker        Size (kb)
1       -                  AC187540  C07SLm0077G20  -             92
2       -                  AC187539  C07HBa0179K09  T0876         108
3       22                 AC212314  C11HBa0027B05  BS4           168
4       -                  AC212315  C11SLe0053P22  -             140
5       111                AC182647  C05HBa0006N20  TG69          123
Total Seq. = 631 kb
Total Seq. (slides 15-17 combined) = 3.082 MB
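
The totals quoted on slides 15 to 17 can be cross-checked by summing the BAC sizes. The short sketch below does this with the sizes (in kb) typed in from the three tables; the variable names are illustrative.

```python
# BAC sizes in kb, in table order, as read from slides 15, 16, and 17
phase3_slide15 = [76, 96, 130, 155, 117, 108, 131, 105, 100, 150]
phase3_slide16 = [19, 31, 50, 106, 135, 90, 130, 96, 142, 125, 72, 98, 89, 100]
other_slide17 = [92, 108, 168, 140, 123]

totals_kb = {
    "slide 15": sum(phase3_slide15),
    "slide 16": sum(phase3_slide16),
    "slide 17": sum(other_slide17),
}
grand_total_kb = sum(totals_kb.values())

for label, kb in totals_kb.items():
    print(f"{label}: {kb} kb ({kb / 1000:.3f} MB)")
print(f"all BACs: {grand_total_kb} kb ({grand_total_kb / 1000:.3f} MB)")
```

The sums come to 1168 kb, 1283 kb, and 631 kb, for a combined 3082 kb, which agrees with the 1.168 MB, 1.283 MB, 631 kb, and 3.082 MB figures stated on the slides. The 24 entries on slides 15 and 16 also match the Phase III count of 24 given on slide 14.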

18 Examples of Problematic Regions

19 Highly misassembled clone C05SLm0050C14

20 consensus Aligned region showing single base mismatch in C05SLm0050C14

21 Approach to solve the misassembly in C05SLm0050C14
Manually re-arranging reads on the basis of:
- Read-pair information of sub-clones.
- PCR of different regions within the BAC to reconfirm assembly.
- Digestion pattern of BAC obtained from six different restriction enzymes.
- Sequence obtained after assembling individual sub-clones following transposition.
[Figure: Current status of C05SLm0050C14, with the region yet to be resolved marked.]
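
Read-pair information is useful here because the forward and reverse reads of one sub-clone should land on the same contig, on opposite strands, facing each other, and roughly one insert length apart. Pairs that break these rules point to a misassembly. The sketch below encodes that check; the `Placement` record and the 1,000 to 5,000 bp insert-size window are hypothetical and would need to match the actual sub-clone library.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    contig: str   # contig the read was assembled into
    start: int    # leftmost consensus position covered by the read
    end: int      # rightmost consensus position covered by the read
    strand: str   # "+" or "-"


def pair_is_consistent(fwd, rev, min_insert=1000, max_insert=5000):
    """True if the two reads of a sub-clone sit on the same contig, on
    opposite strands, facing each other, within the insert-size window."""
    if fwd.contig != rev.contig:
        return False
    if fwd.strand == rev.strand:
        return False
    plus, minus = (fwd, rev) if fwd.strand == "+" else (rev, fwd)
    if plus.start > minus.start:
        return False  # reads point away from each other
    span = minus.end - plus.start
    return min_insert <= span <= max_insert
```

Counting inconsistent pairs per region gives a simple way to flag where reads may need to be manually re-arranged, before confirming the layout by PCR or restriction digest.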

22 Misassembly C05HBa0089M06

23 A typical GC rich region

24 ACKNOWLEDGEMENTS All Members of Indian Tomato Genome Sequencing Group and DBT for Financial Assistance

