The progress of Glossina genomics at RIKEN GSC Todd Taylor RIKEN Genomic Sciences Center, Yokohama, Japan (on behalf of Masahira Hattori)


1 The progress of Glossina genomics at RIKEN GSC
Todd Taylor (taylor@gsc.riken.jp)
RIKEN Genomic Sciences Center, Yokohama, Japan
(on behalf of Masahira Hattori)
December 15, 2006, IGGI, Sanger, UK

2 Background
- Sequencing and analysis of human chromosomes 11, 18 and 21
  - Contributed about 4-5% of the human genome sequence
- Sequencing and analysis of chimpanzee genomic regions, including:
  - Whole-genome BAC-end sequence analysis
  - Chimpanzee chromosome 22
    - Found differences (mostly minor) in nearly all of the coding genes between human and chimp
  - Chimpanzee Y chromosome
- Development of novel methods for gene and promoter prediction
  - Identifying genes missed by other high-throughput methods
  - Identification of unique regulatory mechanisms

3 Phase III sequence-related activities
- BAC ends
- Finished BAC clones
- Full-length cDNAs
- Whole-genome shotgun

4 BAC end sequencing
- The first BAC library has been constructed (Yale) and 100,000 BAC end sequences are being produced (RIKEN)
  - Not yet
  - We will be able to sequence the ends of up to 50,000 BACs (100,000 reads)
  - Or possibly more if fosmid ends are sequenced instead?
  - Can start from April 2007
  - Will take about one month

5 Finished BAC clone sequencing
- Five BACs have been fully sequenced (RIKEN) and no serious 'issues' have arisen.
  - VMRC29 library (CHORI)
  - Clones: 97H16, 39G22, 36N9, 31O6, 3E11
  - Total length: 759,387 bp
  - GC level: 38.89%
  - Repeat content: 6.10%
    - Using the Drosophila fruit fly genus repeat library
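The GC level quoted on this slide is a pooled percentage over the five clones. As a rough illustration of how such a figure can be computed, here is a minimal Python sketch. The function names are hypothetical, and the choice to ignore ambiguous bases (such as N) is an assumption; the tool that produced the slide's number may count them differently.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the unambiguous A/C/G/T bases of one sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


def pooled_gc_percent(seqs) -> float:
    """GC percentage over several sequences, weighted by length (not a mean of means)."""
    return gc_percent("".join(seqs))


if __name__ == "__main__":
    # Toy input, not the BAC sequences from the slide.
    print(f"{gc_percent('ATGCGC'):.2f}")  # prints 66.67
```

Pooling the clones before dividing weights each clone by its length, which is what a single figure for a 759,387 bp set implies; averaging the five per-clone percentages would generally give a slightly different number.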

6 RepeatMasker
file name: gmm_clones
sequences: 5
total length: 759387 bp
GC level: 38.89 %
bases masked: 46333 bp ( 6.10 %)
=====================================================
                     number of   length      percentage
                     elements    occupied    of sequence
-----------------------------------------------------
Retroelements             56     12376 bp     1.63 %
   SINEs:                  0         0 bp     0.00 %
   Penelope               31      2872 bp     0.38 %
   LINEs:                 49      7695 bp     1.01 %
     CRE/SLACS             0         0 bp     0.00 %
     L2/CR1/Rex            7      3181 bp     0.42 %
     R1/LOA/Jockey         5      1138 bp     0.15 %
     R2/R4/NeSL            1        51 bp     0.01 %
   LTR elements:           7      4681 bp     0.62 %
     BEL/Pao               2       230 bp     0.03 %
     Gypsy/DIRS1           5      4451 bp     0.59 %
DNA transposons           10      4348 bp     0.57 %
   Tc1-IS630-Pogo          8      2143 bp     0.28 %
   Other (Mirage,          1       126 bp     0.02 %
    P-element, Transib)
Total interspersed repeats:      16724 bp     2.20 %
Small RNA:                 3      1357 bp     0.18 %
Simple repeats:          237     12658 bp     1.67 %
Low complexity:          366     15594 bp     2.05 %
The query species was assumed to be "Drosophila fruit fly genus".
Homo sapiens ( 4.08 %)   Anopheles genus ( 4.52 %)
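The summary figures in this table can be cross-checked with simple arithmetic: the four top-level classes (interspersed repeats, small RNA, simple repeats, low complexity) add up to the "bases masked" total, and each percentage is the class length divided by the total length. The short Python sketch below only re-derives numbers already shown on the slide; the variable and function names are illustrative.

```python
TOTAL_LENGTH_BP = 759_387  # "total length" from the slide

# Top-level masked classes and their lengths in bp, copied from the slide.
masked_classes = {
    "Total interspersed repeats": 16_724,
    "Small RNA": 1_357,
    "Simple repeats": 12_658,
    "Low complexity": 15_594,
}


def percent_of_sequence(bp: int, total_bp: int = TOTAL_LENGTH_BP) -> str:
    """Format a length as a percentage of the total, to two decimals."""
    return f"{100.0 * bp / total_bp:.2f}"


total_masked = sum(masked_classes.values())

if __name__ == "__main__":
    for name, bp in masked_classes.items():
        print(f"{name}: {bp} bp ({percent_of_sequence(bp)} %)")
    print(f"bases masked: {total_masked} bp ({percent_of_sequence(total_masked)} %)")
```

The same check applies one level down: retroelements (12,376 bp) plus DNA transposons (4,348 bp) give the 16,724 bp of interspersed repeats.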

7 Full-length cDNA sequencing
- Full-length cDNAs for G. m. morsitans (RIKEN) will be constructed and Sanger will perform a few hundred full-length sequences on these. RIKEN will do some 5' end sequencing.
  - Full-length cDNA libraries were prepared by Junichi Watanabe (Univ. Tokyo)
  - Sequencing of 9,462 cDNA clones (5' one pass) was recently completed

8 Whole-genome shotgun sequencing
- RIKEN has applied to Japanese sources for funding for a further 3 million shotgun sequences (~3X coverage).
  - We failed to get the funding
  - At present, we have no money for WGS or additional BAC finishing
  - Will try for more
  - Japanese-African collaborative projects looking somewhat hopeful

9 Dataset containing ESTs and partial cDNA sequences

Library    Sample Information                         Sequences
TC         Fat Body/Milk Gland                            3,059
GMSG       Salivary Gland                                 7,493
GMRE       Reproductive                                   1,502
GMM        Midgut                                         7,015
cDNA       Full Length cDNA Sequences                       190
TUM/TUF    Tsetse Fly Whole Genome cDNA Libraries         9,462
Total Number of Sequences                                28,721

10 Strategy and results obtained from preliminary analysis
- 28,721 sequences were assembled into contigs, and singletons were identified
- Total contigs made = 3,857; total singletons = 10,213
- Translated contigs and singletons into six reading frames
- Homology searched in SwissProt and NR protein databases
- Annotated 2,569 ORFs out of 3,857 contigs
- Annotated 2,783 ORFs out of 10,213 singletons
Workflow diagram:
  CAP3 -> 3,857 contigs -> Transeq -> 30,942 ORFs
  CAP3 -> 10,213 singletons -> Transeq -> 57,860 ORFs
  Selected continuous ORFs containing at least 50 amino acids
  BLAT, 33% sequence identity
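The slide names Transeq for the six-frame translation and states that continuous ORFs of at least 50 amino acids were kept. The following is a minimal pure-Python sketch of that idea, not the Transeq program itself. It assumes the standard genetic code and treats an ORF as any stop-free stretch of a translated frame; the function names and the input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string (non-ACGT characters pass through)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_orfs(seq: str, min_aa: int = 50):
    """Return (strand, frame, peptide) for every stop-free stretch of at least min_aa residues."""
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            for segment in translate(s[offset:]).split("*"):
                if len(segment) >= min_aa:
                    orfs.append((strand, offset + 1, segment))
    return orfs


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 60 + "TAA"  # made-up sequence, not from the dataset
    for strand, frame, peptide in six_frame_orfs(toy):
        print(strand, frame, len(peptide))
```

Because every frame is scanned independently, a single contig can yield several qualifying ORFs, which is consistent with the ORF counts on the slide being much larger than the contig and singleton counts.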

11 A large percentage of ORFs from tsetse fly contigs resemble those of ‘fruit fly’
Pie chart: Drosophila (84%), Glossina (5%), Aedes (3%), Anopheles (2%), Others (6%)

12 A large percentage of ORFs from tsetse fly singletons resemble those of ‘fruit fly’
Pie chart: Drosophila (81%), Aedes (5%), Glossina (3%), Anopheles (2%), Others (9%)

13 METABROWSER: a resource to analyse the metagenome
Metagenome Analysis Pipeline (diagram)
- User input: Genomic Contigs & Sequences
- Gene prediction (Predicted Genes): GLIMMER, GENEMARK, GETORF, CRITICA, MetaGene
- Functional annotation (Annotated Genes): BLAST, INTERPROSCAN, PLHOST, PROSITESCAN, COGs, Manatee (GO), FingerPRINTscan, JAFA ?, HT-GO-FAT, PubSearch, BLIMPS (BLOCKS), Pfam
- Advanced analysis: Metabolic Pathways, Comparative Genomics, Phylogenetic Classification, Protein Interaction, Enzyme Classification, 16S ribosomal RNA analysis, Taxonomic Classification, Pathogenicity index, Origin of Replication, Secondary Structure Prediction, Fold Prediction, Other Analysis
- Browse: Query the Metagenome Data Browser

14 Metagenome Data Browser: data from our internal projects
METABROWSER: a resource to analyse the metagenome
Data Browser (diagram): Genes, Proteins, Novel Pathways, Comparative Analysis, Download Sequence, Novel Genomes, Novel Proteins, Other Related Information

15 Current & Future Plans
- Sequencing
  - More if funding allows
- Analysis
  - We can contribute to the informatics of the Glossina genome, including cDNA analysis and annotation
  - But we don’t want to duplicate anyone’s efforts
  - Also BES mapping and comparative analysis with Drosophila, mosquito, etc.
  - ???

16 Acknowledgements
- Informatics (RIKEN): Tulika Prakash Srivastava, Vineet K. Sharma, Todd D. Taylor
- Sequencing & Data Access: Atsushi Toyoda (RIKEN), Junichi Watanabe (Univ. Tokyo), Hiroyuki Wakaguri (Univ. Tokyo), Yamashita (Kitasato Univ.), Serap Aksoy (Yale), Geoff Attardo (Yale)
- Other: Masahira Hattori (Univ. Tokyo/RIKEN), Yoshiyuki Sakaki (RIKEN)

