Transcriptome analysis using Open Reading frame ESTs (ORESTES). Emmanuel Dias Neto, PhD, Lab of Neurosciences, LIM-27, Instituto de Psiquiatria, Faculdade de Medicina, Universidade de São Paulo, SP, Brazil
UNESCO - First North-South Human Genome Conference, Caxambú, MG, Brazil
- Is there a way to integrate the research performed in developing countries with the US/Europe ‘Human Genome Project’?
- After the completion of the ‘Human Genome sequencing’, how can we gain access to or make use of the technology developed?
How can we learn?
- Initiate an EST sequencing project of a parasite of local importance (Schistosoma mansoni)
- cDNA libraries prepared with Marcelo Bento Soares
- cDNA sequencing performed at TIGR (Craig Venter)
- Some 1,000 ESTs generated
ESTs: “Expressed Sequence Tags”. Partial sequences, usually derived from the ends of cDNA molecules. (Figure labels: 500 nt; 4 Kb; 5’; 3’; Open reading frame (ORF))
Main problems found
- Repetitive sequencing of highly expressed genes: high redundancy (~60%)
- Large amounts of mRNA are needed to obtain a normalized library
- Little information is gained from sequences with no database matches
Gene expression in a typical eukaryotic cell (Huang et al., 1999). Table columns: Class (Abundant, Intermediate, Rare). Table rows: Abundance/gene (12, …); Diversity (< …)
Alternative protocol to generate ESTs
- Is there a way to tag rare genes?
- How can data be generated from small amounts of mRNA?
- Is it possible to tag the central portion of the transcripts?
Ideas
- A PCR-based strategy should enable the analysis of small amounts of mRNA.
- Randomly selected primers, used in RT-PCR at low stringency, are a means to evaluate other regions of the transcripts.
Randomly selected primers ORESTES
Factors that contribute to the presence of a gene in a cDNA library: abundance and nucleotide diversity (figure labels: Usual cDNA libraries; ORESTES)
ORESTES - the data normalization
Covering a transcript with ORESTES
- The amplification of a gene region requires primer binding on both sides of a point.
- The chance of a primer binding depends on the size of the sequences flanking the amplification point.
- If the size of a transcript is taken as 1 and the distance from the 3’ end is taken as S, the probability (P) of an appropriate amplification of a point is P = S(1-S).
- Coverage of the central point = 0.5 x (1 - 0.5) = 0.5 x 0.5 = 0.25 = 25%
- Coverage of the last 10% of a transcript = 0.1 x 0.9 = 0.09 = 9%
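The relation P = S(1-S) stated on this slide can be tabulated along a transcript with a few lines of Python. This is only a sketch of the slide's arithmetic: the function name `coverage_probability` and the chosen sample positions are illustrative assumptions, not part of the original presentation.

```python
def coverage_probability(s: float) -> float:
    """Return S * (1 - S) for a point at fractional distance s from the 3' end.

    The transcript length is normalised to 1, as on the slide.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie between 0 and 1")
    return s * (1.0 - s)


# Illustrative positions along the transcript (fraction of its length).
positions = (0.1, 0.25, 0.5, 0.75, 0.9)
for s in positions:
    print(f"S = {s:.2f}  P = {coverage_probability(s):.4f}")
```

The function is symmetric around S = 0.5 and peaks there at 0.25, which matches the slide's statement that the central portion of a transcript is favoured over its ends.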
Position of matches
ORESTES - sequence distribution
ORESTES - the data: comparison with dbEST data
Project Organisation (diagram labels: Sequencing Center FM-USP; Sequencing Center UNICAMP; Sequencing Center EPM; FM-USP/RP; Sequencing Center IQ-USP; Coordination; LICR Sequencing Center; nodes marked P)
Project Organisation
- Dissected tissue samples: Dept. of Pathology, Hospital A.C. Camargo
- RNA coordination (LICR/SP): preparation and validation of all mRNAs to be used
- Library coordination (LICR/SP): cDNA synthesis and amplification; ORESTES production and development
- ORESTES sequencing
P: Fernando Costa (CM); Sérgio Verjovski (QV); Christine Hackel; Arthur Gruber; Helaine Carrer/Dirce Carraro; Mari Cleide Sogayar; Ma. Fátima Sonati; Edna Kimura; Gonçalo G. Pereira; Hamza FA El-Dorry; Maria Aparecida Nagai (MR); Marco Antônio Zago (RC); Angelita Gama; Enilza Espeáfrico; Daniel Gianella Neto; Gustavo H Goldman; Suely KN Marie; Ma. Luísa Paçó-Larson; Elizabeth Martins, Paulo L. Hoo; Vanderlei Rodrigues; Eloiza Tajara; Marcelo Briones (PM); Sandro Valentini; Rui MB Maciel; Luis Eduardo Andrade; Ismael DG Silva; João Bosco Pesquero; Maria Inês Pardini (IL2); Marina Nóbrega (IL3); Sílvia Rogatto (IL5)
Using ORESTES to help define the complete set of genes expressed in different human tissues/tumours
Generation of Colon ESTs: HCGP vs. CGAP = 2.1x more sequences
Generation of Stomach ESTs: HCGP vs. CGAP = 2.5x more sequences
Generation of Breast ESTs: HCGP vs. CGAP = 9.1x more sequences
Generation of Head and Neck ESTs: HCGP vs. CGAP = 34.4x more sequences
Next challenge: from Data to Information
The Head & Neck transcriptome initiative
Transcriptional level Tumor Suppressor genes
Looking for putative tumour suppressor genes
- Clusters composed of sequences exclusively derived from normal samples
- Clusters mapping to genomic regions of frequent loss of heterozygosity (LOH) in H&N tumours
Total = 78 clusters
Transcriptional level Oncogenes
Looking for putative oncogenes
- Clusters composed of sequences exclusively derived from tumour samples
- Clusters mapping to genomic regions frequently amplified in H&N tumours
Total = 271 clusters
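The two selection criteria used for putative oncogenes (and, symmetrically, for putative tumour suppressor genes) can be sketched as a simple filter. This is a hypothetical illustration only: the slides do not say whether the criteria were applied together or separately, so the sketch assumes both must hold. The `Cluster` record, the `select_candidates` helper, and all cluster names and region labels are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record for one EST cluster (not the project's data model)."""
    name: str
    sources: frozenset  # tissue types of member sequences: "normal" and/or "tumour"
    region: str         # placeholder label for the genomic region the cluster maps to


def select_candidates(clusters, source, regions):
    """Keep clusters built only from `source` samples that map to one of `regions`."""
    wanted = frozenset({source})
    return [c for c in clusters if c.sources == wanted and c.region in regions]


# Invented toy data, for illustration only.
clusters = [
    Cluster("c1", frozenset({"tumour"}), "region_A"),
    Cluster("c2", frozenset({"tumour", "normal"}), "region_A"),
    Cluster("c3", frozenset({"normal"}), "region_B"),
    Cluster("c4", frozenset({"tumour"}), "region_C"),
]
amplified_regions = {"region_A"}  # stand-in for regions frequently amplified
loh_regions = {"region_B"}        # stand-in for regions of frequent LOH

putative_oncogenes = select_candidates(clusters, "tumour", amplified_regions)
putative_suppressors = select_candidates(clusters, "normal", loh_regions)
print([c.name for c in putative_oncogenes])
print([c.name for c in putative_suppressors])
```

Cluster `c2` is rejected because it mixes tumour and normal sequences, and `c4` because it maps outside the regions of interest.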
Differential gene expression in Larynx tumors
Differential gene expression in Pharynx tumors
Differential gene expression in Oral cavity tumors
Reference (HSD): TCGTTATGCCAGTGAAAATGTCAACAAATTGTTGGTAGGGAACAAATGTGA
Aligned reads from libraries: RC5-BT, PM2-BT, PM2-BT, MR3-GN, MR4-ET, MR4-EN, IL2-FT, MR4-ET, MR0-RT, CM1-HN, QV3-BN, QV3-DT, QV2-NN, IL5-UM, …, CM4-HN, MR0-RT, MR2-UM, PM0-IT, PM1-MT, PM1-MT
Gene: Homo sapiens RAB1, member RAS oncogene family (RAB1), mRNA
Type: non-synonymous
Codon: aaa -> caa
Nucleotide: A -> C
Amino acid: K (lysine) -> Q (glutamine)
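The codon change reported on this slide (aaa to caa, lysine to glutamine) can be checked against the standard genetic code. The sketch below builds the standard codon table from its usual compact representation; the helper name `classify_substitution` is an assumption made for illustration and is not a tool used in the project.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(ref_codon, alt_codon):
    """Translate both DNA codons and report whether the change alters the amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


result = classify_substitution("aaa", "caa")
print(result)  # ('K', 'Q', 'non-synonymous')
```

The result agrees with the slide: the A to C change in the first codon position replaces lysine (K) with glutamine (Q).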
"You have made your way from worm to man but much within you is still worm" (Friedrich Nietzche, Zarathustra's Prologue)
S. japonicum: 43,707 ESTs (28,839 from adult worms; 14,868 from eggs)
New drugs ??
Trans R Soc Trop Med Hyg Sep-Oct;96(5):465-9.
Acknowledgements
Bioinformatics
- F. Tsukumo, M. Carazolli and G. Pereira (UNICAMP)
- EM Reis, A. Silva, S. Verjovski (IQ/USP)
- WA Silva Jr, MA Zago (USP/RP)
Clinical Group
- André, M. Giuliano, LP Kowalski (H.Câncer)
Genomics & Molecular genetics
- FAD Nunes (FO/USP)
- MM Brentani, Simone, Fátima, E Miracca, MA Nagai (FM/USP)
- DN Nunes, C Colin, MH Bengston, K Marsirer, MC Sogayar (IQ/USP)
- E Kimura, S Leoni (ICB/USP)
- JM Cerutti, GS Guimarães, R Maciel (UNIFESP)
- E Tajara, Ulises, P Rahal (UNESP/SJR Preto)
- S Rogatto, C Rainho (UNESP/Botucatu)
- S Valentim, José Eduardo, Glória (UNESP/Araraquara)
- FG Nóbrega, M Nóbrega (UNIVAP)
- EPB Ojopi, PEM Guimarães (IPq/USP)
- F Costa, F Lopes (Unicamp)
- MCR Costa (USP/RP)
Emmanuel Dias Neto, PhD Laboratory of Neurosciences, Institute and Dept. of Psychiatry Faculdade de Medicina, University of São Paulo São Paulo, SP