
Mapping Sites of Transcription Across the Drosophila Genome Using High Resolution Tiling Microarrays. LBNL, Berkeley CA, August 20, 2007. A. Willingham, Affymetrix, Inc.


1 Mapping Sites of Transcription Across the Drosophila Genome Using High Resolution Tiling Microarrays. LBNL, Berkeley CA, August 20, 2007. A. Willingham, Affymetrix, Inc.

2 Presentation Outline
I. Affymetrix’s Contribution to Specific Aims and Milestones
II. Previous Studies: Manak et al. analysis of developmental transcriptome
III. Initial Results for Aim I: sample preparation & data processing; first look at cell line data on 35bp arrays; pilot analysis of brand-new 7bp arrays
IV. RACE-array: example of ENCODE extension; analysis of genes on Chr21 & 22
V. Summary and Steps for Moving Forward

3 Specific Aim 1: 480 samples on 35-bp genome tiling arrays; 24 samples on 7-bp genome tiling array sets; 160 RACE-fragment pools (16,000 products)
Specific Aim 2: RNAi of 120 RNA binding proteins on arrays
Specific Aim 3: Northern blotting of ncRNA models

4 RNA Samples and Genome Tiling Arrays

5 Milestones

6 Timeline for Milestones
stepwise nature of individual aims & responsibilities
involvement & interdependencies of each step
propose shifting milestones to more of a “ramp-up” model

7 Previous Studies Manak et al. Nature Genetics, v38 Sep 2006

8 Transcription Analysis of Early (0-24 hr) Drosophila Embryogenesis: 70% Annotated, 30% Unannotated. Manak et al. Nature Genetics, v38 Sep 2006

9 Differential expression in Drosophila embryogenesis (~40kb region of Chromosome 3R)
Figure tracks: twelve 2-hr time points, from 0-2 hr through 22-24 hr
Figure labels: 5’ TSS; 19Kb; Maternally Expressed Genes (Restarted in two patterns)

10 Unannotated transcription updates known gene annotations
Drosophila: 5' sites predicted by transcriptional co-regulation for ~1500 genes
avg 1st intron size = ~20kb; avg 1st annotated intron = ~1.7kb
Manak et al. Nature Genetics, v38 Sep 2006

11 Initial Results of Aim I

12 Affymetrix sample preparation & data generation pipeline
sample treatment & QC: DNase-treat; BioAnalyzer
1st-strand cDNA synthesis: random primed; Superscript-II
2nd-strand cDNA synthesis: DNA Pol-I; save aliquot for downstream QC
label & hybridize to arrays: TdT-based end labeling; CEL file generation
signal graph generation: median-scaling; q-norm bioreps; select bandwidth
transfrag generation: select min-run; select max-gap
data distribution: tomeweb hosting; FTP to servers?; deliver to DCC, GEO, etc.
quality control: overlap w/ RACE; Northern blots; QPCR of cDNA
This example highlights the method for generating RNA maps, but the pipeline is similar for other applications: RNA maps of long and short RNAs; RACE-array maps; RNAi knockdown experiments; chromatin immunoprecipitation
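The "median-scaling" and "q-norm bioreps" steps of the signal graph stage can be illustrated with a minimal sketch. This is not the TAS implementation: the function names, the tie handling, and the order in which the two steps are applied are assumptions made for illustration only. The target value of 200 is the "Norm target" quoted later in the pilot study.

```python
from statistics import median


def median_scale(intensities, target=200.0):
    """Rescale one array so that its median intensity equals `target`."""
    m = median(intensities)
    if m <= 0:
        raise ValueError("median intensity must be positive")
    return [x * target / m for x in intensities]


def quantile_normalize(replicates):
    """Force all replicate arrays onto the same intensity distribution.

    `replicates` is a list of equal-length lists, one per array.
    The value at each rank is replaced by the mean, across replicates,
    of the values holding that rank. Ties keep probe order (a simplification).
    """
    n = len(replicates[0])
    if any(len(r) != n for r in replicates):
        raise ValueError("all replicates must have the same number of probes")
    sorted_reps = [sorted(r) for r in replicates]
    rank_means = [sum(s[i] for s in sorted_reps) / len(sorted_reps) for i in range(n)]
    normalized = []
    for r in replicates:
        order = sorted(range(n), key=lambda i: r[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical toy data: two replicates of three probes each.
reps = quantile_normalize([[5, 2, 3], [4, 1, 6]])
scaled = [median_scale(r, target=200.0) for r in reps]
```

After quantile normalization every replicate shares one distribution, so scaling each to the same target median leaves them directly comparable.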

13 Current Sample Prep (5 cell line samples completed in triplicate; for 3 other cell lines, several samples failed). Hosted at http://transcriptome.affymetrix.com/download/modENCODE/

14 RNA QC by Agilent BioAnalyzer

15 Chr2L: Transcription Expression Maps Across ~50 Kb. Cell lines: ML-DmD4-c1, ML-DmBG3-c2, Kc167, CME-W1-Cl8

16 Chr2L: Transcription Expression Maps Across ~25 Kb. Cell lines: ML-DmD4-c1, ML-DmBG3-c2, Kc167, CME-W1-Cl8

17 transcription in 4 Drosophila cell lines: overlapping transcription

18 transcription in 4 Drosophila cell lines: overlapping annotation

19 RNA Samples and Genome Tiling Arrays: 7 nt resolution arrays
new 7bp design: 5 arrays, total of ~14.4 million probes (by comparison, the 35bp array has ~3.1 million probes)
  a 5bp design required 7 arrays: 40% more chips (1512 arrays instead of 1080)
  replicates & strand not calculated in original budget
  updated genome version (release 5) used for design
  repeats can be masked or unmasked; virtual probes
existing 35-bp design: 1 array, total of ~3.1 million probes
  Affy commercial group will produce an “updated” 2.0 design (39bp resolution, release5-based design)
  however, we will continue using the current design (35bp resolution more optimal for RNA maps)
  7bp arrays have better coverage & newer design; question of $ cost per array?
comparison of nucleotide coverage (dm3, release5)
  35bp array = 111,117,940 nt
  7bp array masked = 107,355,171 nt
  7bp array unmasked = 118,523,115 nt
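The array-count and coverage figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above. The one assumption is that the 1080 budgeted arrays are all 5-array sets, which is inferred from the 7/5 ratio between 1512 and 1080 rather than stated on the slide.

```python
# Figures quoted on the slide.
ARRAYS_PER_SET_7BP = 5
ARRAYS_PER_SET_5BP = 7
BUDGETED_ARRAYS = 1080
PROBES_7BP_SET = 14_400_000  # approximate, whole 5-array set
PROBES_35BP = 3_100_000      # approximate, single array

# Assumption: all 1080 budgeted arrays belong to 5-array sets.
array_sets = BUDGETED_ARRAYS // ARRAYS_PER_SET_7BP
arrays_for_5bp = array_sets * ARRAYS_PER_SET_5BP
extra_fraction = (arrays_for_5bp - BUDGETED_ARRAYS) / BUDGETED_ARRAYS

print(array_sets, arrays_for_5bp, f"{extra_fraction:.0%}")  # 216 1512 40%

# Nucleotide coverage (dm3, release 5) relative to the 35bp array.
coverage_nt = {
    "35bp": 111_117_940,
    "7bp masked": 107_355_171,
    "7bp unmasked": 118_523_115,
}
relative_to_35bp = {name: nt / coverage_nt["35bp"] for name, nt in coverage_nt.items()}
probe_ratio = PROBES_7BP_SET / PROBES_35BP

for name, ratio in relative_to_35bp.items():
    print(f"{name}: {ratio:.3f} x 35bp coverage")
print(f"7bp set carries about {probe_ratio:.1f}x the probes of the 35bp array")
```

The comparison makes the trade-off explicit: the 7bp set carries several times more probes while covering a similar number of nucleotides, so the gain is in density rather than breadth.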

20 New 7-bp 5-chip array compared to 35-bp 1-chip array
Cherbas total RNA samples from 2 cell lines (KC & clone8)
Same labeled reactions hyb’d to 35bp and 7bp arrays
Signal graphs generated in TAS: 2 technical replicates for each sample were q-normed together; bandwidth = 30 (7bp) or 50 (35bp); norm target = 200
Transfrags generated in TAS using 5% bacterial negative controls
  7bp arrays: min-run 50, max-gap 10
  35bp arrays: min-run 50, max-gap 90
Intersections of 7bp vs 35bp and overlap with FlyBase annotations performed in Galaxy
Hosted at: http://transcriptome.affymetrix.com/download/modENCODE/pilot_studies/Dros-7bp-pilot/
Share with modENCODE DCC & ArrayExpress to determine whole-chromosome vs whole-chip data hosting
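The role of the min-run and max-gap settings can be shown with a small sketch of transfrag calling. It assumes one common reading of the two parameters: positive probes are joined while the distance between consecutive positive probe positions is at most max-gap, and a joined run is kept only if it spans at least min-run bases. The function name and the exact distance conventions are illustrative assumptions, not the TAS code.

```python
def call_transfrags(positions, signals, threshold, min_run, max_gap):
    """Join above-threshold probes into transcribed fragments (transfrags).

    positions: probe coordinates, sorted ascending
    signals:   smoothed intensity per probe
    Returns a list of (start, end) probe-position pairs.
    """
    frags = []
    start = end = None
    for pos, sig in zip(positions, signals):
        if sig < threshold:
            continue  # negative probes never start or extend a run
        if start is None:
            start = end = pos
        elif pos - end <= max_gap:
            end = pos  # close enough to the previous positive probe
        else:
            if end - start >= min_run:
                frags.append((start, end))
            start = end = pos
    if start is not None and end - start >= min_run:
        frags.append((start, end))
    return frags


# Hypothetical 7bp-spaced probes with a 3-probe dip in the middle.
positions = list(range(0, 210, 7))
signals = [300 if not 77 <= p <= 91 else 10 for p in positions]

split = call_transfrags(positions, signals, threshold=100, min_run=50, max_gap=10)
joined = call_transfrags(positions, signals, threshold=100, min_run=50, max_gap=30)
```

With the tighter max-gap the dip splits the region into two transfrags, while the looser setting bridges it. This is the mechanism by which different rules on the two array types can change how fragmented the resulting maps look.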

21 Improved exon discrimination by transfrags from 7bp arrays

22 Pseudo-ROC curves comparing base-pair coverage & overlap with annotated exons
five different thresholds for calculated probe false-positive rate were used: 1%, 3%, 5%, 7%, 10% (7% and 10% not shown for 35bp array)
7bp arrays form transfrags from bacterial negative regions at a ~4-5 fold lower false-positive rate than 35bp arrays
attributable to higher probe density and different min-run & max-gap rules
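One way to turn a probe false-positive rate into an intensity threshold is to take a percentile of the negative-control signals. The sketch below assumes that approach: the threshold is chosen so that no more than the stated percentage of bacterial negative-control probes exceed it. The function name and the strict "greater than" convention are illustrative assumptions, and the actual TAS procedure may differ.

```python
def threshold_for_fpr(negative_signals, fpr_percent):
    """Return a threshold t such that at most `fpr_percent` percent of the
    negative-control probes have a signal strictly greater than t."""
    if not negative_signals:
        raise ValueError("need at least one negative-control probe")
    if not 0 <= fpr_percent < 100:
        raise ValueError("fpr_percent must be in [0, 100)")
    ordered = sorted(negative_signals)
    n = len(ordered)
    allowed = n * fpr_percent // 100  # number of controls allowed above t
    return ordered[n - allowed - 1]


def observed_fpr(negative_signals, threshold):
    """Fraction of negative-control probes called positive at `threshold`."""
    hits = sum(1 for s in negative_signals if s > threshold)
    return hits / len(negative_signals)


# Hypothetical negative-control intensities, 1..100.
controls = list(range(1, 101))
thresholds = {pct: threshold_for_fpr(controls, pct) for pct in (1, 3, 5, 7, 10)}
```

Sweeping the percentage, as on the slide, yields one threshold per point on the pseudo-ROC curve. Each threshold is then used to call transfrags, and the resulting coverage is compared with annotated exons.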

23 Summary: 7bp arrays
35bp and 7bp arrays have a similar amount of bp coverage in transfrags, BUT 7bp arrays have 50-65% more transfrags
7bp transfrags are more “fragmented” and do a better job of delineating exons with small introns
7-bp array has better “resolution” of small exons
Intersection with annotations shows both 35bp and 7bp arrays are detecting similar amounts of transcription as measured by bp coverage
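The two measures compared in this summary, number of transfrags and base-pair coverage, can be computed from interval lists. The following is a minimal sketch using half-open (start, end) intervals on a single chromosome. The function names and the toy coordinates are hypothetical, and the original intersections were performed in Galaxy rather than with this code.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]


def covered_bp(intervals):
    """Total number of bases covered, counting each base once."""
    return sum(end - start for start, end in merge_intervals(intervals))


def overlap_bp(a, b):
    """Number of bases covered by both interval sets."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


# Hypothetical transfrags: one long 35bp-array call vs. two shorter 7bp-array calls.
frags_35bp = [(1000, 1600)]
frags_7bp = [(1000, 1250), (1320, 1600)]
exons = [(1000, 1240), (1330, 1600)]  # hypothetical annotation with a small intron
```

In this toy case the two arrays cover a similar number of bases inside exons, but the 7bp calls are split at the small intron, which mirrors the slide's point that similar coverage can come with more, shorter transfrags.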

24 Improved exon discrimination by transfrags from 7bp arrays

25 modENCODE RACE array methodology
5' RACE for 16,000 Drosophila genes (choice of tissues?)
hybridize products (in pools of 100) to 35bp arrays, with 1Mb separation between genes in a pool
confirm presence of transfrags
identify new, “rare” transfrags that become detectable through PCR amplification
human ENCODE project has done a similar study on the genes present on chromosomes 21 & 22
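The pooling constraint (pools of 100 genes, with genes in the same pool at least 1 Mb apart) can be illustrated with a simple greedy assignment. The slide does not say how pools were actually built, so the algorithm, function name, and gene records below are assumptions for illustration.

```python
def assign_pools(genes, pool_size=100, min_separation=1_000_000):
    """Greedily assign genes to RACE pools.

    genes: iterable of (name, chromosome, position) tuples.
    A gene joins the first pool that has room and contains no gene on the
    same chromosome closer than `min_separation` bases.
    """
    pools = []
    for name, chrom, pos in sorted(genes, key=lambda g: (g[1], g[2])):
        for pool in pools:
            if len(pool) >= pool_size:
                continue
            if all(c != chrom or abs(p - pos) >= min_separation for _, c, p in pool):
                pool.append((name, chrom, pos))
                break
        else:
            pools.append([(name, chrom, pos)])  # no compatible pool, open a new one
    return pools


# Hypothetical gene coordinates.
genes = [
    ("g1", "chr2L", 0),
    ("g2", "chr2L", 500_000),
    ("g3", "chr2L", 2_000_000),
    ("g4", "chr3R", 100),
]
pools = assign_pools(genes)
pool_names = [[name for name, _, _ in pool] for pool in pools]
```

Keeping pooled genes far apart means that a RACE product hybridizing at a given locus can be attributed to a single gene in the pool. At 100 genes per pool, 16,000 genes correspond to the 160 pools listed under Specific Aim 1.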

26 RACE Analysis of Coding Genes: DiGeorge Critical Region 14 gene. Kapranov et al. Genome Res. (2005)

27

28

29 Conclusions
array types & applications: pilot analysis of 7bp arrays; updated for dm3-release5 genome annotation (bpmaps & IGB)
sample processing pipeline & data generation: multiple applications require different types of graphs & transfrags; bandwidth0 versus smoothing (e.g. bandwidth50); RACE array lessons learned by ENCODE
QC and validation: some of the specific aims (Northerns, RACE) will address these; additional analysis such as RT-PCR and QPCR validation of novel transcripts
data hosted at affy-transcriptome website: http://transcriptome.affymetrix.com/download/modENCODE/
sharing pilot data with DCC (Nicole Washington) to facilitate the process
Steps Moving Forward
adjusting milestones?
changes in samples? (usage of 7bp versus 35bp)
shifting focus in favor of more analysis of small RNAs?
data hosting and transfer issues?
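The "bandwidth0 versus smoothing" distinction can be sketched as a sliding-window estimate over genomic position. The example uses a plain window median as a stand-in estimator, which is an assumption: the estimator actually used by TAS may differ, and the function name and data are hypothetical.

```python
from bisect import bisect_left, bisect_right
from statistics import median


def smooth_signal(positions, intensities, bandwidth):
    """Replace each probe intensity by the median of all probes whose
    position lies within +/- `bandwidth` bases of it.

    positions must be sorted ascending; bandwidth=0 returns the raw values.
    """
    smoothed = []
    for pos in positions:
        lo = bisect_left(positions, pos - bandwidth)
        hi = bisect_right(positions, pos + bandwidth)
        smoothed.append(median(intensities[lo:hi]))
    return smoothed


# Hypothetical 35bp-spaced probes with one noisy spike at position 35.
positions = [0, 35, 70, 105, 140]
intensities = [10, 100, 20, 30, 200]

raw = smooth_signal(positions, intensities, bandwidth=0)
smooth = smooth_signal(positions, intensities, bandwidth=50)
```

With bandwidth 0 each probe stands alone, which suits applications where single-probe resolution matters. A wider window suppresses isolated spikes at the cost of blurring sharp boundaries, which is why different applications may call for different graphs.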

30 Acknowledgements
Computation: S. Ghosh, H. Tammana, N. Garg, S. Dike, J. Cheng
Molecular Biology: I. Bell, J. Drenkow, E. Dumais, J. Dumais, R. Duttagupta, P. Kapranov, A. Willingham, J. Manak
AFFX Transcriptome Group
Tom Gingeras

31 supplemental slides

32 Kapranov et al. Science, v316 Jun 2007

33

34

35

36

37 same intronic expression seen by all arrays

38 value-of-probe-density

39 value-of-smoothing

40 value-of-unmasked

41 masking-issue-in-exons

42 unmasked regions are frequently higher

