cDNA Sequencing, SAGE and Microarray Analysis

Presentation on theme: "cDNA Sequencing, SAGE and Microarray Analysis"— Presentation transcript:

1 cDNA Sequencing, SAGE and Microarray Analysis

2 Outline Overview of transcription Construction of cDNA libraries
cDNA sequencing Expression analysis via SAGE Microarray construction and their use in expression analysis.

3 We can isolate mRNA and convert it to a stable form (cDNA)
The “Central Dogma” of Molecular Biology: DNA → mRNA → protein. cDNA’s: isolate mRNA, reverse transcribe, label.

4 Genome in numbers Nucleic acid content of an average human cell
Abundance distribution of mRNA species in a typical mammalian cell

5 Isolation of mRNA

6 cDNA library construction The big picture

7 cDNA synthesis cDNA synthesis occurs in 5’ to 3’ direction, requires:
– a template – nucleotides (dNTPs) – reverse transcriptase (retroviral polymerase) – a primer to initiate synthesis

8 Priming alternatives for cDNA construction

9 Cloning: Blunt end vs. sticky

10 cDNA library construction

11 Ligation of cDNA into vector

12 cDNA library Ideally contains at least one copy of every expressed gene. The probability of this is a function of: – fragment size: the longer the fragments, the more likely a gene is represented – genome size: smaller genome = increased chance that a gene is represented – expression: high expression = high likelihood that a gene is represented. For 99% probability, a mammalian cDNA library must contain ~800,000 clones.
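
The slide does not say how the ~800,000 figure is derived. A commonly used estimate is N = ln(1 - P) / ln(1 - f), where P is the desired probability and f is the fraction of the mRNA pool made up by the transcript of interest. The sketch below applies it; the abundance of 1 in 174,000 is an assumed value chosen only to illustrate a rare transcript, not a number from the slides.

```python
import math


def clones_needed(probability: float, fraction: float) -> int:
    """Number of clones needed so that a transcript making up `fraction`
    of the mRNA pool appears at least once with the given probability."""
    if not (0 < probability < 1 and 0 < fraction < 1):
        raise ValueError("probability and fraction must be between 0 and 1")
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Hypothetical rare transcript: 1 out of every 174,000 mRNA molecules.
rare_fraction = 1 / 174_000
print(clones_needed(0.99, rare_fraction))   # on the order of 800,000 clones

# A more abundant transcript (1 in 1,000) needs far fewer clones.
print(clones_needed(0.99, 1 / 1_000))
```

Under this formula the library size scales roughly with 1/f, which is why low-abundance transcripts dominate the clone count.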

13 cDNA sequencing The advent of cDNA cloning, combined with the creation of automated sequencers, led to efforts to sequence the entire human transcriptome and to create arrays (on filters) of cDNAs (see reading materials). cDNA sequencing was viewed as the fastest way to get at the coding portion of the genome. Numerous companies sprang up to sequence and patent cDNA’s. cDNA sequencing was also used to measure gene expression levels.

14 cDNA sequencing --> expression analysis
Expression level estimates: Count the number of occurrences of a given cDNA sequence in a given library - highly expressed genes will have been sequenced more often. Use the above (in combination with the total number of sequences in the library) to estimate expression level.
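
A minimal sketch of the counting approach described above, assuming each sequenced clone has already been assigned to a gene. The gene names and counts are made-up illustration data.

```python
from collections import Counter


def expression_estimates(clone_genes):
    """Estimate relative expression as (clones for a gene) / (all clones)."""
    counts = Counter(clone_genes)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


# Hypothetical library of 10 sequenced clones, each already mapped to a gene.
library = ["ACTB"] * 6 + ["GAPDH"] * 3 + ["TP53"]
for gene, freq in expression_estimates(library).items():
    print(gene, freq)
```

With only ten clones, a gene seen once cannot be distinguished from one expressed ten times lower, which is the low-abundance problem raised later in the deck.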

15 Ex. PEDB (

16 Web based expression analysis -
counting cDNA frequency

17 Serial Analysis of Gene Expression (SAGE)
Concept: cDNA sequencing is expensive. Most mRNA species can be uniquely identified by a short sequence at a defined location in the gene (9 bp tags are unique 95% of the time). If we could produce a library of these short sequences and ligate them together, then we could sequence the ligated DNA to measure the abundance of each gene's transcripts more efficiently.
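
The tag idea can be sketched as follows. The slide only says the tag comes from "a defined location"; this sketch assumes that location is the 9 bp immediately following the 3'-most CATG site (the NlaIII recognition sequence often used as the anchoring site in SAGE). The sequences are made up for illustration.

```python
from collections import Counter

ANCHOR = "CATG"   # assumed anchoring site; not stated on the slide
TAG_LEN = 9       # tag length given on the slide


def extract_tag(cdna, anchor=ANCHOR, tag_len=TAG_LEN):
    """Return the tag following the 3'-most anchor site, or None if absent."""
    pos = cdna.rfind(anchor)
    if pos == -1:
        return None
    start = pos + len(anchor)
    tag = cdna[start:start + tag_len]
    return tag if len(tag) == tag_len else None


def count_tags(cdnas):
    """Count how often each tag occurs across a collection of transcripts."""
    tags = (extract_tag(seq) for seq in cdnas)
    return Counter(tag for tag in tags if tag is not None)


# Number of distinct 9 bp tags that exist in principle.
POSSIBLE_TAGS = 4 ** TAG_LEN   # 262,144

transcripts = ["GGCATGAAACCCGGGTTTAAAA"] * 4 + ["TTCATGGGGTTTAAACCC"]
print(count_tags(transcripts))
```

In a real experiment the tags would be read from concatenated ditags rather than from full transcripts; the counting step is the same.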

18 SAGE diagram
Linker: Primer A/B - TypeII site – Type I site. [Diagram labels: Primer A, Primer B, “Sequence these ->”]

19 Issues with SAGE (and cDNA sequencing for expression analysis)
Low abundance clones. SAGE: in 1995, the estimate was that characterizing genes represented by <100 mRNA’s/cell would take a single investigator a few months of work (maybe 10 times quicker today). Cost: if we assume even a low estimate of $6/sequencing reaction, 96 lanes * 4 runs/day * 30 days * $6 = ~$69,000 to measure ~460,000 tags (assuming 40 tags per sequencing reaction). cDNA sequencing: same problem, costs/time maybe times higher. Hence expression information about low abundance clones is not accurate in cDNA or SAGE data in most cases. Leading to the advent of arrays…
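
The cost arithmetic on this slide can be checked directly. The sketch below uses only the slide's figures and reads "40 tags" as 40 tags per sequencing reaction (lane), since that is what reproduces the ~460,000 total.

```python
LANES_PER_RUN = 96
RUNS_PER_DAY = 4
DAYS = 30
COST_PER_REACTION = 6      # dollars, the slide's "low estimate"
TAGS_PER_REACTION = 40     # interpretation of "40 tags/run" (see note above)

reactions = LANES_PER_RUN * RUNS_PER_DAY * DAYS
total_cost = reactions * COST_PER_REACTION
total_tags = reactions * TAGS_PER_REACTION
cost_per_tag = total_cost / total_tags

print(reactions, total_cost, total_tags, cost_per_tag)
```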

20 DNA Hybridization

21 Taking advantage of DNA hybridization
On the surface In solution After Hybridization 4 copies of gene A, 1copy of gene B A B A B

22 DNA Arrays Spots of DNA arranged in a particular spatial arrangement on a solid support. Supports: filters (nylon, nitrocellulose), glass, silicon. Types: spotted or placed (pre-synthesized DNA put onto a surface); synthesized (DNA synthesized directly on the surface).

23 The Original DNA Array Petri dish with bacterial colonies
Apply membrane and lift to make a filter containing DNA from each clone. Probe and image to identify Clones homologous to the probe.

24 Vicki - A manual Gridding tool
Gridding tool modifications by : Michèl Schummer

25 Vicki and the gridding frame
Frame Design by: Michèl Schummer

26 Robotic Spotters for Filters

27 Types of filter based arrays
PCR products: ORFs or cDNAs. Oligos: sometimes, but generally not used for short products, since oligos do not immobilize well on membranes. Living clones: place the membrane on Whatman paper soaked in media to grow colonies directly on the arrays; lysis of the colonies followed by cross-linking produces DNA arrays; good for screening large libraries.

28 Uses for Filter Based Arrays
In general, filter based arrays were in vogue about 8-13 years ago in the pre-genomic days. Typically cDNA libraries were spotted as clones and the arrays were used to perform comparative expression analysis. Detection was typically performed with radioactive labeling/film or phosphorimaging. “Interesting clones” were identified (via differential expression) and then sequenced. For genomes that have not yet been sequenced, this can still be a cost effective approach, but rapid sequencing is changing that.

29 Selected cDNA arrays With unselected cDNA libraries, clones for highly expressed genes are over-represented on the arrays. As time progressed, a large number of cDNA’s were sequenced, and hence it became possible to select unique cDNA’s and to make arrays on which each spot represented a single gene. Around the same time, coatings for glass were developed that retained spotted DNA well. This allowed arrays to be produced on glass microscope slides, which in turn allowed for fluorescence based detection technology.

30 Typical Path for cDNA clone acquisition
Unigene Image Consortium +others Sequence cDNA’s Reduce redundancy sequences Gen- Bank clones Livermore, ATCC Sequence verified sets Commercial distributors Res. Genetics, InCyte, others Unigene sets Sequence checking Us

31 Spotted Arrays Spotting “pen” Reactive surface or coated surface
Drop containing DNA in solution C A G T C A G T Reactive surface or coated surface

32 MD GenIII Arrayer
Plate hotel holds twelve 384-well plates. Gridding head: 12 pins. Slide holder: 36 slides. Features: 36 slides in 8 hours; 7680 genes spotted in duplicate; built-in humidity control.

33 Glass slides enabled fluorescent detection in 2 (or more) colors
Cell Population #1 and Cell Population #2: extract mRNA from each, make cDNA, and label one with a green fluor and the other with a red fluor. Co-hybridize to a slide with DNA from different genes, then scan.

34 Spotted arrays Initially, most spotted arrays were produced by spotting PCR products produced from selected cDNA clones. Issues: must have the libraries in hand; must not mix clones up; must perform high throughput PCR to produce DNA to spot (again without mixing things up); LOTS of freezer space to store everything; cDNA’s are long and cross hybridization is a problem (although it is possible to spot oligo’s); quality manufacturing is difficult to maintain.

35 Oligo Arrays Synthesized or spotted arrays of short oligos of chosen sequence. (typically base pairs) Synthesis methods - ink jet, light directed. Spotting using reactive coupling. Used for re-sequencing, genotyping, diagnostics and expression arrays. MUCH better than cDNA arrays to distinguish related sequences Only have to store the DNA’s OR (better yet) if you synthesize DNA directly on the surface, you only need to store the sequence information (and a few reagents)

36 Basic Oligo Synthesis
[Diagram: bases attached to a glass support; steps labeled Coupling, Remove Protecting Group, Add Next Nucleotide.] The protecting group in the standard chemistry is a DMT, which is removable (convertible to OH) by treatment with acid. The surface attachment is necessary to allow unreacted reagents to be washed away prior to removing the next protecting group. When you order oligos, this is how they are synthesized, and the surface is glass beads that are retained in a column.

37 Ink-jets Can be Used to Direct Small Volumes of Liquids to Specific Sites

38 Agilent InkJet Array Technology
[Diagram labels: Resistor Off; Liquid Vaporizes; Gas Expands; Resistor On; Drop Breaks Off; Reservoir Refills; Fill < 1 msec.] ~44,000 features on a 1”x3” slide. The inkjet microarray manufacturing process, which uses standard high efficiency phosphoramidite chemistry, provides Agilent precision printing with uniform feature morphology. This flexibility and precision allows us to significantly increase the number of probes printed on a standard microarray. We have increased our layout from 22K to 44K. If, instead of using ink, one fills the reservoirs with different nucleotides, inkjets can be used to make DNA on a surface.

39 Glass Can be Treated to Produce Hydrophilic “Wells”
By pre-treating the surface to create hydrophilic regions surrounded by a hydrophobic surface, small reaction “wells” can be created. This lowers the requirement for alignment of the inkjet nozzles and also helps to confine the droplets applied to the surface.

40 Agilent Printing Facility
Same chemistry as on ABI synthesizer happens in tiny little droplets inkjetted onto the arrays to build up the oligos one base at a time. We have quite a lot of flexibility on the number of features and arrays per slide. This photo shows the latest format we’ve developed which contains 44K features on 1x3” slide.

41 Light-directed oligo synthesis
The key to this synthesis method was the development (by Steve Fodor) of photolabile protecting groups. This development allows for photolithographic technologies to be applied to chemical synthesis.

42 Number of different DNA sequences as a function of photolithographic resolution

43 All possible oligos can be made in 4*N steps
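
The reasoning behind the 4*N figure is that each added position needs at most four masked coupling steps (one each for A, C, G and T), while the number of distinct sequences grows as 4^N. A small sketch of the two quantities; the function names are illustrative only.

```python
def synthesis_steps(length: int) -> int:
    """Upper bound on coupling steps: four (A, C, G, T) per position."""
    return 4 * length


def possible_sequences(length: int) -> int:
    """Number of distinct oligos of the given length."""
    return 4 ** length


for n in (8, 25):
    print(n, synthesis_steps(n), possible_sequences(n))
```

For 25-mers this is 100 steps against roughly 10^15 possible sequences; in practice the number of features is limited by photolithographic resolution, as the previous slide notes.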

44 Affymetrix Platform Each gene is represented by 11 probe pairs of 25 bp oligos. Each probe pair contains a perfect match and a mismatch to the gene sequence. The target sample is labeled with a biotinylated nucleotide and detected via a streptavidin-phycoerythrin conjugate. One sample per array, one-color data.

45 Affymetrix Expression Data
Data from the 11 probe pairs are used to calculate an aggregate signal for each gene.
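
The slides do not say how the aggregate is computed. As a rough illustration only, and not Affymetrix's actual algorithm, the sketch below takes the mean of the perfect match minus mismatch differences across the probe pairs. The intensity values are made up.

```python
from statistics import mean


def average_difference(perfect_match, mismatch):
    """Mean of (PM - MM) over the probe pairs of one gene.

    A simplified stand-in for an aggregate signal, assumed for illustration.
    """
    if len(perfect_match) != len(mismatch) or not perfect_match:
        raise ValueError("need equal, non-empty lists of PM and MM intensities")
    return mean([pm - mm for pm, mm in zip(perfect_match, mismatch)])


# Hypothetical intensities for the 11 probe pairs of one gene.
pm = [100 + 10 * i for i in range(11)]   # 100, 110, ..., 200
mm = [50] * 11
print(average_difference(pm, mm))
```

Production methods typically use more robust statistics than a plain mean, since a single cross-hybridizing probe can distort it.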

46 Strategies For Array Design
[Diagram labels: Known Exons, Unknown transcript.] Surrogate strategy: most expression arrays to date. Annotation strategy: exon arrays, splice variants. Tiling strategy: unbiased look at the genome. Shift in thinking about how to interrogate the genome. Chip capacity allows for an unbiased approach. This also compares and contrasts previous strategies (expression arrays) with currently possible approaches (exon and tiling). Only an unbiased approach, as used in tiling arrays, investigates previously unannotated regions of the genome and therefore enables new discovery of binding sites or transcribed regions. This is a paradigm shift in how people look at the genome. Previously one had to rely on predictions and annotations.

47 Affymetrix Platform
Expression arrays: Human, Mouse, Rat, Yeast, E. coli, Drosophila, C. elegans, Dog, Soybean, Plasmodium, Anopheles, Pseudomonas, Arabidopsis, Zebrafish, Xenopus, etc. Exon arrays: alternative splicing patterns. Mapping arrays: SNP analysis, loss of heterozygosity. Tiling array sets: transcript mapping. Custom arrays.

48 Issues with synthesized oligos
Repetitive yield: for each reaction cycle, what percentage of the oligos react as intended? Estimated at 95% for the light directed method, 98-99% for the ink jet method: (0.95)^20 = 35.8%, (0.98)^20 = 67%. Net result: Affy arrays are usually 25-mers, ink jet arrays are usually 60-mers. For a single oligo, it can be shown that sensitivity plateaus at 50-70 bp.
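
The yield figures follow from raising the per-cycle yield to the number of cycles. The sketch below reproduces the slide's 20-cycle numbers and, as an extension not shown on the slide, applies the same formula to the 25-mer and 60-mer lengths it mentions.

```python
def full_length_fraction(step_yield: float, cycles: int) -> float:
    """Fraction of oligos that reacted as intended in every cycle."""
    return step_yield ** cycles


# The slide's own examples (20 cycles).
print(round(full_length_fraction(0.95, 20), 3))
print(round(full_length_fraction(0.98, 20), 3))

# Same formula at the typical probe lengths mentioned on the slide.
print(round(full_length_fraction(0.95, 25), 3))   # light directed, 25-mer
print(round(full_length_fraction(0.98, 60), 3))   # ink jet, 60-mer
print(round(full_length_fraction(0.99, 60), 3))   # ink jet, 60-mer, upper estimate
```

At 95% per cycle a 60-mer would be almost entirely truncated product, which is consistent with the light directed arrays using shorter probes.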

49 Relative merits of different methods of making oligo arrays
Affy: available first, large catalogue, small feature size possible. Inkjet: much more flexible to design. Spotted: less practical for large numbers (> a few 100) of oligo’s, but can be made with std. spotting equipment. Libraries of oligos exist for more common organisms, so oligo deposition is feasible for some organisms.

50 Illumina’s Bead Arrays
Step 1 - Synthesize beads in batches, each batch with a sequence on it (e.g. ACGTGTCTACAGT, TGCATCAGTGCA, CGTGTATGCATGT, ATGCACTGTAGT). Generally, color code the beads to keep track of which one has what molecule on it. Step 2 - Etch the ends of optical fibers in a bundle, or circular spots on a glass slide, to create bead sized depressions.

51 Illumina’s Bead Arrays (cont)
Step 3 - Allow beads to self assemble into an array on the end of the fibers or on the surface. These self assembled arrays can be used for the same applications as other DNA arrays. Since the assembly is random, one must over-represent each desired oligo 10’s of times to assure that each oligo is represented at least n times on the array. Decoding can also be accomplished by hybridizing short labeled oligos to the oligos on each bead. In practice, this is how it is usually done. We will discuss Illumina technology in more detail later in the course in the context of genotyping. See
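
One way to see why over-representation is needed is to model the number of beads of a given type that land on the array as a Poisson random variable. That model is an assumption made for this sketch, not something stated in the slides, and the mean copy numbers are illustrative.

```python
import math


def prob_at_least(n: int, mean_copies: float) -> float:
    """P(X >= n) for X ~ Poisson(mean_copies)."""
    below = sum(
        math.exp(-mean_copies) * mean_copies ** k / math.factorial(k)
        for k in range(n)
    )
    return 1 - below


# Chance that a given bead type appears at least 5 times on the array,
# for different average levels of over-representation.
for mean_copies in (5, 10, 30):
    print(mean_copies, prob_at_least(5, mean_copies))
```

With an average of only 5 copies, a sizeable share of bead types would fall below 5 copies, while at 30 copies essentially every type clears that threshold.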

52 Detection technologies
Radiolabeled probes: film or phosphorimagers. Biotin labeled: post-hyb with SA labeled with a fluor or an enzyme. Fluorescent probes: confocal scanning.

53 Scanning with a confocal microscope
The dichroic mirror preferentially reflects laser light but allows the fluorescence to pass through. By using a pinhole near the detector, the depth of field is controlled (i.e. the detected fluorescence comes from very near the surface of the slide).

54 Expression Array Analysis

55 2-color Microarray Overview
Prepare fluorescently labeled probes (control, test). Hybridize, wash. Measure fluorescence in 2 channels (red/green). Analyze the data to identify patterns of gene expression. Slide from John Quackenbush, Dana Farber

56 1-color Microarray Overview
Prepare fluorescently labeled probes (control, test). Hybridize, wash. Measure fluorescence in 1 channel. Analyze the data to identify patterns of gene expression. [Figure labels: Weed, Bush.] Slide adapted from John Quackenbush, Dana Farber

57 2-color vs. single color 2-color was originally designed due to problems in making reproducible arrays, i.e. the ratio on a spot is more reproducible than the absolute intensity if the spot size/concentration changes from array to array. With 2 colors, you don’t necessarily get twice as much data, since it is typical to run an extra array in the inverted color scheme. Experimental design and cross experiment comparisons are much more complicated with 2-color arrays.
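
A small sketch of why the ratio is more reproducible than the absolute intensity, and of how an inverted-color (dye-swap) array can be combined with the original. The intensities, the spot scale factor and the additive dye-bias model are assumptions made for illustration.

```python
import math


def log_ratio(red: float, green: float) -> float:
    """log2 of the red/green intensity ratio for one spot."""
    return math.log2(red / green)


def dye_swap_average(forward: float, swapped: float) -> float:
    """Combine a log ratio with its dye-swapped counterpart.

    forward = log2(test/control) measured with test in red;
    swapped = log2(control/test) measured with the dyes inverted.
    A constant dye bias added to both cancels in the difference.
    """
    return (forward - swapped) / 2


# Same 2-fold difference on two arrays whose spots differ in size:
# the second array's spot holds half as much DNA (scale factor 0.5).
array_a = (2000.0, 1000.0)
array_b = (2000.0 * 0.5, 1000.0 * 0.5)
print(log_ratio(*array_a), log_ratio(*array_b))   # absolute values differ, ratios agree

# Hypothetical dye bias of +0.2 on a true log ratio of 1.0.
print(dye_swap_average(1.0 + 0.2, -1.0 + 0.2))
```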

58 Expression Arrays are a Natural Extension of Genomic Analysis
Genome studies provide the source material for the arrays, e.g. clones or manufactured DNA’s. For completely sequenced genomes, arrays allow a comprehensive survey of gene expression. This level of analysis is a revolution in biology.

59 Expression Arrays Have a Broad Range of Applicability
Cancer studies - tumor vs. normal. Infectious disease studies - host response to infection, infectious agent gene expression, viral diversity. Pharmaceutical studies - drug treated vs. non-treated. Environmental - microbial diversity, effects of toxins, effect of growth conditions.

60 Expression Arrays Have a Broad Range of Applicability
Gene specific studies - deletion (“knockout”) vs. normal, over expression vs. normal. Agricultural studies - effects of pesticides, growth conditions, hormones. Developmental biology - cells from different areas/stages of developing organisms Many others - any two samples of interest can be compared.

61 Challenges for Planning Good Array Experiments
Experimental design: Replicates are necessary and expensive. A simple experiment may not give a simple answer. What comparisons should be made? Data analysis: How will differentially expressed genes be identified? How will errors be estimated? What software does this best? How will the data be mined?

62 Where are arrays going? As sequencing gets cheaper and cheaper, most assays that are currently done by arrays can be done more effectively by sequencing. Hence, the analytical use of arrays will be replaced by sequencing. However, arrays can also be used to enrich for specific genomic regions upstream of sequencing or can be used to create many sequences for the artificial production of genomes or genomic regions.
