Probe design for microarrays using OligoWiz

1 Probe design for microarrays using OligoWiz

2 The DNA Array Analysis Pipeline
Question, Experimental Design, Array design, Probe design, Buy Chip/Array, Sample Preparation, Hybridization, Image analysis, Normalization, Expression Index Calculation, Comparable Gene Expression Data, Statistical Analysis, Fit to Model (time series), Advanced Data Analysis (Clustering, PCA, Classification, Promoter Analysis, Meta analysis, Survival analysis, Regulatory Network)

3 Probe design for microarrays
- What is a Probe
- Different Probe Types
- OligoWiz
- Probe Design
- Cross Hybridization and Complexity
- Affinity
- Position

4 An Ideal Probe must
- Discriminate well between its intended target and all other targets in the target pool
- Detect concentration differences under the hybridization conditions

5 Probe Type comparisons
PCR products. Advantages: Inexpensive to set up. Disadvantages: Handling problems; No probe selection; Uneven probe concentrations.
Spotted Oligos. Advantages: Allows for probe selection; Easy to handle. Disadvantages: Expensive in small scale.
In situ synthesized oligonucleotide arrays. Advantages: Allows for probe selection; Fast to set up; Multiple probes per gene. Disadvantages: Expensive in large scale.

6 Custom Microarrays
When on virgin ground
Some technologies available for custom arrays:
- Spotted arrays
- In situ synthesized (NimbleExpress™ Array Program)

7 OligoWiz a Tool for flexible probe design

8 How does it work? Probe selection
1. The optimal melting temperature (Tm) for the DNA:DNA or RNA:DNA hybridization for probes of the given length is determined.
2. The optimal probe length is determined for all possible probes along the input sequence.
3. Five scores are calculated for each of these probes.
4. The best probes are selected based on a weighted sum of these scores.
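The selection step (step 4) can be sketched as follows. This is a minimal illustration, not the OligoWiz implementation: the weight values and the names `WEIGHTS`, `total_score` and `best_probe` are hypothetical, since the slides give only the order of importance of the scores and not their numeric weights.

```python
# Hypothetical weights, ordered by the importance given on the slides.
# The actual values used by OligoWiz are not stated in this presentation.
WEIGHTS = {
    "cross_hyb": 0.40,
    "delta_tm": 0.25,
    "folding": 0.15,
    "position": 0.10,
    "low_complexity": 0.10,
}

def total_score(scores):
    """Weighted sum of the five per-probe scores, each in 0.0 (bad) to 1.0 (best)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

def best_probe(candidates):
    """Return the (start, scores) pair with the highest weighted total."""
    return max(candidates, key=lambda cand: total_score(cand[1]))

candidates = [
    (0, {"cross_hyb": 0.5, "delta_tm": 0.5, "folding": 0.5,
         "position": 0.5, "low_complexity": 0.5}),
    (10, {"cross_hyb": 1.0, "delta_tm": 1.0, "folding": 1.0,
          "position": 1.0, "low_complexity": 1.0}),
]
print(best_probe(candidates)[0])
```

Because the hypothetical weights sum to 1.0, the total stays on the same 0.0 to 1.0 scale as the individual scores.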

9 The five scores
In order of importance:
- Cross-hybridization
- ∆Tm (deviation from optimal Tm)
- Folding (probe self annealing)
- Position (3’ preference)
- Low-complexity
All scores are normalized to a value between 0.0 (bad) and 1.0 (best).

10 How to Avoid cross-hybridization
From Kane et al. (2000) we learn that a 50’mer probe can detect significant false signal from a target that has >75-80% homology to the 50’mer oligo or a continuous stretch of >15 complementary bases.
If we have substantial sequence information on the given organism, we can try to avoid this by choosing oligos that are not similar to any other expressed sequences.
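The two Kane et al. thresholds can be expressed as a simple check. The sketch below is an illustration under simplifying assumptions: both sequences are given in the same orientation and have equal length, and they are compared position by position without gaps. The function names are hypothetical.

```python
def percent_identity(probe, other):
    """Percent of positions that match in an ungapped, equal-length comparison."""
    matches = sum(a == b for a, b in zip(probe, other))
    return 100.0 * matches / len(probe)

def longest_match_run(probe, other):
    """Length of the longest continuous stretch of matching positions."""
    best = run = 0
    for a, b in zip(probe, other):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best

def may_cross_hybridize(probe, other, identity_cutoff=75.0, run_cutoff=15):
    """True if either threshold quoted from Kane et al. (2000) is exceeded."""
    return (percent_identity(probe, other) > identity_cutoff
            or longest_match_run(probe, other) > run_cutoff)

# A 50'mer sharing a continuous 16-base stretch with a non-target is flagged
# even though the overall identity is only 32%.
probe = "A" * 16 + "C" * 34
non_target = "A" * 16 + "G" * 34
print(may_cross_hybridize(probe, non_target))
```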

11 Hughes et al. 2001 Probe Specificity

12 Mapping Regions without similarity to other transcripts
[Figure: the sequence we want to design a probe for, drawn 5’ to 3’ with a 50 bp scale; BLAST hits >75% & longer than 15 bp are marked along it, and the remaining stretches are the regions suitable for probes.]

13 Filtering Self Detecting BLAST hits out
Some BLAST hits come from sequences identical or very similar to the query sequence. Therefore no BLAST hits with homology > 97% and with a ‘hit length vs. query length’ ratio > 0.8 are considered.
[Figure: the sequence we want to design an oligo for, drawn 5’ to 3’ with a 50 bp scale; BLAST hits >75% & longer than 15 bp are marked along it.]

14 Cross-hybridization expressed as a score
Only BLAST hits that passed filtering are considered. Let m be the number of BLAST hits considered in position i, let h = (h_1^i, ..., h_m^i) be the BLAST hits in position i in the oligo, and let n be the length of the oligo.
[Figure: an oligo with the BLAST hits stacked beneath it; the maximum hit in position i is read on a scale from 0 to 100%.]
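One way to turn these definitions into a number is sketched below. The formula is an assumption consistent with the symbols on the slide, not a quotation of it: the score is taken as 1 minus the sum over positions of the maximum hit identity, divided by 100 times n. The self-hit filter uses the thresholds from slide 13; the function names and the hit tuple layout are hypothetical.

```python
def is_self_hit(identity, hit_len, query_len):
    """Slide 13 filter: drop hits >97% identical that cover >0.8 of the query."""
    return identity > 97.0 and hit_len / query_len > 0.8

def cross_hyb_score(oligo_len, hits):
    """Assumed form: 1 - sum_i max(h^i) / (100 * n).

    hits: iterable of (start, end, percent_identity) in oligo coordinates,
    0-based with end exclusive. Positions without hits contribute 0.
    """
    max_hit = [0.0] * oligo_len
    for start, end, identity in hits:
        for i in range(max(0, start), min(oligo_len, end)):
            max_hit[i] = max(max_hit[i], identity)
    return 1.0 - sum(max_hit) / (100.0 * oligo_len)

# A 50'mer with one 80% hit covering half of it scores 0.6 under this form.
print(cross_hyb_score(50, [(0, 25, 80.0)]))
```

With this form, an oligo without any remaining hits scores 1.0 (best) and an oligo fully covered by a 100% hit scores 0.0, which matches the 0.0 to 1.0 convention used for all five scores.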

15 Similar Affinity for all oligos
Another way of ensuring optimal discrimination between target and non-target under hybridization is to design all the oligos on an array with similar affinity for their targets. This will allow the experimentalist to optimize the hybridization conditions for all oligos by choosing the right hybridization temperature and salt concentration.
Commonly, Melting Temperature (Tm) is used as a measure for DNA:DNA or RNA:DNA hybrid affinity.

16 Melting Temperature difference
Where ΔH (kcal/mol) is the sum of the nearest neighbor enthalpy changes, A is a constant for helix initiation corrections, ΔS is the sum of the nearest neighbor entropy changes, R is the gas constant (1.987 cal deg⁻¹ mol⁻¹) and Ct is the total molar concentration of strands.
Where N is the number of oligos in all sequences.
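A form of the nearest-neighbor equation that uses exactly these symbols is Tm = 1000·ΔH / (A + ΔS + R·ln(Ct/4)) − 273.15, with the factor 1000 converting kcal to cal. The sketch below assumes that form, assumes A = -10.8 as the helix initiation constant, and takes the ΔTm of an oligo as its absolute deviation from the mean Tm over all N oligos. Any salt correction is left out, and ΔH and ΔS are passed in directly rather than summed from a parameter table.

```python
import math

R = 1.987  # gas constant, cal deg^-1 mol^-1 (value given on the slide)

def melting_temp_c(delta_h_kcal, delta_s_cal, strand_conc, helix_init=-10.8):
    """Tm in degrees Celsius under the assumed nearest-neighbor form.

    delta_h_kcal: summed nearest-neighbor enthalpy (kcal/mol)
    delta_s_cal:  summed nearest-neighbor entropy (cal deg^-1 mol^-1)
    strand_conc:  total molar strand concentration Ct
    helix_init:   constant A; -10.8 is an assumed value, not from the slide
    """
    denom = helix_init + delta_s_cal + R * math.log(strand_conc / 4.0)
    return 1000.0 * delta_h_kcal / denom - 273.15

def delta_tm(tms):
    """Absolute deviation of each Tm from the mean over all N oligos."""
    mean_tm = sum(tms) / len(tms)
    return [abs(tm - mean_tm) for tm in tms]

# Illustrative inputs only; they are not measured values.
print(round(melting_temp_c(-100.0, -270.0, 1e-6), 1))
print(delta_tm([60.0, 70.0, 80.0]))
```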

17 Tm distributions for 30’mers and 50’mers

18 ΔTm Distribution for probe length intervals

19 Avoid self annealing oligos
Sensitivity may be influenced.
Probes that form strong hybrids with themselves, i.e. probes that fold, should be avoided. But accurate folding algorithms, like the ones employed by mFOLD or RNAfold, are too time consuming for large scale folding of oligos.
Time consumption: mFOLD ~2 sec / 30’mer; per gene (500 bp, i.e. 471 candidate 30’mers) ~16 min.

20 Folding an oligonucleotide: an approximation
Dynamic programming: alignment to inverted self
The alignment is based on dinucleotides
Substitution matrix is based on binding energies
[Figure: alignment matrix of the oligo's dinucleotides (AT TG CT ... CG GT TT) against its inverted self, with the minimal loop size border marked.]
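The idea of comparing an oligo against itself under a minimal loop size can be illustrated with a much simpler stand-in. The sketch below is not the dinucleotide, energy-based alignment described on the slide: it only counts the longest perfect Watson-Crick stem, one base at a time, with a hypothetical `min_loop` parameter and function name.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def longest_hairpin_stem(seq, min_loop=3):
    """Longest run of nested base pairs (i, j), (i+1, j-1), ... in seq.

    Every pair must leave at least min_loop unpaired bases between
    its two partners, which plays the role of the minimal loop size border.
    """
    n = len(seq)
    stem = [[0] * n for _ in range(n)]  # stem[i][j]: stem length starting at pair (i, j)
    best = 0
    for span in range(min_loop + 1, n):       # span = j - i, shortest allowed first
        for i in range(n - span):
            j = i + span
            if COMPLEMENT.get(seq[i]) == seq[j]:
                stem[i][j] = 1 + stem[i + 1][j - 1]
                best = max(best, stem[i][j])
    return best

print(longest_hairpin_stem("GGGGAAAACCCC"))  # four G-C pairs closing an AAAA loop
print(longest_hairpin_stem("AAAAAAAAAAAA"))  # no complementary bases at all
```

A long stem suggests the probe may fold back on itself; a real folding score would weight the pairs by binding energy, as the slide describes.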

21 Folding a lot of oligos: a fast heuristic implementation
Full dynamic programming calculation for first probe
Dynamic programming calculation for second etc. probe
[Figure: super-alignment matrix of the sequence's dinucleotides (AT TG CT ... CG GT TT) against itself, showing the first probe, the following probes, the last probe and the minimal loop size border.]

22 Reasonable folding prediction compared to mFOLD

23 Probes With Very Common sub sequences may result in unspecific signal
If the sub-fractions of an oligo are very common we define it as ‘low-complex’
Oligo with low-complexity: AAAAAAAGGAGTTTTTTTTCAAAAAACTTTTTAAAAAAGCTTTAGGTTTTTA (Human)
Oligo without low-complexity: CGTGACTGACAGCTGACTGCTAGCCATGCAACGTCATAGTACGATGACT (Human)

24 Low-complexity expressed as a score
For a given transcriptome a list of information content from all ‘words’ with length wl (8 bp) is calculated, where f(w) is the number of occurrences of a pattern and tf(w) is the total number of patterns of length wl.
A low-complexity score for a given oligo is defined as: Low-complexity = 1 - norm, where norm is a function that normalizes to between 1 and 0, L is the length of the oligo and w_i is the pattern in position i.
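The overall shape of this score can be sketched as follows. Two parts are stand-in assumptions, because the slide's formulas are not reproduced in the text: each word is weighted by its relative frequency f(w)/tf(w) instead of the slide's information content, and norm divides by the value an oligo made only of the most common word would get. All names are hypothetical, and the toy example uses wl = 3 instead of 8 to stay small.

```python
from collections import Counter

def word_frequencies(transcriptome, wl=8):
    """Relative frequency f(w)/tf(w) of every word of length wl."""
    counts = Counter(
        seq[i:i + wl]
        for seq in transcriptome
        for i in range(len(seq) - wl + 1)
    )
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def low_complexity_score(oligo, freqs, wl=8):
    """1 - norm(sum of word weights over all positions i of the oligo).

    Stand-in normalization: divide by the sum an oligo would get if every
    one of its L - wl + 1 words were the most common word in the list.
    """
    n_words = len(oligo) - wl + 1
    total = sum(freqs.get(oligo[i:i + wl], 0.0) for i in range(n_words))
    worst = n_words * max(freqs.values())
    return 1.0 - min(1.0, total / worst)

freqs = word_frequencies(["AAAAAAAA", "ACGTACGT"], wl=3)
print(low_complexity_score("AAAAA", freqs, wl=3))  # only the most common word
print(low_complexity_score("GGGGG", freqs, wl=3))  # only unseen words
```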

25 Location of Oligo within transcript
Labeling includes reverse transcription of the mRNA and is sensitive to:
- RNA degradation
- Premature termination of cDNA synthesis
- Premature termination of cRNA transcription (IVT)
Typically eukaryote sample labeling is done by poly-T and bacterial samples by random labeling.
Eukaryote position score: 3’ preference
Prokaryote position score: preference toward 3’, but avoid the ~50 most 3’ bases

26 Species databases
Databases for 398 species are currently available.
The species databases are built from complete genomic sequences, or from UniGene collections in the case of vertebrates.
The databases are used for:
- Cross hybridization
- Low-complexity

27 Sequence Features: Intron/Exon structure, UTR regions etc.
- Special purpose arrays
- Example: Detecting differential splicing
[Figure: Exon, Intron, Exon]

28 Annotation String - single letter code
Sequence:   ATGTCTACATATGAAGGTATGTAA
Annotation: (EEEEEEEEEEEEEE)DIIIIIII
E: Exon
I: Intron
(: Start of exon
): End of exon
D: Donor site
A: Acceptor site

29 Probe placement using Regular Expressions search in annotation
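As an illustration, the sequence and annotation string from slide 28 can be searched with Python's standard `re` module. The rule syntax that OligoWiz itself accepts is not shown in this presentation, so the patterns and the helper name below are assumptions chosen to demonstrate the idea.

```python
import re

sequence   = "ATGTCTACATATGAAGGTATGTAA"
annotation = "(EEEEEEEEEEEEEE)DIIIIIII"

def exon_probe_starts(annotation, probe_len):
    """Start positions where a probe of probe_len lies entirely in an exon.

    '(' and ')' mark the first and last exon base, so they count as exon.
    The lookahead lets overlapping placements all be reported.
    """
    pattern = re.compile(r"(?=[(E)]{%d})" % probe_len)
    return [m.start() for m in pattern.finditer(annotation)]

starts = exon_probe_starts(annotation, 10)
print(starts)
print(sequence[starts[0]:starts[0] + 10])

# A probe spanning the exon/intron border: four exon bases, the exon end,
# the donor site and four intron bases.
junction = re.search(r"E{4}\)DI{4}", annotation)
print(junction.start(), sequence[junction.start():junction.end()])
```

Because the annotation string has one character per base, any match position in the annotation is directly a position in the sequence.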

30 Probe placement algorithm: The use of filters
Filters: Total score cut-off, Region include, Region exclude, Oligo include, Oligo exclude, Combined filter
[Figure: filters and Total score values as seen by the placement algorithm; combined filter & score.]

31 Extracting annotation from GenBank files
- FeatureExtract server
- www.cbs.dtu.dk/services/FeatureExtract

32 Exercise
Running OligoWiz 2.0 (Java 1.4.1 or better is required)
Input data:
- Sequence only (FASTA)
- Sequence and annotation
Rule-based placement of multiple probes:
- Distance criteria
- Annotation criteria
Please go to the exercise web-page linked from the course program

