Design of oligonucleotides for microarrays and perspectives for design of multi-transcriptome arrays Henrik Bjorn Nielsen, Rasmus Wernersson and Steen Knudsen



Design of oligonucleotides for microarrays and perspectives for design of multi-transcriptome arrays
Henrik Bjorn Nielsen, Rasmus Wernersson and Steen Knudsen
Nucleic Acids Research, 2003, Vol. 31, No –3496
Speaker: Chui-Wei Wong
Advisor: 薛 佑 玲, PhD
Institute of Biomedical Science

2 Outline
Introduction
Method
Designing Oligonucleotides
Result
Discussion

3 Introduction
Center for Biological Sequence Analysis (CBS), Technical University of Denmark
Conducts basic research in the field of bioinformatics and systems biology
Research groups
–molecular biologists
–biochemists
–medical doctors
–physicists
–computer scientists

4 Introduction
OligoWiz
Designs oligonucleotides of 20–70 bp
Evaluates input sequences according to a collection of parameters and displays the scores graphically
Supports oligonucleotides that can detect transcripts from multiple organisms

5 Introduction
OligoWiz is implemented as a client–server solution
The server is responsible for the calculation of the scores
Freely available
OligoWiz web page:

6 Method
Client
–written in Java
–runs on Mac OS X, Linux and Windows
Server
–developed on an SGI Unix system
–written in Perl 5
Uses the BLAST program for homology searches against the database
Parallelized using the Perl module ChildManager

7

8 Download Java

9

10 Designing Oligonucleotides
Cross-hybridization
ΔTm
Position within transcript
Low-complexity filtering
GATC-only score

11 Cross-hybridization
To avoid cross-hybridization, the affinity difference between the intended target and all other targets should ideally be maximized
Experimental evidence suggests that a significant false signal can be detected
–if a 50 bp oligonucleotide has >75–80% of its bases complementary to a false target
–if continuous stretches of >15 bp are complementary to a false target

12 Homology score
m is the number of BLAST hits considered at position i of the oligonucleotide
h = {h_1i, ..., h_mi} are the BLAST hits at position i
L is the length of the oligonucleotide
A BLAST hit along the full length of the oligonucleotide gives a
–score of 0 at 100% identity
–score of 1 at 0% identity (no homology)
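The formula itself did not survive on this slide, so the following is only a minimal sketch of one scoring rule that is consistent with the definitions and the two end points above. It assumes that each BLAST hit is expressed as a percent identity (0 to 100) at every oligonucleotide position it covers, and that the strongest hit at each position is the one that counts. The function name and the input layout are hypothetical and are not taken from OligoWiz.

```python
def homology_score(hit_identities_per_position):
    """Return a score in [0, 1]: 0 = a full-length 100% identical hit, 1 = no homology.

    hit_identities_per_position: list of length L (the oligonucleotide length);
    element i holds the percent identities (0-100) of the BLAST hits that
    cover position i. An empty list means no hit covers that position.
    """
    length = len(hit_identities_per_position)
    if length == 0:
        raise ValueError("oligonucleotide must have at least one position")
    # Assumption: only the strongest hit at each position contributes.
    strongest_sum = sum(max(hits, default=0.0) for hits in hit_identities_per_position)
    return 1.0 - strongest_sum / (100.0 * length)


# A 50-mer with a perfect full-length hit, with no hits, and with a partial hit.
perfect = homology_score([[100.0]] * 50)
no_hits = homology_score([[]] * 50)
partial = homology_score([[80.0, 60.0]] * 25 + [[]] * 25)
```

Taking the per-position maximum means that many weak hits cannot add up to a worse score than a single strong one, which is a design choice of this sketch rather than something the slide states.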

13 ΔTm
For oligonucleotides to discriminate between targets, the hybridization and washing conditions need to be optimal
All oligonucleotides should perform well under similar hybridization conditions
The melting temperature (Tm) of the DNA:DNA duplex is a good description of an oligonucleotide's hybridization properties
Aim for a minimal difference between the Tm values of the oligonucleotides

14 ΔTm
OligoWiz uses a nearest-neighbor model for Tm estimation:
–ΔH is the enthalpy
–ΔS is the entropy change of the nucleation reaction
–A is a constant correcting for helix initiation (-10.8)
–R is the universal gas constant (1.99 cal K^-1 mol^-1)
–Ct is the total molar concentration of strands
Since the total molar concentration of strands is unknown for most microarray experiments, OligoWiz uses a constant of 2.5x M
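The Tm equation was an image on the slide and is missing from this transcript. As a hedged sketch, the commonly used nearest-neighbor form Tm = ΔH / (ΔS + A + R ln(Ct/4)) - 273.15 can be evaluated as below. The division of Ct by 4, the units (cal/mol for ΔH, cal K^-1 mol^-1 for ΔS and A) and the absence of a salt correction are assumptions of this sketch. ΔH and ΔS are taken as inputs because the nearest-neighbor parameter table is not given here, and Ct is left as a required argument because its value is truncated on the slide.

```python
import math

HELIX_INITIATION_A = -10.8  # constant A from the slide (units assumed: cal K^-1 mol^-1)
GAS_CONSTANT_R = 1.99       # cal K^-1 mol^-1, value as given on the slide


def melting_temperature_celsius(delta_h, delta_s, strand_concentration):
    """Estimate Tm in degrees Celsius from summed nearest-neighbor parameters.

    delta_h: enthalpy in cal/mol (negative for duplex formation)
    delta_s: entropy change in cal K^-1 mol^-1 (negative for duplex formation)
    strand_concentration: total molar strand concentration Ct in mol/L
    """
    if strand_concentration <= 0:
        raise ValueError("strand concentration must be positive")
    denominator = (
        delta_s
        + HELIX_INITIATION_A
        + GAS_CONSTANT_R * math.log(strand_concentration / 4.0)
    )
    return delta_h / denominator - 273.15


# Illustrative (made-up) thermodynamic sums for a long oligonucleotide.
tm_high_ct = melting_temperature_celsius(-380000.0, -1000.0, 1e-6)
tm_low_ct = melting_temperature_celsius(-380000.0, -1000.0, 1e-9)
```

With these inputs a lower strand concentration gives a lower estimated Tm, which is the expected behavior of the concentration term.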

15 ΔTm
Based on the Tm estimation, a ΔTm score is calculated
O_Tm is by default the mean Tm of all oligonucleotides of the aim length (user specified) in all input sequences, or a specific user-specified optimal Tm
For each 5′ position along the input sequence, the oligonucleotide length (extending toward the 3′ end) with the best ΔTm score is chosen
Therefore the ΔTm score is the first calculation the OligoWiz server performs
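The length selection described on this slide can be sketched as follows. This is not the OligoWiz implementation: the ΔTm score formula is missing from the transcript, so the sketch simply treats the smallest absolute deviation from the optimal Tm as "best", and it uses the Wallace rule (2 °C per A/T, 4 °C per G/C) as a stand-in Tm estimator so that the example is self-contained. All function names are hypothetical.

```python
def wallace_tm(sequence):
    """Stand-in Tm estimate (Wallace rule); not the nearest-neighbor model."""
    at = sequence.count("A") + sequence.count("T")
    gc = sequence.count("G") + sequence.count("C")
    return 2 * at + 4 * gc


def best_length_per_start(sequence, min_len, max_len, optimal_tm, tm_func=wallace_tm):
    """For each 5' start position, pick the length whose Tm is closest to optimal_tm.

    Oligonucleotides extend from the start position toward the 3' end.
    Returns a dict mapping start index -> chosen length.
    """
    chosen = {}
    for start in range(len(sequence) - min_len + 1):
        best_length, best_deviation = None, None
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > len(sequence):
                break  # cannot extend past the 3' end of the input sequence
            deviation = abs(tm_func(sequence[start:end]) - optimal_tm)
            if best_deviation is None or deviation < best_deviation:
                best_length, best_deviation = length, deviation
        chosen[start] = best_length
    return chosen


lengths = best_length_per_start("GGGGAAAA", min_len=2, max_len=4, optimal_tm=12)
```

In the toy call, GC-rich start positions get shorter oligonucleotides and AT-rich ones get longer oligonucleotides, which is the effect that makes Tm more uniform across the array.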

16 ΔTm
Minimal Tm

17 Position within transcript
The position within the target transcript can be of importance
The reverse transcriptase will fall off the transcript with a certain probability
The further a position is from the starting point, the less signal will be generated

18 Briefings in Bioinformatics. Vol 2. No Dec 2001

19 Position within transcript
If the labeling commences from the 3′ end (poly A tail) the following score is used:
–dp is the probability that the reverse transcriptase will fall off its template at any given base
–Δ3′end is the oligonucleotide distance to the 3′ end of the input sequence
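The score equation is missing from this slide. One natural reading of the two definitions, offered here only as an assumption, is that the score is the probability that the reverse transcriptase is still on its template after Δ3′end bases, that is (1 - dp) raised to the power Δ3′end. The function name and the example value of dp are hypothetical.

```python
def position_score_3prime(distance_to_3prime_end, fall_off_probability):
    """Probability that reverse transcription reaches a given distance from the 3' end.

    distance_to_3prime_end: oligonucleotide distance to the 3' end, in bases
    fall_off_probability: dp, chance of falling off the template at any given base
    """
    if distance_to_3prime_end < 0:
        raise ValueError("distance must be non-negative")
    if not 0.0 <= fall_off_probability <= 1.0:
        raise ValueError("dp must be a probability")
    return (1.0 - fall_off_probability) ** distance_to_3prime_end


score_at_end = position_score_3prime(0, 0.01)
score_100_bases = position_score_3prime(100, 0.01)
score_500_bases = position_score_3prime(500, 0.01)
```

Under this reading the score decays geometrically with distance, so oligonucleotides close to the 3′ end are favored when labeling starts from the poly A tail.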

20 Position within transcript
In cases where the labeling is done with random primers, as is the case for prokaryotic mRNA labeling, the chance of having an oligonucleotide upstream of a given position should be accounted for:
–c is a constant indicating the probability that a random primer will bind at any given position

21 Low-complexity filtering
To avoid oligonucleotides composed of very common sequence fragments, a low-complexity score was implemented
Different sequences are common in different species
–to estimate a low-complexity measure for an oligonucleotide, a list of sequence subfragments and their information content is used
–the list is generated specifically for each species

22 Low-complexity filtering
The information content can be calculated by the following equation:
–n(w) is the number of occurrences of a pattern in the transcriptome
–l(w) is the pattern length
–nt is the total number of patterns found of a given length

23 Low-complexity filtering
OligoWiz uses this list to calculate a low-complexity score for each oligonucleotide:
–L is the length of the oligonucleotide
–wi is the pattern in position i
–norm is a function that normalizes the summed information to a value between 1 and 0

24 Low-complexity filtering
Interpreting the low-complexity score:
–0: an oligonucleotide with very low complexity
–between 1 and 0.8: the range in which the majority of oligonucleotides score
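Neither equation on the low-complexity slides survived in the transcript. The sketch below shows one way the defined quantities could fit together. It rests on several assumptions: the information content of a pattern is taken as f(w) * log2(f(w) * 4^l(w)) with f(w) = n(w)/nt, the score is 1 minus the normalized sum over all pattern positions in the oligonucleotide, and norm is a simple division by a caller-supplied maximum followed by clamping to [0, 1]. The function names and the toy transcriptome are hypothetical.

```python
import math
from collections import Counter


def pattern_information(transcripts, pattern_length):
    """Build the species-specific list: pattern -> assumed information content."""
    counts = Counter()
    for transcript in transcripts:
        for i in range(len(transcript) - pattern_length + 1):
            counts[transcript[i:i + pattern_length]] += 1
    total = sum(counts.values())  # nt: total number of patterns of this length
    information = {}
    for pattern, n in counts.items():
        frequency = n / total
        # Compare the observed frequency with the uniform expectation 1 / 4^l(w).
        information[pattern] = frequency * math.log2(frequency * 4 ** pattern_length)
    return information


def low_complexity_score(oligo, information, pattern_length, max_information):
    """Return a score in [0, 1]; 0 means the oligo consists of very common patterns."""
    summed = sum(
        information.get(oligo[i:i + pattern_length], 0.0)
        for i in range(len(oligo) - pattern_length + 1)
    )
    normalized = min(max(summed / max_information, 0.0), 1.0)  # assumed norm()
    return 1.0 - normalized


# Toy transcriptome in which poly-A stretches are very common.
toy_transcripts = ["AAAAAAAAAAAAACGT", "AAAAAAAAGGCTAGCA"]
info = pattern_information(toy_transcripts, 3)
ceiling = 4 * max(info.values())  # four 3-mers fit in a 6-mer
poly_a_score = low_complexity_score("AAAAAA", info, 3, ceiling)
diverse_score = low_complexity_score("GGCTAG", info, 3, ceiling)
```

In this toy case the poly-A oligonucleotide scores close to 0 and the diverse one close to 1, matching the interpretation given on the slide.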

25 GATC-only score
To allow filtering out sequences that contain ambiguity annotation, OligoWiz has a score called "GATC-only"
Oligonucleotides containing
–R, Y, M, K, X, S, W, H, B, V, D, N or anything else will be given a score of 0
–only G, A, T and C will be assigned a score of 1

26 GATC-only score
Ambiguity codes besides A, C, T and G:
–two bases: M = AC, R = AG, W = AT, S = CG, Y = CT, K = GT
–three bases: V = ACG, H = ACT, D = AGT, B = CGT
–any base: X = AGCT
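The GATC-only rule is simple enough to express directly. The following is a minimal sketch of the rule as stated on the slides, not OligoWiz's own code. Accepting lowercase input and scoring an empty sequence as 0 are assumptions of this sketch, and the function name is hypothetical.

```python
def gatc_only_score(oligo):
    """Return 1 if the oligonucleotide contains only G, A, T and C, otherwise 0."""
    if not oligo:
        return 0  # assumption: an empty sequence is not a usable oligonucleotide
    return 1 if set(oligo.upper()) <= set("GATC") else 0


clean = gatc_only_score("GATTACA")
ambiguous = gatc_only_score("GATNACA")
lowercase = gatc_only_score("gattaca")
```

Because the check is "anything outside G, A, T, C scores 0", there is no need to enumerate the ambiguity codes explicitly.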

27 Result
Test case: the 6600 genes annotated in the Saccharomyces cerevisiae genome
Oligonucleotide length interval: 45–55 bp
The homology search and complexity score were based on whole-genome databases
The mean Tm of the oligonucleotides was 75.7 °C
The calculations were done in just 20 min

28

29 Score parameter/info

30

31
1. Graphs represent scores (y-axis) along the input sequence (x-axis)
2. Total (weighted) score
3. Oligonucleotide selected/predicted
4. Sequence of the oligonucleotide selected
5. Score function manipulation interface
6. Sequence info field
7. Input sequence table
8. Total score function manipulation interface
9. Applies score weights of the selected entry to all the entries
10. Predicted/custom button
11. W-score is the total weighted score for the selected oligonucleotide
12. "Oligos" per entry field

32

33

Thank you!