3/24/2005 TIGP 1 Bioinformatics for Microarray Studies at IBS Pei-Ing Hwang, Ph.D. Mar. 24, 2005.



TIGP 2 3/24/2005 Different aspects of life science research
- genomics
- transcriptomics
- proteomics

TIGP 3 3/24/2005 Building blocks for DNA or RNA
- DNA: A, T, G, C
- RNA: A, U, G, C

TIGP 4 3/24/2005 DNA: deoxyribonucleic acid
- Double stranded
- Antiparallel

TIGP 5 3/24/2005 Why microarray?
- Gene Expression
  - To simultaneously study multiple genes
  - To obtain an overview of gene expression at the transcriptional level under specific experimental conditions
  - To study gene interaction networks from the transcriptional aspect
- Genome
  - SNP detection
  - To find recombination sites in the chromosome/genome
  - Hopefully to discover the gene responsible for a genetic disease

TIGP 6 3/24/2005 Outline
- Introduction to Microarray experiments
- Experiences at IBS for the cDNA arrays
  - Data generated with microarray
  - DNA annotation
  - Data Analysis
  - Data Management

TIGP 7 3/24/2005 About Microarray Technology-1
- Up to hundreds of thousands of spots in a fixed area on a glass slide or a membrane
- One species of DNA molecule per spot
- A spot is also called a “feature”
- DNA fixed on the chip or membrane is also called a “probe”
- The sequence and/or function of each DNA species on a spot is known

TIGP 8 3/24/2005 About Microarray Technology-2
- Making use of the “hybridization method”
  - A : T, U
  - G : C
- Image processing
- Data analysis
- Result interpretation from the biological aspect
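The base-pairing rule behind hybridization (A pairs with T or U, G pairs with C) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and real hybridization also tolerates mismatches and depends on temperature and salt conditions, which the sketch ignores.

```python
# Watson-Crick pairing: A pairs with T (U in RNA), G pairs with C.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (antiparallel strand)."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def can_hybridize(probe: str, target: str) -> bool:
    """True if the target is the exact reverse complement of the probe.

    Assumption for illustration: only perfect duplexes count.
    RNA targets are handled by reading U as T.
    """
    return reverse_complement(probe) == target.upper().replace("U", "T")
```

For example, `can_hybridize("AGCCT", "AGGCU")` compares the RNA target against the reverse complement of the probe.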

TIGP 9 3/24/2005 Types of Microarray
- Types of DNA immobilized on the solid support
  - cDNA vs. oligonucleotides
- Manufacturing methods
  - Printing vs. photolithography
- Solid support
  - Glass slides
  - Membrane
- Nucleotide labeling (slide scanning condition)
  - One color vs. two colors

TIGP 10 3/24/2005 GeneChip® Array Manufacturing
Figure 1. Affymetrix uses a unique combination of photolithography and combinatorial chemistry to manufacture GeneChip® Arrays.

TIGP 11 3/24/2005 Microarray printing machine

TIGP 12 3/24/2005 Procedure for one-channel array

TIGP 13 3/24/2005 Experimental Procedure for 2-channel Microarray

TIGP 14 3/24/2005 Data Analyses
- Feature intensity acquisition
  - Image analyses
- To identify differentially expressed genes
  - Normalization (global, local, print-tip, between-array, etc.)
  - Clustering or classification
- Analyses from the biological aspect
  - Significant genes
  - Transcriptional regulation study
  - Cellular pathway or network finding
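Of the normalization methods listed, global normalization is the simplest to illustrate. The sketch below assumes a two-channel array with background-corrected, strictly positive red and green intensities, and centres the log2 ratios on their median. The function name and the choice of the median (rather than, say, the mean) are assumptions for illustration, not the procedure used at IBS.

```python
import math
from statistics import median


def global_median_normalize(red, green):
    """Return log2(red/green) ratios shifted so that their median is 0.

    Assumes most genes are not differentially expressed, so the bulk
    of the log ratios should sit around zero after normalization.
    """
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    shift = median(log_ratios)
    return [m - shift for m in log_ratios]
```

Local and print-tip normalization apply the same idea within subsets of spots (for example, per print-tip group) instead of across the whole array.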

3/24/2005 TIGP 15 Experiences at IBS for the cDNA arrays

TIGP 16 3/24/2005 About IBS tomato arrays
- ~13000 spots/features per chip
- 1 clone per spot
- cDNA clones from about a dozen different cDNA libraries
- At least two different protocols were followed and six different vectors were used
- More than ten technicians were involved

TIGP 17 3/24/2005 Bioinformatics for Microarray at IBS (cont’d)
- IBS tomato EST database construction
- Installation, management and maintenance of data analysis software
- Reference information searching
- Batch submission of EST sequences

TIGP 18 3/24/2005 Bioinformatics Needs for Microarray Studies at IBS
- Pre-arraying data management
  - cDNA info collection, vector trimming, sequence annotation, EST submission, etc.
- Array information management
  - Gene set characterization, data storage, data retrieval
- Post-hybridization data analysis and management
  - Array data analyses, storage of the scanning results, biology-oriented bioinformatics analyses

TIGP 19 3/24/2005 Bioinformatics Service Work for Microarray studies at IBS
- Data pre-processing for the cDNAs
  - Clone id assignment
  - Sequence trimming
  - Gene annotation
  - Function classification
- Data sheet preparation for commercial software to analyze microarray data
  - GAL file preparation for GenePix Pro
  - Master Gene List preparation for GeneSpring

TIGP 20 3/24/2005 [Workflow diagram; labels only]
- cDNA clones: sequencing, PCR
- Vector trimming, assembly, function annotation, database
- GenePix: feature intensities, normalization
- Spotfire, GeneSpring: data analysis (normalization, variance, clustering)
- Biological meaning: pathway analysis, transcription network, gene-gene interaction

TIGP 21 3/24/2005 Pre-array Bioinformatics
Clones from labs -> sequencing -> raw EST sequences
Data Processing and Management:
1. Clone id generation
2. Vector trimming
3. Sequence assembly
4. Sequence annotation (BLAST)
5. EST submission to NCBI
6. Database construction

TIGP 22 3/24/2005 Clone id generation
- Data centralization following sequencing
- Rules for re-arraying
  - 96 well plate to/from 384 well
  - PCR from 96 well and spotting from 384 well
  - Order of A1, A2, B1, B2

TIGP 23 3/24/2005 [Diagram; labels only: cDNA clones, sequencing, PCR; plate formats: 96 or 384 well, 96 well, 384 well]

TIGP 24 3/24/2005 96 well to 384 well plates
[Diagram; quadrant labels: A1, B2, A2, B1]
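The re-arraying rule can be sketched as a coordinate mapping. The sketch assumes the common interleaved layout, in which four 96-well plates fill the four quadrant positions A1, A2, B1 and B2 of one 384-well plate; the exact convention used at IBS is not stated on the slides, so the offsets and the function name below are assumptions.

```python
import string

# Assumed interleaved layout: each quadrant is named after the 384-well
# position that receives well A1 of the source 96-well plate.
QUADRANT_OFFSETS = {"A1": (0, 0), "A2": (0, 1), "B1": (1, 0), "B2": (1, 1)}


def well_96_to_384(well: str, quadrant: str) -> str:
    """Map a 96-well position (A1..H12) to its 384-well position (A1..P24)."""
    row = string.ascii_uppercase.index(well[0].upper())  # A -> 0 ... H -> 7
    col = int(well[1:]) - 1                              # 1 -> 0 ... 12 -> 11
    if not (0 <= row < 8 and 0 <= col < 12):
        raise ValueError(f"not a 96-well position: {well}")
    d_row, d_col = QUADRANT_OFFSETS[quadrant]
    new_row = string.ascii_uppercase[2 * row + d_row]
    new_col = 2 * col + d_col + 1
    return f"{new_row}{new_col}"
```

Keeping this mapping in one function makes it easier to trace a spot on the array back to the 96-well plate used for PCR, which is what clone id assignment depends on.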

TIGP 25 3/24/2005 Data collection
- Raw sequencing data obtained from the sequencing company
- ABI and text files organized and stored by lab and by date
- Clone information confirmed with each sequence contributor
- Clone ids matched with raw sequences

TIGP 26 3/24/2005 Processing the sequencing data
- cDNA library procedures confirmed with each lab
- Vector/linker/primer trimming (Seqclean)
- Function annotation
  - BLAST against different databases
  - Gene Ontology annotation
- Sequence assembly (Phrap)

TIGP 27 3/24/2005 Procedure to generate cDNA clones

TIGP 28 3/24/2005 IBS tomato EST Database
- Cloning information
- Sequencing data
- Vector/adaptor trimming information
- EST assembly
- Function annotation
- Cross reference

3/24/2005 TIGP 29 The Tomato Database Entity-Relationship model
- ID MAP: 1. Seq id, 2. Clone_id, 3. Contig id, 4. Lab_id#1, 5. Lab_id#2, 6. NCBI_sbmt_id93, 7. NCBI_sbmt_id94, 8. dbEST_accn_no, 9. note
- Untrimmed Sequence: 1. Seq id, 2. Trimmed Sequence
- Trimmed Sequence: 1. Seq id, 2. Trimmed Sequence, 3. Method, 4. Trim set
- Assembly Information: 1. Contig_id, 2. Contig Sequence, 3. BLAST Result, 4. Position, 5. Component seq id
- TAIR Result: 1. Seq id, 2. At number, 3. E-Value, 4. Description, 5. Identity, 6. Other result
- NCBI BLAST Result: 1. Seq id, 2. NCBI_id, 3. E-Value, 4. Description, 5. Identity, 6. Other result
- TIGR Result: 1. Seq id, 2. TC number, 3. E-Value, 4. Description, 5. Identity, 6. Other result
- Lab info: 1. Seq id, 2. Comment, 3. Primer, 4. Biotech, 5. Sender, 6. Collect From
- cDNA Library Information: 1. Clone_id(3)(4), 2. Name, 3. Date made, 4. Developmental stage, 5. Cloning sites, 6. Description, 7. Library, 8. Host, 9. Species, 10. Vector, 11. Antibiotic, 12. Authors, 13. Tissue, 14. Primer
- Gene Ontology: 1. TC number, 2. EC number, 3. Process (GO_id, Description), 4. Function (GO_id, Description), 5. Component (GO_id, Description)
[Diagram labels: TC number, Clone_id, Seq_id, TOM 3, TOM 4; cardinality marks n, 1, 1, n]
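A small part of this entity-relationship model can be sketched with Python's built-in sqlite3 module. The table and column names follow the slide, but the column types, the foreign keys, and the choice of SQLite are assumptions made for illustration; the slides do not say which database engine held the EST database.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with three of the entities on the slide."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE id_map (
            seq_id    TEXT PRIMARY KEY,
            clone_id  TEXT,
            contig_id TEXT
        );
        CREATE TABLE trimmed_sequence (
            seq_id           TEXT REFERENCES id_map(seq_id),
            trimmed_sequence TEXT,
            method           TEXT,
            trim_set         TEXT
        );
        CREATE TABLE ncbi_blast_result (
            seq_id      TEXT REFERENCES id_map(seq_id),
            ncbi_id     TEXT,
            e_value     REAL,
            description TEXT,
            identity    REAL
        );
        """
    )
    return con


def annotation_for_clone(con: sqlite3.Connection, clone_id: str):
    """Return (seq_id, ncbi_id, description) rows for a clone, best E-value first."""
    return con.execute(
        """
        SELECT m.seq_id, b.ncbi_id, b.description
        FROM id_map AS m
        JOIN ncbi_blast_result AS b ON b.seq_id = m.seq_id
        WHERE m.clone_id = ?
        ORDER BY b.e_value
        """,
        (clone_id,),
    ).fetchall()
```

The point of routing every table through Seq id and Clone_id is visible in the query: annotation for a spotted clone is a single join away from the ID MAP.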

TIGP 30 3/24/2005 Information to be further analyzed
- Gene set characterization
  - Number of unique genes on the array
  - Number of known/unknown genes
  - Coordinates of each spotted sequence
- Statistics about spotted cDNA
  - grouped by function/pathway
  - grouped by sequence similarity
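Gene set characterization can be sketched as a simple count over the spotted clones. The sketch makes two assumptions that the slides do not state: a contig id stands in for a gene (singletons fall back to their clone id), and a gene counts as "known" when at least one of its clones has a non-empty BLAST description. The dictionary keys are hypothetical.

```python
def characterize_gene_set(spots):
    """Count spots, unique genes, and known/unknown genes on an array.

    spots: iterable of dicts with the hypothetical keys
      'clone_id', 'contig_id' (None for singletons) and
      'description' (empty string when no BLAST hit was found).
    """
    total = 0
    unique = set()
    known = set()
    for spot in spots:
        total += 1
        # Assumption: one contig approximates one gene.
        gene_key = spot["contig_id"] or spot["clone_id"]
        unique.add(gene_key)
        if spot["description"]:
            known.add(gene_key)
    return {
        "spots": total,
        "unique": len(unique),
        "known": len(known),
        "unknown": len(unique) - len(known),
    }
```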

3/24/2005 TIGP 31 Post-hybridization data analysis and management

TIGP 32 3/24/2005 Post-hybridization data analysis
- Software for Microarray Analysis at IBS
  - GenePix Pro 5.0: image processing
  - GeneSpring: microarray data analysis
  - Spotfire: microarray data analysis and data storage
  - TransPath: pathway searching

TIGP 33 3/24/2005 Image Processing
- GenePix Pro 5.0
- GAL (GenePix Array List) file

TIGP 34 3/24/2005 From multi-well plate to microarray

TIGP 35 3/24/2005 GAL online

TIGP 36 3/24/2005 GeneSpring at IBS
- for microarray data analyses
- standalone software
- provides statistical methods for data analysis
- some bioinformatics functions
- provides visualization
- licensed annually
- rigid format requirement for input data
- requires installation of a master gene list (master table) prior to data analysis

TIGP 37 3/24/2005 Master table for GeneSpring
- The master table contains the following information:
  - Id
  - Source of DNA
  - Gene name
  - Gene function annotation (from BLAST results)
  - GO annotation
- Each array needs its own master table
- The format of the master table may vary between versions of the software

TIGP 38 3/24/2005 To generate master table for GeneSpring
- Batch BLAST against three sequence databases
- Parsing the BLAST results
- Incorporating EC number, GO number and other related data from the best-matching BLAST results
- Integrating all required data from the various files and generating the master table
- Checking
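The parsing and integration steps can be sketched as follows. The sketch assumes BLAST results in the standard 12-column tab-delimited format (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and picks the best hit per query by lowest E-value. The function names, the output columns, and the GO lookup dictionary are hypothetical; the actual GeneSpring master table format is not reproduced here.

```python
def best_hits(blast_tabular: str) -> dict:
    """Keep the best hit per query: lowest E-value, then highest bit score."""
    best = {}
    for line in blast_tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        current = best.get(query)
        if current is None or (evalue, -bitscore) < (current["evalue"], -current["bitscore"]):
            best[query] = {"subject": subject, "evalue": evalue, "bitscore": bitscore}
    return best


def build_master_table(seq_ids, hits, go_by_subject):
    """Merge best hits and GO annotation into one row per spotted sequence."""
    rows = []
    for seq_id in seq_ids:
        hit = hits.get(seq_id)
        subject = hit["subject"] if hit else ""
        rows.append(
            {
                "id": seq_id,
                "best_hit": subject,
                "evalue": hit["evalue"] if hit else None,
                "go": go_by_subject.get(subject, ""),
            }
        )
    return rows
```

Sequences without any hit still get a row with empty annotation, which makes the final checking step easier: every spot on the array should appear exactly once in the table.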

TIGP 39 3/24/2005 Spotfire
- for microarray data analyses
- server-client software
- linked to an Oracle database for data storage
- provides various statistical methods for data analysis
- can establish links to more bioinformatics tools
- can record the analysis procedure
- more flexible format requirement for input data

TIGP 40 3/24/2005 One color array for Arabidopsis
- Affymetrix ATH1 chip
- Annotation information provided by the company and available on the internet

TIGP 41 3/24/2005 Bioinformatics support at Affymetrix

TIGP 42 3/24/2005 Projects for now and the near future
- Infrastructure build-up
  - Microarray data management system
  - Platform for bioinformatics analyses
- Plant Signaling Pathway Database

TIGP 43 3/24/2005 Team

3/24/2005 TIGP 44 Thank you!