Introduction to DNA Microarray Technology Steen Knudsen Uma Chandran.

MBG404 Overview Data Generation Processing Storage Mining Pipelining Microarray

DNA Microarrays consist of million DNA probes attached to a surface of 1 cm by 1 cm (chip). By hybridisation, they can detect DNA or RNA. If the hybridised DNA or RNA is fluorescently labelled, it can be quantified by scanning the chip.

DNA microarrays can be manufactured by:
–Photolithography (Affymetrix, Febit, NimbleGen)
–Inkjet (Agilent, Canon)
–Robot spotting (many providers)

Affymetrix photolithography
–Each probe 25 bp long
–Multiple probes per gene
–Perfect Match (PM) as well as MisMatch (MM) probes

Febit/NimbleGen photolithography

Robot Spotting

InkJet (HP/Canon) technology

Summary

Image Analysis
1. Gridding: identify spots (automatic, semiautomatic, manual)
2. Segmentation: separate spots from background. Fixed circle (B), Adaptive circle (C), Adaptive shape (D), Histogram
3. Intensity extraction: mean or median of pixels in spot
4. Background correction: local or global
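Steps 2 to 4 can be illustrated with a minimal sketch. It assumes a fixed-circle segmentation, a median for intensity extraction, and a local background taken from a ring around the spot. The function name `spot_intensity`, the ring width, and the toy image are hypothetical choices for illustration, not part of any scanner software.

```python
from statistics import median

def spot_intensity(image, cx, cy, radius):
    """Fixed-circle segmentation with local background correction.

    image  : 2D list of pixel intensities (list of rows)
    cx, cy : spot centre (column, row), assumed known from gridding
    radius : fixed spot radius in pixels
    """
    spot, background = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            if d2 <= radius ** 2:
                spot.append(value)          # pixel inside the fixed circle
            elif d2 <= (2 * radius) ** 2:
                background.append(value)    # ring around the spot (assumed width)
    foreground = median(spot)               # intensity extraction
    local_bg = median(background) if background else 0
    return max(foreground - local_bg, 0)    # local background correction

# Hypothetical 5x5 image: a bright spot on a background of 10.
image = [
    [10, 10, 10, 10, 10],
    [10, 10, 90, 10, 10],
    [10, 90, 100, 90, 10],
    [10, 10, 90, 10, 10],
    [10, 10, 10, 10, 10],
]
signal = spot_intensity(image, cx=2, cy=2, radius=1)
```

A global background correction would instead subtract one background estimate shared by all spots on the chip.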

Microarray analysis – Data Preprocessing
Objective
–Convert image of thousands of signals to a signal value for each gene or probe set
Multiple steps
–Image analysis
–Background and noise subtraction
–Normalization
–Expression value for a gene or probe set
Image analysis and background/noise subtraction are usually done by proprietary software
Gene 1: 100; Gene 2: 150; Gene 3: 75; ...

Normalization
Corrects for variation in hybridization etc.
Assumes no global change in gene expression
Without normalization
–Intensity value for a gene will be lower on Chip B
–Many genes will appear to be downregulated when in reality they are not
(Table: per-gene intensities, Treated vs. Control)
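One simple way to act on the "no global change" assumption is to rescale every chip to a common median. The sketch below is only one of several possible normalization schemes. The function `median_scale` and the intensity values are hypothetical.

```python
from statistics import median

def median_scale(chip, target):
    """Scale all intensities on a chip so that its median equals `target`."""
    factor = target / median(chip)
    return [value * factor for value in chip]

# Hypothetical intensities for the same four genes on two chips.
# Chip B hybridized half as strongly, so every gene looks "downregulated".
chip_a = [100.0, 150.0, 75.0, 200.0]
chip_b = [50.0, 75.0, 37.5, 100.0]

target = median(chip_a)
chip_b_norm = median_scale(chip_b, target)
```

After scaling, the apparent two-fold drop on Chip B disappears, which is the behaviour the slide describes.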

Data Analysis
Part 2 – Data analysis
–Class discovery
–Class comparison
–Class prediction
–Biological annotation
–Pathway analysis

Class Discovery
Objective?
–Can data tell us which classes are similar?
–Are there subgroups?
–Do T-ALL, T-LL, B-ALL fall into distinct groups?
Methods
–Hierarchical clustering
–K-means, SOM etc.
–These are unsupervised methods
Class IDs are not known to the algorithm
–For example, it does not know which sample is cancer or non-cancer
–Do the expression values differentiate the samples? Does the algorithm discover new classes?

Hierarchical Clustering
Eisen Cluster and Treeview
Import data
Filter
–To filter or not to filter, %P calls, SD etc.
Accept filter
Adjust data
–Log transform (important), center, normalize
Clustering
–Cluster arrays or genes
–Gene clustering is computationally intensive
–Choose distance metric
.cdt file created
–Open with Treeview
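The "Adjust data" step can be sketched outside the Cluster program as follows. This is an illustration only, assuming a base-2 log transform followed by mean-centering of each gene. The helper names and the expression values are hypothetical, and Cluster's own options may differ.

```python
import math

def log2_transform(row):
    """Log-transform raw intensities so that fold changes become symmetric."""
    return [math.log2(value) for value in row]

def mean_center(row):
    """Subtract the row mean so that the gene's profile is centred on zero."""
    mean = sum(row) / len(row)
    return [value - mean for value in row]

# Hypothetical raw expression values of one gene across three arrays.
raw = [64.0, 128.0, 256.0]
adjusted = mean_center(log2_transform(raw))
```

On the log scale a two-fold increase and a two-fold decrease sit at the same distance from zero, which is why the log transform is marked as important.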

Experimental Design – Very important!!!
Sample size
–How many samples in test and control
  –Will depend on many factors, such as whether tissue culture or tissue sample
  –Power analysis
Replicates
–Technical vs. biological
  –Biological replicates are more important for more heterogeneous samples
  –Need replicates for statistical analysis
To pool or not to pool
–Depends on objective
Sample acquisition or extraction
–Laser-captured or gross-dissected
All experimental steps from sample acquisition to hybridization
–Microarray experiments are very expensive, so plan experiments carefully

Venn Diagram

Conclusion
Other analysis
–Class prediction
–Gene list from class comparison can be used in pathway analysis
–HSLS pathway workshops on Ingenuity, DAVID, Pathway Architect
–Future: integrate expression data with other data such as SNP or microRNA
GEO has some data analysis features

End Theory I 5 min Mindmapping 10 min Break

Theory II

Microarrays
Gene Expression:
–We see differences between cells because of differential gene expression,
–A gene is expressed by transcribing DNA into single-stranded mRNA,
–mRNA is later translated into a protein,
–Microarrays measure the level of mRNA expression

Microarrays
Gene Expression:
–mRNA expression represents dynamic aspects of the cell,
–mRNA is isolated and labeled using a fluorescent material,
–mRNA is hybridized to the target; the level of hybridization corresponds to light emission, which is measured with a laser

Microarrays

Processing Microarray Data
Differentiating gene expression:
–R = G → not differentiated
–R > G → up-regulated
–R < G → down-regulated
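The rule above can be written as a small function. The sketch assumes that R and G are the red and green channel intensities of one spot. Because measured intensities are rarely exactly equal, it adds a tolerance on the log2 ratio. The default of a two-fold change is an assumption and not something stated on the slide; setting `threshold=0` reproduces the slide's rule exactly.

```python
import math

def call_regulation(red, green, threshold=1.0):
    """Classify a spot from its red (R) and green (G) intensities.

    threshold is the required absolute log2(R/G); 1.0 means two-fold
    (an assumed default, not taken from the slides).
    """
    log_ratio = math.log2(red / green)
    if log_ratio > threshold:
        return "up-regulated"
    if log_ratio < -threshold:
        return "down-regulated"
    return "not differentiated"

# Hypothetical spots.
calls = [
    call_regulation(400.0, 100.0),   # R is four times G
    call_regulation(100.0, 400.0),   # G is four times R
    call_regulation(100.0, 100.0),   # R equals G
]
```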

Processing Microarray Data
Problems:
–Extract data from microarrays,
–Analyze the meaning of the multiple arrays.

Processing Microarray Data Microarray data:

Processing Microarray Data
Clustering:
–Find classes in the data,
–Identify new classes,
–Identify gene correlations,
–Methods: K-means clustering, Hierarchical clustering, Self-Organizing Maps (SOM)

Processing Microarray Data
Distance Measures:
–Euclidean Distance: $d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}$
–Manhattan Distance: $d(x, y) = \sum_{i=1}^{m} |x_i - y_i|$
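Both distance measures are short to compute for two expression profiles of equal length. The sketch below uses the standard definitions. The profile values are hypothetical.

```python
import math

def euclidean(x, y):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of the absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

# Hypothetical expression profiles of two genes across three experiments.
gene_i = [1.0, 2.0, 3.0]
gene_j = [4.0, 6.0, 3.0]

d_euclid = euclidean(gene_i, gene_j)
d_manhattan = manhattan(gene_i, gene_j)
```

Euclidean distance weights large single differences more heavily, while Manhattan distance treats every unit of difference equally.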

Processing Microarray Data
K-means Clustering:
–Break the data into K clusters,
–Start with random partitioning,
–Improve it by iterating.
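The three steps can be sketched in plain Python. This is a minimal illustration, assuming squared Euclidean distance and a fixed random seed for reproducibility. The function name, the handling of empty clusters, and the data points are hypothetical choices.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Return one cluster label per point."""
    rng = random.Random(seed)
    # Start with a random partitioning: every point gets a random cluster.
    labels = [rng.randrange(k) for _ in points]
    for _ in range(iterations):
        # Compute the centroid (mean profile) of each cluster.
        centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if not members:
                members = [rng.choice(points)]   # re-seed an empty cluster
            centroids.append([sum(col) / len(members) for col in zip(*members)])
        # Improve the partitioning: move each point to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:                 # converged, nothing moved
            break
        labels = new_labels
    return labels

# Hypothetical data: two well-separated groups of three profiles each.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels = kmeans(points, k=2)
```

The label numbers themselves are arbitrary. What matters is which points share a label.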

Processing Microarray Data Agglomerative Hierarchical Clustering:
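Agglomerative clustering starts with every item in its own cluster and repeatedly merges the two closest clusters. The sketch below assumes single linkage (distance between the closest pair of members) and Euclidean distance. Neither choice is specified on the slide, and the data are hypothetical.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def agglomerative(points, n_clusters):
    """Single-linkage clustering; returns clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]   # every point starts alone
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance of the closest pair of members.
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]    # merge the closest pair
        del clusters[b]
    return clusters

# Hypothetical expression profiles of five genes.
profiles = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0]]
result = agglomerative(profiles, n_clusters=3)
```

Recording the order and distance of each merge, instead of stopping at a fixed number of clusters, would give the tree that is usually drawn as a dendrogram.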

Processing Microarray Data
Self-Organizing Feature Maps:
–by Teuvo Kohonen,
–a data visualization technique which helps to understand high-dimensional data by reducing the dimensions of the data to a map.

Processing Microarray Data
Self-Organizing Feature Maps:
–humans simply cannot visualize high-dimensional data as is,
–SOMs help us understand this high-dimensional data.

Processing Microarray Data
Self-Organizing Feature Maps:
–Based on competitive learning,
–SOMs help us by producing a map of usually 1 or 2 dimensions,
–SOMs plot the similarities of the data by grouping similar data items together.

Processing Microarray Data Self-Organizing Feature Maps:

Processing Microarray Data
Self-Organizing Feature Maps:
–Input vector: $x = [x_1, x_2, \ldots, x_m]^T$
–Synaptic weight vector: $w_j = [w_{j1}, w_{j2}, \ldots, w_{jm}]^T$, $j = 1, 2, \ldots, l$
–Best-matching (winning) neuron: $i(x) = \arg\min_j \|x - w_j\|$, $j = 1, 2, \ldots, l$
–The weights $w_i$ are updated.
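One training step can be sketched for a 1-dimensional map. The slide does not give the update rule, so the code assumes the commonly used form in which the winner and its map neighbours move a fraction of the way towards the input. The learning rate, the step-shaped neighbourhood, and all values are assumptions for illustration.

```python
def best_matching_unit(x, weights):
    """Index i(x) of the neuron whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((xk - wk) ** 2 for xk, wk in zip(x, weights[j])),
    )

def som_step(x, weights, learning_rate=0.5, radius=1):
    """Update the weights in place for one input vector; return the winner."""
    i = best_matching_unit(x, weights)
    for j, w in enumerate(weights):
        if abs(j - i) <= radius:   # neuron lies within the neighbourhood on the map
            # Assumed rule: move the weight vector towards the input.
            weights[j] = [wk + learning_rate * (xk - wk) for wk, xk in zip(w, x)]
    return i

# Hypothetical map with l = 3 neurons and inputs of dimension m = 2.
weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
x = [1.0, 2.0]
winner = som_step(x, weights, learning_rate=0.5, radius=0)
```

In practice the learning rate and the neighbourhood radius are usually decreased over the course of training, so that the map first orders itself globally and then fine-tunes locally.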

Figure 2. Output map containing the distributions of genes from the alpha30 database. Chavez-Alvarez R, Chavoya A, Mendez-Vazquez A (2014) Discovery of Possible Gene Relationships through the Application of Self-Organizing Maps to DNA Microarray Databases. PLoS ONE 9(4): e doi: /journal.pone

Figure 5. Color-coded output maps representing the final weight of neurons from the samples of the alpha30 database. Chavez-Alvarez R, Chavoya A, Mendez-Vazquez A (2014) Discovery of Possible Gene Relationships through the Application of Self-Organizing Maps to DNA Microarray Databases. PLoS ONE 9(4): e doi: /journal.pone

End Theory II 5 min mindmapping 10 min break

Practice I

Microarray
http:// visprog/dicty/dictyExample.htm
Use the data as described on the page