OHRI Bioinformatics Introduction to the Significance Analysis of Microarrays application Stem.

Slides:



Advertisements
Similar presentations
Original Figures for "Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring"
Advertisements

From the homework: Distribution of DNA fragments generated by Micrococcal nuclease digestion mean(nucs) = bp median(nucs) = 110 bp sd(nucs+ = 17.3.
Genomic Profiles of Brain Tissue in Humans and Chimpanzees II Naomi Altman Oct 06.
Microarray technology and analysis of gene expression data Hillevi Lindroos.
Clustering short time series gene expression data Jason Ernst, Gerard J. Nau and Ziv Bar-Joseph BIOINFORMATICS, vol
6/10/20151 Microarray Data Analysis. 6/10/20152 Copyright notice Many of the images in this power point presentation of other people. The Copyright belong.
Detecting Differentially Expressed Genes Pengyu Hong 09/13/2005.
A Statistical Framework for the Design of Microarray Experiments and Effective Detection of Differential Gene Expression by Shu-Dong Zhang, Timothy W.
PSY 307 – Statistics for the Behavioral Sciences
Microarray Data Preprocessing and Clustering Analysis
Gene Expression Data Analyses (3)
Differentially expressed genes
Statistical Analysis of Microarray Data
Identification of spatial biases in Affymetrix oligonucleotide microarrays Jose Manuel Arteaga-Salas, Graham J. G. Upton, William B. Langdon and Andrew.
Bioinformatics Challenge  Learning in very high dimensions with very few samples  Acute leukemia dataset: 7129 # of gene vs. 72 samples  Colon cancer.
1 Test of significance for small samples Javier Cabrera.
ViaLogy Lien Chung Jim Breaux, Ph.D. SoCalBSI 2004 “ Improvements to Microarray Analytical Methods and Development of Differential Expression Toolkit ”
Significance Tests P-values and Q-values. Outline Statistical significance in multiple testing Statistical significance in multiple testing Empirical.
\department of mathematics and computer science Supervised microarray data analysis Mark van de Wiel.
Different Expression Multiple Hypothesis Testing STAT115 Spring 2012.
Microarray Data Analysis Illumina Gene Expression Data Analysis Yun Lian.
Differential Expression and False Discovery Rate Revisiting the Princeton Stem Cell Data GPBA Workshop Oct. 15, 2003.
The following slides have been adapted from to be presented at the Follow-up course on Microarray Data Analysis.
Annex I: Methods & Tools prepared by some members of the ICH Q9 EWG for example only; not an official policy/guidance July 2006, slide 1 ICH Q9 QUALITY.
Proteomics Informatics – Data Analysis and Visualization (Week 13)
Multiple testing in high- throughput biology Petter Mostad.
(2) Ratio statistics of gene expression levels and applications to microarray data analysis Bioinformatics, Vol. 18, no. 9, 2002 Yidong Chen, Vishnu Kamat,
DNA microarray technology allows an individual to rapidly and quantitatively measure the expression levels of thousands of genes in a biological sample.
Essential Statistics in Biology: Getting the Numbers Right
1 Use of the Half-Normal Probability Plot to Identify Significant Effects for Microarray Data C. F. Jeff Wu University of Michigan (joint work with G.
Introduction to Statistical Quality Control, 4th Edition
Significance analysis of microarrays (SAM) SAM can be used to pick out significant genes based on differential expression between sets of samples. Currently.
CSCE555 Bioinformatics Lecture 16 Identifying Differentially Expressed Genes from microarray data Meeting: MW 4:00PM-5:15PM SWGN2A21 Instructor: Dr. Jianjun.
Differential Gene Expression Dennis Kostka, Christine Steinhoff Slides adapted from Rainer Spang.
First approach - repeating a simple analysis for each gene separately - 30k times Assume we have two experimental conditions (j=1,2) We measure.
Carlo Colantuoni – Summer Inst. Of Epidemiology and Biostatistics, 2009: Gene Expression Data Analysis 8:30am-12:00pm in Room W2017.
Bioinformatics Expression profiling and functional genomics Part II: Differential expression Ad 27/11/2006.
A A R H U S U N I V E R S I T E T Faculty of Agricultural Sciences Introduction to analysis of microarray data David Edwards.
Chapter 14: Inference about the Model. Confidence Intervals for the Regression Slope (p. 788) If we repeated our sampling and computed another model,
Lawrence Hunter, Ph.D. Director, Computational Bioscience Program University of Colorado School of Medicine
Introduction to Statistical Analysis of Gene Expression Data Feng Hong Beespace meeting April 20, 2005.
Limits to Statistical Theory Bootstrap analysis ESM April 2006.
CHEMISTRY ANALYTICAL CHEMISTRY Fall Lecture 6.
Application of Class Discovery and Class Prediction Methods to Microarray Data Kellie J. Archer, Ph.D. Assistant Professor Department of Biostatistics.
Suppose we have T genes which we measured under two experimental conditions (Ctl and Nic) in n replicated experiments t i * and p i are the t-statistic.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop RNA-Seq visualization with cummeRbund.
Statistical Analysis of Microarray Data By H. Bjørn Nielsen.
Extracting binary signals from microarray time-course data Debashis Sahoo 1, David L. Dill 2, Rob Tibshirani 3 and Sylvia K. Plevritis 4 1 Department of.
Cluster validation Integration ICES Bioinformatics.
Microarray analysis Quantitation of Gene Expression Expression Data to Networks BIO520 BioinformaticsJim Lund Reading: Ch 16.
For a specific gene x ij = i th measurement under condition j, i=1,…,6; j=1,2 Is a Specific Gene Differentially Expressed Differential expression.
Comp. Genomics Recitation 10 4/7/09 Differential expression detection.
Molecular Classification of Cancer Class Discovery and Class Prediction by Gene Expression Monitoring.
The Broad Institute of MIT and Harvard Differential Analysis.
1 Significance analysis of Microarrays (SAM) Applied to the ionizing radiation response Tusher, Tibshirani, Chu (2001) Dafna Shahaf.
Microarray Data Analysis The Bioinformatics side of the bench.
Variability & Statistical Analysis of Microarray Data GCAT – Georgetown July 2004 Jo Hardin Pomona College
Distinguishing active from non active genes: Main principle: DNA hybridization -DNA hybridizes due to base pairing using H-bonds -A/T and C/G and A/U possible.
Statistical Analysis for Expression Experiments Heather Adams BeeSpace Doctoral Forum Thursday May 21, 2009.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray.
Micro array Data Analysis. Differential Gene Expression Analysis The Experiment Micro-array experiment measures gene expression in Rats (>5000 genes).
Additional file 8: Estimation of biological variations
Significance analysis of microarrays (SAM)
Significance Analysis of Microarrays (SAM)
Statistical Analysis Error Bars
Significance Analysis of Microarrays (SAM)
Pejman Mohammadi, Niko Beerenwinkel, Yaakov Benenson  Cell Systems 
Analysis of Microarray Data Using Z Score Transformation
Differential Expression of RNA-Seq Data
Presentation transcript:

OHRI Bioinformatics Introduction to the Significance Analysis of Microarrays application Stem Cell Network Online Microarray Analysis Course Unit Two Fall 2006

OHRI Bioinformatics Introduction to SAM “assigns a score to each gene on the basis of change in gene expression relative to the standard deviation of repeated measurements.” SAM uses permutations of repeated measurements to estimate the False Discovery Rate Paper available online:

OHRI Bioinformatics Overview Calculate “relative difference” – a value that incorporates the change in expression between conditions and the variation of measurements in each condition Calculate “expected relative difference” – derived from controls generated by permutations of data Plot against each other, set cutoff to identify deviating genes Calculate FDR for chosen cutoff from the control permutations

OHRI Bioinformatics Relative Difference Mean expression of gene i in condition I or U

OHRI Bioinformatics Relative Difference Gene-specific scatter

OHRI Bioinformatics Gene-specific scatter

OHRI Bioinformatics Relative Difference Small positive constant calculated to minimize coefficient of variation.

OHRI Bioinformatics T-test vs. SAM (1) (2) (4) (10) (5) (9) (8) (7) (3) (6)

OHRI Bioinformatics Plotting d(i) vs s(i) Comparing 4 shaded vs 4 non-shaded samples A: Relative differences between irradiated and unirradiated states B: Relative differences between cell lines C: Relative differences between hybridizations (technical replicates) D: Relative differences between ‘balanced’ permutation (Extra control) Relative difference vs. Gene scatter

OHRI Bioinformatics SAM creates controls via permutation Consider permutations of the samples used. In the original paper, looked at 36 balanced permutations where each cell line was represented was represented equally Calculate d p (i) for each permutation p Average all d p (i) to get ‘expected relative difference’: d E (i)

OHRI Bioinformatics Finding significant genes Plot d(i) vs d E (i) Identify genes which deviate from d(i)=d E (i) by more than a threshold,  These do not necessarily have the largest change in expression. Can optimize  with estimate of false positive rate

OHRI Bioinformatics False Discovery Rate Take observed d(i) values for upper and lower cutoffs Find the mean number of genes exceeding these cutoffs in the permuted data - this gives an estimate for FDR

OHRI Bioinformatics SAM Output List of significantly changing genes –Fold changes may be asymmetric Estimated false positive rate for the list

OHRI Bioinformatics SAM Implementations In Bioconductor –R package – ‘siggenes’ From Stanford –Excel plugin –R package – ‘samr’ –

OHRI Bioinformatics Unit 2 Exercises Analysis of array data with SAM in R for Windows Exploration of SAM results Identification significantly changing genes