Pan Du, Simon Lin Robert H. Lurie Comprehensive Cancer Center

Presentation transcript:

Cross-site and Cross-platform Concordance of Microarray Analysis Improved by Variance Stabilization
Pan Du, Simon Lin
Robert H. Lurie Comprehensive Cancer Center
2/23/2019

Outline
Why Variance Stabilization?
How to Stabilize Variance?
  Illumina
  Affymetrix
Does it work?

Introduction to Microarray Studies
[Figure: Biomedical Applications compare normal and cancer samples on Array x and Array y; Quality Control Studies hybridize the same sample A to Array x and Array y.]
(Johnson and Lin, Nature 411:885, 2001)

Evaluation criterion of reproducibility: Concordance
Lab A produces gene list A; Lab B produces gene list B. Anything in common?
[Figure: % in common (up to 100) against number of genes selected, with ideal, better, and worse curves.]
FDA-led Quality Control Study: cross-time, cross-site, cross-platform (Tong et al., Nature Biotech 24:1132, 2006)
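The concordance curve described on this slide can be sketched in a few lines. This is a minimal illustration, assuming each lab supplies a gene list ranked from most to least significant and that both lists contain at least n genes; the function name `percent_in_common` and the gene labels are hypothetical.

```python
def percent_in_common(ranked_a, ranked_b, n):
    """Percentage of genes shared by the top-n entries of two ranked gene lists."""
    if n <= 0:
        raise ValueError("n must be positive")
    top_a = set(ranked_a[:n])
    top_b = set(ranked_b[:n])
    return 100.0 * len(top_a & top_b) / n

# Hypothetical ranked gene lists from two labs (most significant first).
lab_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
lab_b = ["g2", "g1", "g7", "g3", "g8", "g4"]

# One point per list size: (number of genes selected, % in common).
curve = [(n, percent_in_common(lab_a, lab_b, n)) for n in range(1, len(lab_a) + 1)]
for n, pct in curve:
    print(n, round(pct, 1))
```

An ideal curve would sit at 100 for every list size; a curve closer to 100 indicates better reproducibility between the two labs.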

General Microarray Analysis Procedure
Sample preparation
Microarray experiment and data collection
Background adjustment
Transformation (log2)
Normalization
Gene identification

Why Variance Stabilization?
[Figure: x-y plots and mean-variance plots for the ideal case, raw x, log2(x), and log2(x + offset).]

Why do we care?
A general assumption of statistical tests applied to microarray data: variance is independent of intensity.
Gene A: 7 (normal) → 8 (cancer)
Gene B: 13 (normal) → 14 (cancer)
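A small worked example may help show why the two genes are not equivalent even though both change by one unit. The sketch below assumes the values on the slide are log2 intensities and that the raw signal has both additive and multiplicative noise; the noise levels (`sd_add`, `sd_mult`) are hypothetical numbers chosen only for illustration.

```python
import math

def sd_log2(mu, sd_add=20.0, sd_mult=0.1):
    """Delta-method SD of log2(y) when Var(y) = (sd_mult * mu)**2 + sd_add**2."""
    return math.sqrt(sd_mult ** 2 + (sd_add / mu) ** 2) / math.log(2)

# Gene A sits at log2 intensity 7, Gene B at 13 (the "normal" values on the slide).
for name, log2_level in [("Gene A", 7), ("Gene B", 13)]:
    sd = sd_log2(2.0 ** log2_level)
    print(f"{name}: SD of log2 intensity ~ {sd:.3f}; a change of 1 is ~ {1 / sd:.1f} SDs")
```

Under these assumed noise levels the log2 values of the low-intensity gene are noisier, so the same one-unit change is weaker evidence for Gene A than for Gene B. A test that assumes a common variance would treat them alike.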

Variance Stabilization: the model
A mathematical model of microarray hybridization (Rocke and Durbin, Bioinformatics 19:996, 2003)
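The equation on this slide was not captured in the transcript. A commonly cited form of the two-component error model associated with Rocke and Durbin is sketched below. The symbols (α for background, μ for true expression, η and ε for multiplicative and additive errors) are conventional notation assumed here, not taken from the slide.

```latex
y = \alpha + \mu e^{\eta} + \varepsilon, \qquad
\eta \sim N(0, \sigma_\eta^2), \quad \varepsilon \sim N(0, \sigma_\varepsilon^2)

\mathrm{E}(y) = \alpha + \mu\, e^{\sigma_\eta^2/2}, \qquad
\mathrm{Var}(y) = \mu^2 e^{\sigma_\eta^2}\left(e^{\sigma_\eta^2} - 1\right) + \sigma_\varepsilon^2
```

Under this model the variance grows quadratically with the true expression level, which is why the variance of raw intensities depends on intensity.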

Variance Stabilization: deriving h(y)
An asymptotic variance-stabilizing transformation can be achieved by [equation not preserved in transcript] (Tibshirani, JASA, 1988)
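The standard delta-method construction of a variance-stabilizing transformation is sketched here, assuming v(u) denotes the variance of y as a function of its mean u. The quadratic form of v and the constants c1, c2, c3 are illustrative assumptions, chosen to be consistent with additive plus multiplicative noise; they are not read off the slide.

```latex
h(y) = \int^{y} \frac{du}{\sqrt{v(u)}}
\quad\Longrightarrow\quad
\mathrm{Var}\bigl(h(y)\bigr) \approx h'(u)^2\, v(u) = 1

v(u) = (c_1 u + c_2)^2 + c_3,\ c_1 > 0,\ c_3 > 0
\quad\Longrightarrow\quad
h(y) = \frac{1}{c_1}\,\operatorname{arcsinh}\!\left(\frac{c_1 y + c_2}{\sqrt{c_3}}\right)
```

Since arcsinh(x) ≈ ln(2x) for large x, this h(y) behaves like a logarithm at high intensity and is close to linear near the background.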

Huber's Solution (2002): VSN (Variance Stabilizing Normalization)
Estimate the mean and variance from a set of arrays
Assume most genes are not differentially expressed
Technically challenging because the normalization between arrays has to be considered
Practically challenging because usually only 2 to 6 arrays are available
(Huber et al., Bioinformatics, 2002)

Illumina BeadArray Technology
More than 30 technical replicates are on each array.
Beads are randomly assembled and held in microwells.
Multiple arrays on the same slide.
Cost: < $200

Variance Stabilizing Transformation (VST)
Fit the relation between mean and standard deviation
Relation between log2 and VST (arcsinh)
(Lin, Pan, Huber, and Warren, 2007)
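The two steps on this slide (fit the mean-SD relation, then apply an arcsinh transform) can be illustrated with a simplified simulation. This is a sketch under stated assumptions, not the lumi implementation: the variance is assumed to follow v(u) = (c1*u + c2)^2 + c3, the data are simulated with hypothetical noise levels, and the helper names `fit_line`, `vst`, and `sd_spread` are invented for this example.

```python
import math
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit ys ~ slope * xs + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def vst(y, c1, c2, c3):
    """arcsinh transform for an assumed variance v(u) = (c1*u + c2)**2 + c3."""
    return math.asinh((c1 * y + c2) / math.sqrt(c3)) / c1

# Hypothetical simulation: additive noise (SD 20) plus 10% multiplicative noise.
rng = random.Random(0)
levels = [0, 100, 200, 500, 1000, 2000, 5000, 10000, 20000]
replicates = {
    mu: [mu * math.exp(rng.gauss(0, 0.1)) + rng.gauss(0, 20) for _ in range(5000)]
    for mu in levels
}
means = {mu: statistics.fmean(v) for mu, v in replicates.items()}
sds = {mu: statistics.stdev(v) for mu, v in replicates.items()}

# c1, c2: linear fit of SD against mean on the bright levels, where
# multiplicative noise dominates; c3: variance of the background level.
bright = [mu for mu in levels if mu >= 1000]
c1, c2 = fit_line([means[mu] for mu in bright], [sds[mu] for mu in bright])
c3 = sds[0] ** 2

def sd_spread(transform):
    """Ratio of largest to smallest SD across the non-background levels."""
    out = [statistics.stdev([transform(y) for y in replicates[mu]])
           for mu in levels if mu > 0]
    return max(out) / min(out)

spread_log2 = sd_spread(lambda y: math.log2(max(y, 1.0)))  # clip to avoid log of <= 0
spread_vst = sd_spread(lambda y: vst(y, c1, c2, c3))
print(f"SD max/min  log2: {spread_log2:.2f}  VST: {spread_vst:.2f}")
```

A ratio near 1 means the SD of the replicates is roughly the same at every intensity level. In this simulated setting the arcsinh transform should give a ratio closer to 1 than log2 does, because log2 inflates the noise of low-intensity values.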

Variance Stabilization of the Technical Replicates

Comparison of Log2, VSN and VST

Evaluation Data Sets
Barnes data (Barnes, M., et al., 2005): measured a dilution series (two replicates and six dilution ratios) of two human tissues, blood and placenta.
MAQC-I (Shippy, R., et al., 2006): a similar dilution series, conducted at more than one microarray facility using both Illumina and Affymetrix platforms.

Experiment Design

Cross-site concordance evaluation
MAQC data
VST improves the cross-site concordance

VST for Affymetrix
Hypothesis: VST also works for Affymetrix arrays
Treat each pixel as a technical replicate
Model the mean and variance the same way

Cross-site concordance for Affymetrix

Cross-platform: Affymetrix and Illumina
Evaluation procedure
Comparing samples C and D in the MAQC study
The probe IDs were first mapped to Entrez IDs
Legend notation
"Current": RMA (Affymetrix), Log2+Quantile (Illumina)
"Improved": VST+RMA (Affymetrix), VST+Quantile (Illumina)
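The mapping step in the evaluation procedure can be sketched as follows. This is only an illustration: the probe IDs, the annotation tables, and the helper `to_entrez` are hypothetical, and the choice to keep the first-ranked probe per gene is an assumption rather than something stated on the slide.

```python
def to_entrez(probe_ids, probe_to_entrez):
    """Map ranked probe IDs to Entrez IDs, dropping unmapped probes and repeats."""
    seen, genes = set(), []
    for probe in probe_ids:
        gene = probe_to_entrez.get(probe)
        if gene is not None and gene not in seen:
            seen.add(gene)
            genes.append(gene)
    return genes

# Hypothetical annotation tables (probe ID -> Entrez ID) for the two platforms.
affy_map = {"affy_1": "7157", "affy_2": "672", "affy_3": "1956", "affy_4": "672"}
illu_map = {"ilmn_1": "672", "ilmn_2": "7157", "ilmn_3": "2064"}

# Hypothetical ranked probe lists (most significant first); "affy_9" has no mapping.
affy_genes = to_entrez(["affy_1", "affy_2", "affy_4", "affy_3", "affy_9"], affy_map)
illu_genes = to_entrez(["ilmn_1", "ilmn_2", "ilmn_3"], illu_map)

# Genes selected on both platforms, now comparable because they share an ID space.
shared = set(affy_genes) & set(illu_genes)
print(sorted(shared))
```

Once both gene lists are expressed as Entrez IDs, cross-platform concordance can be computed the same way as cross-site concordance.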

Bioconductor lumi package
The VST and related algorithms are included in the Bioconductor lumi package
Bioconductor: http://www.bioconductor.org

Acknowledgements
Robert H. Lurie Comprehensive Cancer Center, Northwestern University: Warren A. Kibbe and other members of the Bioinformatics group; Denise Scholtens, Biostatistics
European Bioinformatics Institute: Wolfgang Huber
The Walter and Eliza Hall Institute of Medical Research, Australia: Gordon Smyth