Felix Naef & Marcelo Magnasco, GL meeting, Nov. 19 2001 Outline Background subtraction Probeset statistics Excursions into.


Excursions into GeneChip data analysis. Felix Naef & Marcelo Magnasco, GL meeting, Nov. 19 2001. Outline: background subtraction; probeset statistics.

Background estimation: estimate both the mean B and its fluctuations (std); this is needed in the low-intensity regime. The background includes light reflection from the substrate, photodetector dark current, and some cross-hybridization (i.e. small residues). By the CLT, the background is expected to be a Gaussian variable.

Idea: B is insensitive to MM and visible at low intensity. Select probes such that |PM-MM| < threshold (locally?); use threshold = 50 (new) or 100 (old settings). P(PM) or P(MM) is the convolution of a Gaussian and a step function. [Figure: Gaussian “+” step function = real P(PM); intensity axis marked at 0 and B.]
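The selection rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_background` is hypothetical, the synthetic intensities are made up, and taking the plain mean and standard deviation of the selected PM values is a simplification. The slide describes P(PM) as a Gaussian convolved with a step function, which would call for a fit of the low-intensity edge rather than raw moments.

```python
import random
import statistics

def estimate_background(pm, mm, threshold=50.0):
    """Return (mean, std) of PM over probe pairs with |PM - MM| < threshold.

    Assumption: pairs passing the cut carry negligible specific signal,
    so their PM values sample the background directly.
    """
    selected = [p for p, m in zip(pm, mm) if abs(p - m) < threshold]
    if len(selected) < 2:
        raise ValueError("need at least two probe pairs below the threshold")
    return statistics.mean(selected), statistics.stdev(selected)

# Synthetic check with made-up numbers: background mean 100, std 10.
rng = random.Random(0)
pm = [rng.gauss(100, 10) for _ in range(2000)]
mm = [rng.gauss(100, 10) for _ in range(2000)]
# 500 pairs that carry real signal; the threshold should exclude them.
pm += [rng.gauss(100, 10) + 500 for _ in range(500)]
mm += [rng.gauss(100, 10) + 100 for _ in range(500)]

b_mean, b_std = estimate_background(pm, mm, threshold=50.0)
print(round(b_mean, 1), round(b_std, 1))
```

On real chips the "(locally?)" remark suggests repeating the estimate per chip region; that is not attempted here.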

example: dependence of the background estimate on the selection threshold

trick for dealing with negative values

PM vs. MM distribution. [Figure: the MM>PM region is labeled; two regions are marked “make a histogram in this region”; zoom.]

PM vs. MM histogram

MM>PM across different chips. MM>PM is not concentrated at low intensities: 27% of probe pairs with MM>PM are in the top quartile.
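A statistic of this kind could be computed as follows. The sketch assumes that "top quartile" refers to the quartile of PM intensity over all probe pairs on a chip; the slide does not say which intensity defines the quartile, and the function name is hypothetical.

```python
def fraction_mm_gt_pm_in_top_quartile(pm, mm):
    """Among probe pairs with MM > PM, return the fraction whose PM lies
    in the top quartile of all PM values (assumed quartile definition)."""
    pairs = list(zip(pm, mm))
    # Lower bound of the top quartile of PM intensities.
    cutoff = sorted(p for p, _ in pairs)[(3 * len(pairs)) // 4]
    flagged = [(p, m) for p, m in pairs if m > p]
    if not flagged:
        return 0.0
    in_top = sum(1 for p, _ in flagged if p >= cutoff)
    return in_top / len(flagged)

# Toy data: two pairs have MM > PM, one of them at high intensity.
pm = [1, 2, 3, 4, 5, 6, 7, 8]
mm = [2, 1, 1, 1, 1, 1, 1, 9]
print(fraction_mm_gt_pm_in_top_quartile(pm, mm))  # 0.5
```

If MM>PM pairs were spread evenly over intensities the fraction would be 0.25, so a value close to that indicates no concentration at the low end.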

Probe pair trajectories (~80 chips): take all (PM, MM) for a given probe set; compute the center of mass (x, y) and the ellipsoid of inertia, which gives two principal values; histogram the cm’s; color code according to s, a principal value scaled by min(x, y) (~ noise; detrending).
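The center of mass and the inertia ellipse of a (PM, MM) trajectory can be obtained from the 2x2 covariance matrix of the points, whose eigenvalues have a closed form. The sketch below is an assumption-laden illustration: the function names are hypothetical, and reading "eccentricity" as the ratio of the major to the minor axis length is a guess, since the slides do not define it.

```python
import math

def trajectory_shape(points):
    """Return ((cx, cy), lam_major, lam_minor) for a list of (PM, MM) points.

    lam_major and lam_minor are the eigenvalues of the 2x2 covariance
    matrix, i.e. the variances along the principal axes.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    half_trace = (sxx + syy) / 2.0
    disc = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return (cx, cy), half_trace + disc, max(half_trace - disc, 0.0)

def axis_ratio(lam_major, lam_minor):
    """Major/minor axis length ratio (assumed meaning of 'eccentricity')."""
    if lam_minor == 0.0:
        return math.inf
    return math.sqrt(lam_major / lam_minor)

# One point per chip for a single probe pair; the values are made up.
center, lam1, lam2 = trajectory_shape([(1, 0), (-1, 0), (0, 2), (0, -2)])
print(center, lam1, lam2, axis_ratio(lam1, lam2))
```

Whether the trajectories were analyzed on raw or log intensities is not stated on the slide; the function works on whatever coordinates it is given.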

All probe sets. Color code: blue = large s, green = mid, red = small.

Probes with ‘well’ defined trajectories (eccentricity > 3): ~1/3 of probes. Color code: blue = large, green = mid, red = small.

PM within a probe set: is the brightness of the probes reasonably uniform? Or do different probes have very different hybridization efficiencies?

So what can possibly be happening? Sequence-dependent hybridization efficiencies; are kinetic effects important?; cross-hybridization beyond what is detectable by MM probes; this is hard to assess without sequence info; sequence-dependent fabrication efficiencies?; variable probe densities.

Composite scores. What have we learned from previous slides? MM are not consistently behaving as expected: what about not using them? The probe set intensities vary over decades: it is difficult to estimate absolute intensities using ‘averages’ (alternative: Li and Wong), so we focus on ratio scores.

Outline of algorithm
1. Estimate background (mean and std).
2. Discard noisy and saturated probes; use either only PM or PM-MM as raw intensities.
3. Average the remaining log-ratios in an outlier-robust way (robust regression to intercept), with SE.
4. Normalize by centering the (possibly local) log-ratio distribution.
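The four steps can be sketched as follows for the PM-only variant and a comparison of two chips. Everything specific here is an assumption made for illustration: the function names, the noise cut `k * bg_std`, and the `saturation` level are hypothetical, and the median with a MAD-based standard error is a simple stand-in for the robust regression to an intercept named in step 3, whose exact form the slides do not give.

```python
import math
import statistics

def probeset_log_ratio(pm_a, pm_b, bg_mean, bg_std, k=2.0, saturation=40000.0):
    """Robust log2 ratio (chip A over chip B) for one probe set.

    Returns (log_ratio, standard_error), or None if no probe survives.
    """
    ratios = []
    for a, b in zip(pm_a, pm_b):
        sa, sb = a - bg_mean, b - bg_mean      # step 1: subtract background
        if min(sa, sb) <= k * bg_std:          # step 2: drop noisy probes
            continue
        if max(a, b) >= saturation:            # step 2: drop saturated probes
            continue
        ratios.append(math.log2(sa / sb))
    if not ratios:
        return None
    center = statistics.median(ratios)         # step 3: robust average
    mad = statistics.median([abs(r - center) for r in ratios])
    se = 1.4826 * mad / math.sqrt(len(ratios))
    return center, se

def center_log_ratios(values):
    """Step 4: global centering so the median log-ratio is zero."""
    shift = statistics.median(values)
    return [v - shift for v in values]

# Made-up probe set: the 4th probe is near background, the 5th is saturated.
result = probeset_log_ratio([500, 900, 1700, 105, 50000],
                            [300, 500, 900, 104, 900],
                            bg_mean=100, bg_std=10)
print(result)  # (1.0, 0.0)
```

A local version of step 4 would compute the shift within intensity bins rather than over all probe sets; only the global case is shown.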