Applying False Discovery Rate (FDR) Control in Detecting Future Climate Changes
ZongBo Shang, SIParCS Program, IMAGe, NCAR
August 4, 2009



North American Regional Climate Change Assessment Program (NARCCAP): Predicted Changes in Future Winter Temperature (°C)
Note: This figure shows the difference between the mean future (2040–2069) winter temperature and the mean current (1970–1999) winter temperature.

Can We Trust What We See?
Note: These two figures show the means of 10 replicate random fields generated from the same Matérn semi-variogram model but with different random seeds.

What’s the Problem with Pointwise Two-sample t Tests?
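One way to see the problem is to count how many grid points a pointwise test rejects when nothing has changed. The sketch below is illustrative only: it assumes independent tests (real grid points are spatially correlated) and uses the fact that a valid p-value is uniform on [0, 1] under a true null hypothesis. The grid size `m` and level `alpha` are made-up values.

```python
import random

def count_false_rejections(m, alpha, seed=0):
    """Count rejections among m tests when every null hypothesis is true."""
    rng = random.Random(seed)
    # Under a true null, a valid p-value is uniform on [0, 1].
    p_values = [rng.random() for _ in range(m)]
    return sum(p < alpha for p in p_values)

m, alpha = 10_000, 0.05          # hypothetical grid size and test level
false_hits = count_false_rejections(m, alpha)
print(false_hits, "of", m, "grid points rejected with no real signal")

# Chance of at least one false rejection among m independent tests.
fwer = 1 - (1 - alpha) ** m
print("Probability of at least one false rejection:", fwer)
```

Roughly alpha * m grid points are flagged by chance alone, so a map of pointwise "significant" changes can look structured even when no change exists.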

False Discovery Rate (FDR) Control
- FDR control bounds the expected proportion of incorrectly rejected null hypotheses (type I errors) among all rejected null hypotheses.
- It is less conservative than Bonferroni-type procedures that control the Familywise Error Rate (FWER), so it has greater power, at the cost of a higher likelihood of type I errors.
- Applications of FDR in gene expression and microarray analysis
- Applications of FDR in functional magnetic resonance imaging (fMRI)
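The standard way to control the FDR is the Benjamini & Hochberg (BH) step-up procedure: sort the m p-values and reject the k smallest, where k is the largest rank with p_(k) <= (k / m) * q. The following is a minimal sketch with made-up p-values; the function name `benjamini_hochberg` is chosen here for illustration and is not taken from the slides.

```python
def benjamini_hochberg(p_values, q):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject

# Made-up p-values for illustration.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh = benjamini_hochberg(p, q=0.05)
bonferroni = [pi <= 0.05 / len(p) for pi in p]
print("BH rejections:", sum(bh))
print("Bonferroni rejections:", sum(bonferroni))
```

On these values BH rejects two hypotheses while Bonferroni at the same level rejects one, which illustrates the gain in power.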

Definition of False Discovery Rate

                           Declared non-significant   Declared significant   Total
                           (fail to reject)           (reject)
True null hypotheses       U                          V                      m₀
Non-true null hypotheses   T                          S                      m − m₀
Total                      m − R                      R                      m

Let Q = V / (V + S) be the proportion of rejected null hypotheses that were rejected in error, with Q = 0 when V + S = 0. Note that Q is an unobservable random variable. Define the FDR to be the expectation of Q:

FDR = E(Q) = E[V / (V + S)] = E[V / R]
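In a simulation the truth is known, so Q can be computed directly for one realization. The sketch below uses made-up truth and rejection vectors; with real data `is_true_null` is not available, which is why Q is unobservable.

```python
def false_discovery_proportion(is_true_null, rejected):
    """Q = V / (V + S) for one set of decisions, with Q = 0 if nothing is rejected."""
    pairs = list(zip(is_true_null, rejected))
    v = sum(1 for null, rej in pairs if null and rej)        # false rejections (V)
    s = sum(1 for null, rej in pairs if not null and rej)    # correct rejections (S)
    r = v + s                                                # total rejections (R)
    return v / r if r > 0 else 0.0

# Made-up example: 6 true nulls followed by 4 non-true nulls.
is_true_null = [True] * 6 + [False] * 4
rejected = [True, False, False, False, False, False, True, True, True, False]
q_observed = false_discovery_proportion(is_true_null, rejected)
print("Q =", q_observed)
```

Here V = 1 and S = 3, so Q = 0.25. Averaging Q over many simulated replicates gives a Monte Carlo estimate of the FDR.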

False Discovery Rates for Spatial Signals
Testing on clusters rather than individual locations
- Procedure 1: Weighted Benjamini & Hochberg (BH) procedure
- Procedure 2: Weighted two-stage procedure
- Procedure 3: Hierarchical testing procedure
  - Testing stage: control FDR on clusters
  - Trimming stage: control FDR on selected points
Reference: Benjamini, Y. and Heller, R. False discovery rates for spatial signals. Journal of the American Statistical Association, 102.
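Procedure 1 can be sketched as a weighted BH step-up rule applied to cluster-level p-values. This is a hedged illustration, not the exact procedure from the reference: it assumes weights proportional to cluster size, rescaled to sum to the number of clusters m, and rejects the k clusters with the smallest p-values, where k is the largest rank whose p-value is at most (cumulative weight / m) * q. The p-values and cluster sizes are made up.

```python
def weighted_bh(p_values, weights, q):
    """Weighted BH sketch: True where the cluster-level null is rejected."""
    m = len(p_values)
    total = sum(weights)
    # Rescale the weights so that they sum to m (assumption of this sketch).
    w = [m * wi / total for wi in weights]
    order = sorted(range(m), key=lambda i: p_values[i])
    cum_w = 0.0
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        cum_w += w[idx]
        # Compare the ordered p-value with its weighted threshold.
        if p_values[idx] <= cum_w / m * q:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject

cluster_p = [0.02, 0.03, 0.2, 0.5]       # made-up cluster p-values
cluster_size = [60, 20, 10, 10]          # made-up numbers of grid points
weighted = weighted_bh(cluster_p, cluster_size, q=0.05)
unweighted = weighted_bh(cluster_p, [1, 1, 1, 1], q=0.05)
print("Weighted:  ", weighted)
print("Unweighted:", unweighted)
```

With equal weights the rule reduces to ordinary BH. In this example the large cluster's weight raises the early thresholds, so two clusters are rejected where unweighted BH rejects none.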

Simulation Studies
1. Random Fields
2. Random Field Block
3. Random Field Gradient

Simulation Study I: Two Random Fields
Note: These two figures show the means of 10 replicate random fields generated from the same Matérn semi-variogram model but with different random seeds.

Pre-defined Clusters

Simulation Study I: Pointwise vs. False Discovery Rate Control

9 Repeats on Simulation Study I

Simulation Study II: Pre-defined Block Trend

Simulation Study II: Average of 10 Replicates
Random Field (Matérn, σ = 0.4) + Block Trends

Simulation Study II: Pointwise vs. False Discovery Rate Control

9 Repeats on Simulation Study II

Simulation Study III: Pre-defined Gradient Trend

Simulation Study III: Average of 10 Replicates
Random Field (Matérn, σ = 2) + Gradient Trends

Simulation Study III: Pointwise vs. False Discovery Rate Control

9 Repeats on Simulation Study III

Applying FDR Control for Detecting Future Climate Changes
1. Download climate datasets from the NARCCAP program
2. Calculate seasonal averages
3. Construct clusters from EPA Eco-regions
4. Conduct two-sample t tests on temperature/precipitation
5. Compute pointwise p-values and corresponding z scores
6. Build a semi-variogram model to estimate spatial autocorrelation
7. Calculate a z score and p-value for each cluster
8. Reject clusters based on FDR control
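The steps above can be sketched end to end on toy data. Everything here is an assumption made for illustration: the data are simulated rather than NARCCAP output, the two-sample statistic is a Welch statistic converted to a p-value with a normal approximation instead of a t distribution, the spatial correlation is an exponential model (a Matérn with smoothness 0.5) with a made-up range instead of a fitted semi-variogram, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance

def welch_z(current, future, delta0=0.0):
    """Welch statistic for H0: mean(future) - mean(current) = delta0."""
    se = math.sqrt(variance(current) / len(current)
                   + variance(future) / len(future))
    return (mean(future) - mean(current) - delta0) / se

def exp_corr(a, b, corr_range):
    """Assumed exponential correlation between two locations."""
    return math.exp(-math.dist(a, b) / corr_range)

def cluster_z(z_scores, coords, corr_range):
    """Sum of pointwise z scores, standardized by its assumed variance."""
    var_sum = sum(exp_corr(a, b, corr_range) for a in coords for b in coords)
    return sum(z_scores) / math.sqrt(var_sum)

def bh_reject(p_values, q):
    """Plain Benjamini & Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    keep = set(order[:k_max])
    return [i in keep for i in range(m)]

rng = random.Random(1)
n_years = 30
# Two toy clusters of four grid points; true warming is 3 and 0 degrees.
clusters = {"A": 3.0, "B": 0.0}
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

cluster_p = []
for name, shift in clusters.items():
    z_points = []
    for _ in coords:
        current = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        future = [rng.gauss(shift, 1.0) for _ in range(n_years)]
        z_points.append(welch_z(current, future))
    z_c = cluster_z(z_points, coords, corr_range=2.0)
    # Two-sided p-value from the normal approximation.
    cluster_p.append(2 * (1 - NormalDist().cdf(abs(z_c))))

rejected = bh_reject(cluster_p, q=0.05)
for name, p, rej in zip(clusters, cluster_p, rejected):
    print(name, "p =", p, "rejected" if rej else "not rejected")
```

Dividing by the square root of the summed correlations accounts for the fact that neighbouring grid points carry overlapping information; treating them as independent would overstate the cluster-level evidence. To test a null hypothesis such as "temperature increases by 3 °C", pass `delta0=3.0` to `welch_z`.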

GIS: Vector Dataset, Lambert Equal-Area Projection

H0: Future Winter Temperature Increases by 3 °C
- 61 regions rejected at the q = 0.25 level
- 56 regions rejected at the q = 0.1 level
- 54 regions rejected at the q = 0.05 level
- 51 regions rejected at the q = 0.01 level

FDR Tests on Winter Temperature
Panels: H0: winter temperature increases by 1 °C, 2 °C, 3 °C, 4 °C, 5 °C, and 6 °C.

FDR Tests on Winter Precipitation
Panels: H0: winter precipitation decreases by 20 kg/m² or 10 kg/m², or increases by 10, 20, 30, 50, 75, or 100 kg/m².

Acknowledgements
- Dr. Steve Sain, IMAGe, NCAR
- Drs. Douglas Nychka and Tim Hoar, IMAGe, NCAR
- Dr. Armin Schwartzman, Harvard University
- University of Wyoming
- SIParCS, IMAGe, NCAR
- NARCCAP