Circular analysis in systems neuroscience – with particular attention to cross-subject correlation mapping
Nikolaus Kriegeskorte, Laboratory of Brain and Cognition, National Institute of Mental Health

Presentation transcript:

Circular analysis in systems neuroscience – with particular attention to cross-subject correlation mapping
Nikolaus Kriegeskorte
Laboratory of Brain and Cognition, National Institute of Mental Health

Collaborators: Chris I Baker, W Kyle Simmons, Patrick SF Bellgowan, Peter Bandettini

Overview
Part 1 – General introduction to circular analysis in systems neuroscience (synopsis of Kriegeskorte et al. 2009)
Part 2 – Specific issue: selection bias in cross-subject correlation mapping (following up on Vul et al. 2009)

[Diagram, built up over several slides: data enters an analysis, the analysis is shaped by assumptions, and the analysis produces results. Circular inference: when the assumptions are themselves derived from the same data, the results come to reflect the assumptions rather than the data alone.]

How do assumptions tinge results? Through variants of selection:
- Elimination (binary selection)
- Weighting (continuous selection)
- Sorting (multiclass selection)

Elimination (binary selection) [diagram: data → analysis → results, with the assumptions supplying the selection criteria]

Example 1 Pattern-information analysis

Experimental design (Simmons et al. 2006): STIMULUS (object category) crossed with TASK (property judgment: “Animate?” vs. “Pleasant?”).

Pattern-information analysis
- define ROI by selecting ventral-temporal voxels for which any pairwise condition contrast is significant at p<.001 (uncorrected)
- perform nearest-neighbor classification based on activity-pattern correlation
- use odd runs for training and even runs for testing
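To make the classification step concrete, here is a minimal sketch (illustrative, not the authors' code). The data layout `patterns` (one runs × voxels array per condition) and the boolean selector `roi_mask` are assumptions for illustration; whether `roi_mask` was defined from all runs or only the training runs is exactly the issue the next slides turn on.

```python
# Minimal sketch of the analysis steps listed above (illustrative, not the
# authors' code). Assumes `patterns` maps each condition name to an array of
# shape (n_runs, n_voxels) and `roi_mask` is a boolean voxel selector.
import numpy as np

def correlation_classifier_accuracy(patterns, roi_mask):
    conditions = sorted(patterns)
    # odd runs for training, even runs for testing (0-based indexing)
    train = {c: patterns[c][0::2][:, roi_mask].mean(axis=0) for c in conditions}
    test = {c: patterns[c][1::2][:, roi_mask].mean(axis=0) for c in conditions}
    correct = 0
    for c in conditions:
        # nearest-neighbor classification by activity-pattern correlation
        r = [np.corrcoef(test[c], train[c2])[0, 1] for c2 in conditions]
        correct += (conditions[int(np.argmax(r))] == c)
    return correct / len(conditions)
```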

Results [bar graph: decoding accuracy for stimulus (object category) and for task (judged property), with the chance level marked].

[Bar graph: decoding accuracy (chance level marked) for task and stimulus, comparing ROI voxels selected using all data vs. using only the training data, for real fMRI data and for data from a Gaussian random generator. With all-data selection, even the pure-noise data yield above-chance accuracy.]
...but we used cleanly independent training and test data!?
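The noise-data effect can be reproduced with a toy simulation (mine, not the authors'; the numbers of runs, voxels and the ROI size are arbitrary choices): selecting ROI voxels using all runs yields above-chance "decoding" of pure Gaussian noise, whereas selecting using only the training runs stays at chance.

```python
# Toy simulation (illustrative assumptions: 20 runs, 5000 voxels, ROI = the 100
# voxels with the largest condition difference). With pure Gaussian noise,
# ROI selection using ALL runs produces spuriously above-chance "decoding";
# selection using only the training (odd) runs stays at chance.
import numpy as np

rng = np.random.default_rng(0)
n_runs, n_vox = 20, 5000
odd, even = slice(0, None, 2), slice(1, None, 2)
accuracies = {"all-data selection": [], "training-data selection": []}

for _ in range(100):
    a = rng.standard_normal((n_runs, n_vox))   # condition A: pure noise
    b = rng.standard_normal((n_runs, n_vox))   # condition B: pure noise
    for label, sel in [("all-data selection", slice(None)),
                       ("training-data selection", odd)]:
        diff = (a[sel] - b[sel]).mean(axis=0)        # selection statistic
        roi = np.argsort(np.abs(diff))[-100:]        # "most responsive" voxels
        tr_a, tr_b = a[odd][:, roi].mean(0), b[odd][:, roi].mean(0)
        te_a, te_b = a[even][:, roi].mean(0), b[even][:, roi].mean(0)
        # nearest-neighbor classification by pattern correlation
        hits = (np.corrcoef(te_a, tr_a)[0, 1] > np.corrcoef(te_a, tr_b)[0, 1]) \
             + (np.corrcoef(te_b, tr_b)[0, 1] > np.corrcoef(te_b, tr_a)[0, 1])
        accuracies[label].append(hits / 2)

for label, vals in accuracies.items():
    # expect roughly 0.5 for training-data selection, clearly above 0.5 for all-data
    print(label, round(float(np.mean(vals)), 2))
```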

Conclusion for pattern-information analysis: the test data must not be used either in training a classifier (continuous weighting) or in defining the ROI (binary weighting).

Data selection is key to many conventional analyses. Can it entail similar biases in other contexts?

Example 2 Regional activation analysis

ROI definition is affected by noise [illustration: compared with the true region, an overfitted ROI defined on the same data yields an overestimated ROI-average activation effect; an independently defined ROI does not].

Data sorting [diagram: data → analysis → results, with the assumptions supplying the sorting criteria]

Set-average tuning curves for data sorted by tuning [plot: response as a function of a stimulus parameter (e.g. orientation), shown for noise data].

Set-average activation profiles for data sorted by activation [bar plot: ROI-average fMRI response for conditions A, B, C, D, shown for noise data].
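A toy simulation (mine; the sizes are arbitrary) shows how sorting pure-noise data by activation manufactures such a profile:

```python
# Toy illustration: sorting pure-noise "voxel" responses by their response to
# condition A and averaging the selected set yields a spurious A-peaked
# set-average activation profile. The sizes (10000 voxels, top 100) are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
conditions = ["A", "B", "C", "D"]
responses = rng.standard_normal((10000, len(conditions)))  # pure noise

top = np.argsort(responses[:, 0])[-100:]      # voxels most "active" for A
profile = responses[top].mean(axis=0)         # set-average activation profile
print(dict(zip(conditions, profile.round(2))))
# A comes out strongly "activated"; B, C and D hover around zero.
```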

To avoid selection bias, we can...
- ...perform a nonselective analysis (e.g. whole-brain mapping, no ROI analysis), OR
- ...make sure that selection and results statistics are independent under the null hypothesis, because they are either inherently independent (e.g. independent contrasts) or computed on independent data.

Does selection by an orthogonal contrast vector ensure unbiased analysis?
ROI-definition contrast: A+B, c_selection = [1 1]^T
ROI-average analysis contrast: A−B, c_test = [1 −1]^T
These are orthogonal contrast vectors.

Does selection by an orthogonal contrast vector ensure unbiased analysis? No, there can still be bias. An orthogonal contrast vector is not sufficient: the design and the noise dependencies matter.
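One way such bias can arise is sketched below (a toy example of mine, assuming unequal noise levels across conditions, e.g. because condition A had fewer trials): unequal variances make the A+B and A−B contrast estimates correlated even though the contrast vectors are orthogonal.

```python
# Toy demonstration (illustrative numbers): with true effects of zero everywhere,
# selecting voxels by the A+B contrast and then testing A-B on the same data is
# biased if the noise variances differ between conditions (here A is noisier),
# because var(A) != var(B) makes the A+B and A-B estimates correlated despite
# the orthogonal contrast vectors.
import numpy as np

rng = np.random.default_rng(2)
n_vox = 100_000
beta_a = rng.normal(0, 2.0, n_vox)   # noisier estimates for condition A
beta_b = rng.normal(0, 1.0, n_vox)   # less noisy estimates for condition B

roi = np.argsort(beta_a + beta_b)[-500:]                # ROI definition: A+B contrast
print(round(float((beta_a - beta_b)[roi].mean()), 2))   # A-B in the ROI: clearly > 0
# With equal noise variances (and independent noise) the same selection
# would leave the A-B estimate unbiased.
```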

Circular analysis
Pros: highly sensitive; widely accepted (examples in all high-impact journals); doesn't require independent data sets; grants scientists independence from the data; allows smooth blending of blind faith and empiricism; the error that beautifies results; confirms even incorrect hypotheses; improves chances of high-impact publication.
Cons: [can't think of any right now]

Part 2 Specific issue: selection bias in cross-subject correlation mapping (following up on Vul et al. 2009)

Motivation
Vul et al. (2009) posed a puzzle: why are the cross-subject correlations found in brain mapping so high? Selection bias is one piece of the puzzle. But there are more pieces, and we have yet to put them all together.

Overview
- List and discuss six pieces of the puzzle. (They don't all point in the same direction!)
- Suggest some guidelines for good practice.

Six pieces – synopsis
1. Cross-subject correlation estimates are very noisy.
2. Bin or within-subject averaging legitimately increases correlations.
3. Selecting among noisy estimates yields large biases.
4. False-positive regions are highly likely for a whole-brain mapping thresholded at p<.001, uncorrected.
5. Reported correlations are high, but not highly significant.
6. Studies have low power for finding realistic correlations in the brain if multiple testing is appropriately accounted for.

Vul et al. 2009: r_population(x, y) = √(reliability(x) · reliability(y)) · r_noise-free(x, y). The geometric mean of the reliabilities is therefore an upper bound on the population correlation. The reliabilities provide no bound on the sample correlation.

Piece 1: Sample correlations across small numbers of subjects are very noisy estimates of population correlations.

Cross-subject correlation estimates are very noisy [example: a sample correlation of 0.65; plot of 95%-confidence intervals for correlations estimated from 10 subjects].
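A quick way to see how noisy such estimates are (a sketch using the standard Fisher z approximation; the r = 0.65, n = 10 values echo the example above):

```python
# Sketch: 95%-confidence interval for a sample correlation via the Fisher z
# approximation. With 10 subjects the interval is extremely wide.
import numpy as np
from scipy import stats

def correlation_ci(r, n, alpha=0.05):
    z = np.arctanh(r)                           # Fisher z transform
    se = 1.0 / np.sqrt(n - 3)
    half = stats.norm.ppf(1 - alpha / 2) * se
    return float(np.tanh(z - half)), float(np.tanh(z + half))

print(correlation_ci(0.65, n=10))   # roughly (0.03, 0.91)
```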

Piece 2: The more we average (reducing noise but not signal), the higher correlations become.

Bin-averaging inflates correlations
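A toy sketch of the mechanism (mine; the noise level, bin sizes and noise-free correlation of 0.7 are arbitrary illustrative choices): the shared signal stays the same while independent noise shrinks with the number of observations averaged per bin, so the observed correlation climbs toward the noise-free correlation.

```python
# Toy sketch: averaging within bins reduces noise but not signal, inflating the
# observed correlation toward the noise-free correlation (0.7 here).
import numpy as np

rng = np.random.default_rng(3)
n_bins, true_r = 200, 0.7
sx = rng.standard_normal(n_bins)                                         # signal for x
sy = true_r * sx + np.sqrt(1 - true_r**2) * rng.standard_normal(n_bins)  # signal for y

for n_per_bin in (1, 4, 16, 64):
    # noisy observations (noise sd = 2), then averaged within each bin
    x = (sx[:, None] + 2 * rng.standard_normal((n_bins, n_per_bin))).mean(axis=1)
    y = (sy[:, None] + 2 * rng.standard_normal((n_bins, n_per_bin))).mean(axis=1)
    print(n_per_bin, round(float(np.corrcoef(x, y)[0, 1]), 2))
# The observed correlation rises from roughly 0.15 toward 0.7 as averaging increases.
```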

Subjects are like bins: for each subject, all data are averaged to give one number.
Take-home message: cross-subject correlation estimates are expected to be high (averaging all data for each subject) and noisy (low number of subjects).
So what's Ed fussing about? We don't need selection bias to explain the high correlations, right?

Piece 3: Selecting the maximum among noisy estimates yields large selection biases.

Expected maximum correlation selected among null regions [plot: expected maximum correlation for 16 subjects, illustrating the size of the selection bias].
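The size of this bias is easy to simulate (a sketch of mine; the 500 independent brain locations are an illustrative assumption):

```python
# Toy simulation: expected maximum sample correlation across many independent
# null brain locations (population correlation = 0) with 16 subjects. The 500
# "independent locations" figure is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(4)
n_subjects, n_locations, n_sims = 16, 500, 200
max_r = []
for _ in range(n_sims):
    behaviour = rng.standard_normal(n_subjects)
    activation = rng.standard_normal((n_locations, n_subjects))  # null data
    r = np.corrcoef(np.vstack([behaviour, activation]))[0, 1:]   # r per location
    max_r.append(r.max())
print(round(float(np.mean(max_r)), 2))   # around 0.7: a very large selection bias
```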

Piece 4: False-positive regions are likely to be found in whole-brain mapping using p<.001, uncorrected.

Mapping with p<.001, uncorrected [map: the global null hypothesis is true, i.e. the population correlation is 0 in all brain locations, yet suprathreshold regions appear].
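A back-of-the-envelope calculation (again assuming, for illustration, on the order of 500 effectively independent locations in the search volume) shows why false positives are expected:

```python
# Back-of-the-envelope: false positives expected when mapping at p < .001,
# uncorrected, under the global null hypothesis. The number of effectively
# independent locations (500) is an illustrative assumption.
n_locations, alpha = 500, 0.001
print(n_locations * alpha)                       # 0.5 false-positive locations expected
print(round(1 - (1 - alpha) ** n_locations, 2))  # ~0.39 chance of at least one
```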

Piece 5: Reported correlations are high, but not highly significant.

Reported correlations are high, but not highly significant [plot, built up over several slides: correlation thresholds for p<0.05, p<0.01, p<0.001 (one-sided and two-sided) as a function of the number of subjects]. What correlations would we expect under the global null hypothesis (assuming each study reports the maximum of 500 independent brain locations)?
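For reference, here is a sketch of how such correlation thresholds can be computed from the standard t-test for a correlation (two-sided in this example):

```python
# Sketch: the minimum |r| reaching significance as a function of the number of
# subjects, from the standard t-test for a correlation (two-sided here).
import numpy as np
from scipy import stats

def r_threshold(n, alpha, two_sided=True):
    q = 1 - alpha / 2 if two_sided else 1 - alpha
    t = stats.t.ppf(q, df=n - 2)
    return t / np.sqrt(n - 2 + t**2)

for n in (10, 16, 25, 40):
    print(n, round(float(r_threshold(n, alpha=0.001)), 2))
# With 16 subjects, even p < .001 only requires |r| of about 0.74, so reported
# correlations around 0.8 are high but not highly significant.
```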

Piece 6: Most of the studies have low power for finding realistic correlations with whole-brain mapping if multiple testing is appropriately accounted for. (See also Yarkoni 2009.)

Numbers of subjects in studies reviewed by Vul et al. (2009) [histogram: number of correlation estimates as a function of the number of subjects].

Power [plot: statistical power as a function of the number of subjects, marked at 16 and 36 subjects]. In order to find a single region with a cross-subject correlation of 0.7 in the brain, we would need about 36 subjects.
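A rough power calculation reproduces that order of magnitude (a sketch using the Fisher z approximation; a Bonferroni correction over an assumed 500 independent locations stands in for whole-brain multiple-testing correction, so this is not the exact calculation behind the plot):

```python
# Rough power sketch for detecting a true cross-subject correlation of 0.7 under
# a Bonferroni-corrected threshold (0.05 / 500 assumed independent locations).
# Uses the Fisher z approximation for the correlation test.
import numpy as np
from scipy import stats

def power(r, n, alpha):
    z_crit = stats.norm.ppf(1 - alpha / 2)
    return 1 - stats.norm.cdf(z_crit - np.arctanh(r) * np.sqrt(n - 3))

alpha_corrected = 0.05 / 500
for n in (16, 24, 36, 48):
    print(n, round(float(power(0.7, n, alpha_corrected)), 2))
# Power is low (roughly 0.2) with 16 subjects and only reaches about 0.85 around 36.
```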

Take-home message: whole-brain cross-subject correlation mapping with 16 subjects does not work. We need at least twice as many subjects.

Conclusions
Unless much larger numbers of subjects are used, whole-brain cross-subject correlation mapping suffers from either:
- very low power to detect true regions (if we carefully correct for multiple comparisons), or
- very high rates of false-positive regions (otherwise).
...in other words, it doesn't work.
If the analysis is circular, selection bias is expected to be high here (because selection occurs among noisy estimates).

Suggestions
- Design the study to have enough power to detect realistic correlations. (This needs either anatomical restrictions or large numbers of subjects.)
- Consider studying trial-to-trial rather than subject-to-subject effects.
- Correct for multiple testing to avoid false positives.
- Avoid circularity: use a leave-one-subject-out procedure to estimate regional cross-subject correlations (a sketch follows below).
- Report correlation estimates with error bars.
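A minimal sketch of the leave-one-subject-out idea (an illustration under assumed inputs, not the authors' implementation: `activation_maps` holds one activation or contrast value per voxel per subject, `behaviour` one score per subject, and the ROI size is arbitrary):

```python
# Minimal leave-one-subject-out sketch: for each subject, define the ROI from
# the cross-subject correlation map computed WITHOUT that subject, read out the
# held-out subject's ROI average, then correlate those held-out values with
# behaviour across subjects.
import numpy as np

def loso_roi_correlation(activation_maps, behaviour, n_roi_voxels=100):
    """activation_maps: (n_subjects, n_voxels); behaviour: (n_subjects,)."""
    n_subjects = activation_maps.shape[0]
    held_out = np.empty(n_subjects)
    for s in range(n_subjects):
        others = np.delete(np.arange(n_subjects), s)
        a = activation_maps[others] - activation_maps[others].mean(axis=0)
        b = behaviour[others] - behaviour[others].mean()
        # cross-subject correlation map computed WITHOUT subject s
        r_map = (a * b[:, None]).sum(0) / np.sqrt((a**2).sum(0) * (b**2).sum())
        roi = np.argsort(r_map)[-n_roi_voxels:]        # most correlated voxels
        held_out[s] = activation_maps[s, roi].mean()   # readout for the held-out subject
    # correlate the held-out ROI averages with behaviour across all subjects
    return float(np.corrcoef(held_out, behaviour)[0, 1])
```

Because each subject's ROI-average value comes from a region defined without that subject, the final correlation estimate is not inflated by the selection step.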