Presentation on theme: "Basics of Experimental Design for fMRI: Block Designs Last Update: February 4, 2013 Last Course: Psychology 9223, W2013, Western."— Presentation transcript:
Basics of Experimental Design for fMRI: Block Designs http://www.fmri4newbies.com/ Last Update: February 4, 2013 Last Course: Psychology 9223, W2013, Western University Jody Culham Brain and Mind Institute Department of Psychology University of Western Ontario
"Attending a poster session at a recent meeting, I was reminded of the old adage 'To the man who has only a hammer, the whole world looks like a nail.' In this case, however, instead of a hammer we had a magnetic resonance imaging (MRI) machine and instead of nails we had a study. Many of the studies summarized in the posters did not seem to be designed to answer questions about the functioning of the brain; neither did they seem to bear on specific questions about the roles of particular brain regions. Rather, they could best be described as exploratory. People were asked to engage in some task while the activity in their brains was monitored, and this activity was then interpreted post hoc." -- Stephen M. Kosslyn (1999). If neuroimaging is the answer, what is the question? Phil Trans R Soc Lond B, 354, 1283-1294.
Brains Needed "...the single most critical piece of equipment is still the researcher's own brain. All the equipment in the world will not help us if we do not know how to use it properly, which requires more than just knowing how to operate it. Aristotle would not necessarily have been more profound had he owned a laptop and known how to program. What is badly needed now, with all these scanners whirring away, is an understanding of exactly what we are observing, and seeing, and measuring, and wondering about." -- Endel Tulving, interview in Cognitive Neuroscience (2002, Gazzaniga, Ivry & Mangun, Eds., NY: Norton, p. 323)
Toys Are Not Enough "Expensive equipment doesn't merit a lousy study." -- Louis Sokoloff
Localization for Localization's Sake Clinical Research –presurgical planning –understanding functional reorganization Basic Research –implicating well-established areas in cognitive tasks Danckert et al., 2004, Neuropsychologia
Danger Zone: Reverse Inference "[Mitt Romney's] still photos prompted a significant amount of activity in the amygdala, indicating voter anxiety…" Iacoboni et al., 2007, New York Times Op-Ed, "This is Your Brain on Politics"
Danger Zone: Reverse Inference "The Romney amygdala activation might indicate anxiety, or any number of other feelings that are associated with the amygdala -- anger, happiness or even sexual excitement" Martha Farah, Neuroethics and Law Blog
Why You Shouldn't Use fMRI It's the most expensive approach If you're interested in behavior, study behavior EEG/ERP/MEG have better temporal resolution TMS and neuropsychology speak more directly to causality –fMRI activation may be epiphenomenal neurophysiology/ECoG give more direct access to neural processing Epiphenomena Huettel et al. fMRI
Behaviorism (circa 1950s) Stimulus Black Box Response
Cognitive Science (circa 1980s) Stimulus Response Perception Attention Recognition Memory Decision Making Too many empty boxes?
The Future of fMRI? Map of semantic space derived from fMRI Gallant Lab Networks
What Can fMRI Add? explicit testing of models derived from other approaches inform and constrain theories of cognition whole brain coverage that can constrain or direct data from other approaches (neurophysiology, ERPs, ECoG) investigation of neural mechanisms of elaborated human functions (language, math, tool use) correlations between brain and behavior help to understand clinical disorders or development look at coding and connections between brain regions
What have we learned about the face area? The face area is activated: when faces are perceived or imagined (correlation between brain and behavior); for stimuli at the fovea (cues to brain organization); by circular patterns (cues/constraints for modelling); in certain areas of the monkey brain (cues to brain evolution); for other categories of objects that subjects have extensive experience with (debate regarding nature/nurture); to some degree by other categories of objects (debate regarding distributed vs. modular coding in the brain). The fusiform face area may be impaired: in some but not all patients who have problems recognizing faces; in people with autism (understanding of brain disorders)
So you want to do an fMRI study? CONCLUSION: Unless you are Bill Gates, a thought experiment is much more efficient! Average cost of performing a thought experiment: Your Salary Typical cost of performing an fMRI experiment:
Thought Experiments What do you hope to find? What would that tell you about the cognitive process involved? Would it add anything to what is already known from other techniques? Could the same question be asked more easily, more cheaply or better with other techniques? What would be the alternative outcomes (and/or null hypothesis)? Or is there not really any plausible alternative (in which case the experiment may not be worth doing)? If the alternative outcome occurred, would the study still be interesting? If the alternative outcome is not interesting, is the hoped-for outcome likely enough to justify the attempt? What would the headline be if it worked? Is it sexy enough to warrant the time, funding and effort? "Ideas are cheap." -- Jody's former supervisor, Jane Raymond Good experimenters generate many ideas and ensure that only the fittest survive What are the possible confounds? Can you control for those confounds? Has the experiment already been done? "A year of research can save you an hour on PubMed!"
Three Stages of an Experiment Sledgehammer Approach brute force experiment powerful stimulus don't try to control for everything run a couple of subjects -- see if it looks promising if it doesn't look great, tweak the stimulus or task try to be a subject yourself so you can notice any problems with stimuli or subject strategies Real Experiment at some point, you have to stop changing things and collect enough subjects run with the same conditions to publish it incorporate appropriate control conditions random effects analysis requires at least 10 subjects can run all subjects in one or two days pro: minimize setup and variability con: bad magnet day means a lot of wasted time Whipped Cream after the real experiment works, then think about a whipped cream version going straight to whipped cream is a huge endeavor, especially if you're new to imaging mixed metaphor: never sacrifice the meat & potatoes to get the gravy ("never sacrifice the hot chocolate to get the whipped cream" doesn't have quite the same punch)
Testing Patients fMRI is the art of the barely possible neuropsychology is the art of the barely possible combining fMRI and neuropsychology can be very valuable BUT it's (art of the barely possible)^2 If you want to test a paradigm in patients or special groups (either single cases or group studies), I recommend developing a robust paradigm in control subjects first It's generally a bad idea to use patients for pilot testing
Mental Chronometry use reaction times to infer cognitive processes fundamental tool for behavioral experiments in cognitive science F. C. Donders Dutch physiologist 1818-1889
Classic Example T1: Simple Reaction Time Hit button when you see a light (Detect Stimulus, Press Button) T2: Discrimination Reaction Time Hit button when light is green but not red (Detect Stimulus, Discriminate Color, Press Button) T3: Choice Reaction Time Hit left button when light is green and right button when light is red (Detect Stimulus, Discriminate Color, Choose Button, Press Button)
Subtraction Logic T2 (Detect Stimulus + Discriminate Color + Press Button) - T1 (Detect Stimulus + Press Button) = Discriminate Color
Subtraction Logic T3 (Detect Stimulus + Discriminate Color + Choose Button + Press Button) - T2 (Detect Stimulus + Discriminate Color + Press Button) = Choose Button
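A minimal Python sketch of Donders-style subtraction, under the assumption of pure insertion. The reaction times are invented for illustration only, and the names `reaction_times_ms` and `subtract_stages` are hypothetical.

```python
# Hypothetical mean reaction times in milliseconds (illustrative values only).
reaction_times_ms = {
    "T1_simple": 220.0,          # detect stimulus + press button
    "T2_discrimination": 290.0,  # T1 stages + discriminate color
    "T3_choice": 350.0,          # T2 stages + choose button
}


def subtract_stages(rt):
    """Estimate stage durations by subtracting successive task times.

    Assumes pure insertion: adding a stage leaves the other stages unchanged.
    """
    return {
        "discriminate_color": rt["T2_discrimination"] - rt["T1_simple"],
        "choose_button": rt["T3_choice"] - rt["T2_discrimination"],
    }


stage_durations = subtract_stages(reaction_times_ms)
for stage, duration in stage_durations.items():
    print(f"{stage}: {duration:.0f} ms")
```

If inserting a stage changes how long the other stages take, these differences no longer isolate a single process.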
Limitations of Subtraction Logic Assumption of pure insertion You can insert a component process into a task without disrupting the other components Widely criticized
Top Ten Things Sex and Brain Imaging Have in Common 10. It's not how big the region is, it's what you do with it. 9. Both involve heavy PETting. 8. It's important to select regions of interest. 7. Experts agree that timing is critical. 6. Both require correction for motion. 5. Experimentation is everything. 4. You often can't get access when you need it. 3. You always hope for multiple activations. 2. Both make a lot of noise. 1. Both are better when the assumption of pure insertion is met. Source: students in the Dartmouth McDonnell-Pew Summer Institute Now you should get this joke!
Subtraction Logic: Brain Imaging Example Hypothesis (circa early 1990s): Some areas of the brain are specialized for perceiving objects Simplest design: Compare pictures of objects vs. a control stimulus that is not an object seeing pictures like [object image] minus seeing pictures like [control image] = object perception Malach et al., 1995, PNAS
Other Differences Is subtraction logic valid here? What else could differ between objects and textures? Objects > Textures object shapes irregular shapes familiarity –namability visual features (e.g., brightness, contrast, etc.) actability attention-grabbing
Other Subtractions Lateral Occipital Complex Visual Cortex (V1) Malach et al., 1995, PNAS > > > Grill-Spector et al., 1998, Neuron Kourtzi & Kanwisher, 2000, J Neurosci
Dealing with Attentional Confounds fMRI data seem highly susceptible to the amount of attention drawn to the stimulus or devoted to the task. How can you ensure that activation is not simply due to an attentional confound? Add an attentional requirement to all stimuli or tasks. Example: Add a "one-back" task subject must hit a button whenever a stimulus repeats the repetition detection is much harder for the scrambled shapes, so any activation for the intact shapes cannot be due only to attention Other common confounds that reviewers love to hate: eye movements motor movements Time
Change only one thing between conditions! As in Donders' method, in functional imaging studies, two paired conditions should differ by the inclusion/exclusion of a single mental process How do we control the mental operations that subjects carry out in the scanner? i) Manipulate the stimulus works best for automatic mental processes ii) Manipulate the task works best for controlled mental processes DON'T DO BOTH AT ONCE!!! Source: Nancy Kanwisher
Beware the "Brain Localizer" Can have multiple comparisons/baselines Most common baseline = rest In some fields the baseline may be straightforward –For example, in vision studies, the baseline is often fixation on a point on an otherwise blank screen Be careful that you don't try to subtract too much Reaching – rest = visual stimulus + localization of stimulus + arm movement + somatosensory feedback + response planning + … "Our task activated the occipito-temporo-parieto-fronto-subcortical network" Another name for this is "the brain"!
What are people really doing during rest? Daydreaming, thinking –"Gawd this is boring. I wonder how long I've been in here. I went at 2:00. It must be about 3:30 now…" Remembering, imagining –"I gotta remember to pick up a carton of milk on the way home" Attending to bodily sensations –"I really have to pee!", "My back hurts", "Get me outta here!" Getting drowsy –"Zzzzzz… I only closed my eyes for a second… really!"
Problems with a Rest Baseline? For some tasks (e.g., memory studies), rest is a poor, uncontrolled baseline –memory structures (e.g., medial temporal lobes) may be DEactivated in a task compared to rest To get a non-memory baseline, some memory researchers put a low-memory task in the baseline condition –e.g., hearing numbers and categorizing them as even or odd Parahippocampal Cortex Stark et al., 2001, PNAS
Default Mode Network red/yellow = areas that tend to be activated during tasks task > resting baseline blue/green = areas that tend to be deactivated during tasks task < resting baseline Fox and Raichle, 2007, Nat. Rev. Neurosci.
Interpreting Activations vs. Deactivations More activation for blue than yellow More deactivation for yellow than blue A rest baseline is needed to discriminate between these two possibilities If negative betas don't make sense for your theory and you included a rest baseline, you can eliminate them with a conjunction analysis + yellow - blue + yellow + blue AND Rest baseline Stimulus/Task Onset TIME fMRI ACTIVATION (% BSC)
Is concurrent behavioral data necessary? "Ideally, a concurrent, observable and measurable behavioral response, such as a yes or no bar-press response, measuring accuracy or reaction time, should verify task performance." -- Mark Cohen & Susan Bookheimer, TINS, 1994 "I wonder whether PET research so far has taken the methods of experimental psychology too seriously. In standard psychology we need to have the subject do some task with an externalizable yes-or-no answer so that we have some reaction times and error rates to analyze – those are our only data. But with neuroimaging you're looking at the brain directly so you literally don't need the button press… I wonder whether we can be more clever in figuring out how to get subjects to think certain kinds of thoughts silently, without forcing them to do some arbitrary classification task as well. I suspect that when you have people do some artificial task and look at their brains, the strongest activity you'll see is in the parts of the brain that are responsible for doing artificial tasks." -- Steve Pinker, interview in the Journal of Cognitive Neuroscience, 1994 Source: Nancy Kanwisher
Parameters for Neuroimaging You decide: number of slices slice orientation slice thickness in-plane resolution (field of view and matrix size) volume acquisition time (usually = TR) length of a run number of runs duration and sequence of epochs within each run counterbalancing within or between subjects Your physicist can help you decide: pulse sequence (e.g., gradient echo vs. spin echo) k-space sampling (e.g., echo-planar vs. spiral imaging) TR, TE, flip angle, etc.
Tradeoffs "fMRI is like trying to assemble a ship in a bottle – every which way you try to move, you encounter a constraint" -- Mel Goodale Number of slices vs. volume acquisition time the more slices you take, the longer you need to acquire them e.g., 30 slices in 2 sec vs. 45 slices in 3 sec Number of slices vs. in-plane resolution the higher your in-plane resolution, the fewer slices you can acquire in a constant volume acquisition time e.g., in 2 sec, 7 slices at 1.5 x 1.5 mm resolution (128 x 128 matrix) vs. 28 slices at 3 mm x 3 mm resolution (64 x 64 matrix)
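The slice/resolution tradeoff can be sketched as a simple budget calculation in Python. This assumes, as a simplification, that the time to acquire one slice scales with the number of in-plane voxels (matrix size squared), calibrated on the slide's example of 28 slices at 64 x 64 in 2 sec. Real sequences will deviate from this, and the function name `max_slices` is hypothetical.

```python
import math


def max_slices(volume_time_s, matrix_size,
               ref_slices=28, ref_time_s=2.0, ref_matrix=64):
    """Rough number of slices that fit into one volume acquisition.

    Simplifying assumption: time per slice is proportional to matrix_size ** 2.
    The reference point (28 slices, 64 x 64, 2 s) is taken from the slide.
    """
    ref_time_per_slice = ref_time_s / ref_slices
    time_per_slice = ref_time_per_slice * (matrix_size / ref_matrix) ** 2
    # Small epsilon guards against floating-point results such as 6.999999.
    return math.floor(volume_time_s / time_per_slice + 1e-9)


for matrix in (64, 128):
    print(f"{matrix} x {matrix} matrix in 2 s: {max_slices(2.0, matrix)} slices")
```

Under this assumption, doubling the matrix size quadruples the time per slice, which matches the slide's drop from 28 slices to 7.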
More Power to Ya! Statistical Power the probability of rejecting the null hypothesis when it is actually false "if there's an effect, how likely are you to find it?" Effect size bigger effects, more power e.g., LO localizer (intact vs. scrambled objects) -- 1 run is usually enough looking for activation during imagery of objects might require many more runs Sample size larger n, more power more subjects longer runs more runs per subject Signal:Noise Ratio better SNR, more power higher magnetic field multi-channel coils fewer artifacts (physical noise, physiological noise)
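To make the effect-size and sample-size points concrete, here is a rough Python sketch using a normal approximation for a two-sided one-sample test. It is not an fMRI-specific power analysis: the function `approx_power` and the effect sizes are hypothetical, and real data call for a t-test or a simulation.

```python
from statistics import NormalDist


def approx_power(effect_size_d, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size_d is the standardized effect (mean / standard deviation);
    n is the number of independent observations (e.g., subjects).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size_d * n ** 0.5
    # Probability of landing in either rejection region under the alternative.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


for d in (0.5, 0.8):
    for n in (10, 20):
        print(f"d = {d}, n = {n}: power ~ {approx_power(d, n):.2f}")
```

In this approximation, power rises with both the effect size and the sample size, as on the slide.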
Put your conditions in the same run! As far as possible, put the two conditions you want to compare within the same run. Why? subjects get drowsy and bored magnet may have different amounts of noise from one run to another (e.g., spike) some processing steps (e.g., z-normalization) may affect the statistics differently between runs Common flawed logic: Run1: A – baseline Run2: B – baseline "A – 0 was significant, B – 0 was not, so Area X is activated by A more than B" By this logic, there is higher activation for Places than Faces in the data to the left. Do you agree? Faces Places Error bars = 95% confidence limits BOLD Activation (%) Bottom line: If you want to compare A vs. B, compare A vs. B! Simple, eh?
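A small numerical sketch in Python of why "A is significant, B is not" does not imply "A > B". The means and standard errors are invented, the estimates are treated as independent, and a normal approximation is used. All names here are hypothetical.

```python
from statistics import NormalDist


def two_sided_p(estimate, standard_error):
    """Two-sided p-value for estimate / standard_error (normal approximation)."""
    z = abs(estimate / standard_error)
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical % BOLD signal change versus baseline (illustrative values only).
mean_a, se_a = 0.50, 0.20
mean_b, se_b = 0.30, 0.20

p_a = two_sided_p(mean_a, se_a)        # A vs. baseline
p_b = two_sided_p(mean_b, se_b)        # B vs. baseline

# Direct comparison of A vs. B, assuming independent estimates.
se_diff = (se_a ** 2 + se_b ** 2) ** 0.5
p_diff = two_sided_p(mean_a - mean_b, se_diff)

print(f"A vs. baseline: p = {p_a:.3f}")
print(f"B vs. baseline: p = {p_b:.3f}")
print(f"A vs. B:        p = {p_diff:.3f}")
```

With these numbers, A differs reliably from baseline and B does not, yet the direct A vs. B contrast is far from significant.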
Run Duration How long should a run be? Short enough that the subject can remain comfortable without moving or swallowing Long enough that you're not wasting a lot of time restarting the scanner My ideal is ~6 ± 2 minutes
Simple Example Experiment: LO Localizer Intact Objects Scrambled Objects Blank Screen TIME One volume (12 slices) every 2 seconds for 272 seconds (4 minutes, 32 seconds) Condition changes every 16 seconds (8 volumes) Lateral Occipital Complex responds when subject views objects (Unit: Volumes)
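The timing of this localizer can be checked with a few lines of Python. The run length, volume time and epoch length come from the slide. The epoch order is NOT given in the transcript, so `epoch_order` is a hypothetical placeholder.

```python
TR_S = 2                 # one volume every 2 seconds (from the slide)
RUN_DURATION_S = 272     # total run duration (from the slide)
EPOCH_DURATION_S = 16    # condition changes every 16 seconds (from the slide)

n_volumes = RUN_DURATION_S // TR_S
volumes_per_epoch = EPOCH_DURATION_S // TR_S
n_epochs = n_volumes // volumes_per_epoch
minutes, seconds = divmod(RUN_DURATION_S, 60)

# Hypothetical epoch order (an assumption, not the slide's actual sequence).
epoch_order = ["blank"] + ["intact", "scrambled"] * 8


def expand_epochs(order, vols_per_epoch):
    """Return one condition label per volume."""
    return [label for label in order for _ in range(vols_per_epoch)]


volume_labels = expand_epochs(epoch_order, volumes_per_epoch)

print(f"{n_volumes} volumes, {n_epochs} epochs of {volumes_per_epoch} volumes")
print(f"run length: {minutes} min {seconds} sec")
```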
Options for Block Design Sequences That design was only one of many possibilities. Let's consider some of the other options and the pros and cons of each. Let's assume we want to have an LO localizer We need at least two conditions (intact and scrambled objects), but we could consider including a third condition Let's assume that in all cases we need 2 sec/volume to cover the range of slices we require Let's also assume a total run duration of 136 volumes (x 2 sec = 272 sec = 4 min, 32 sec) We'll start with 2 condition designs…
Convolution of Single Trials Neuronal Activity Haemodynamic Function BOLD Signal Time Slide from Matt Brown
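A minimal pure-Python sketch of the convolution step: a boxcar of neuronal activity is convolved with a haemodynamic response function (HRF) to predict the BOLD time course. The double-gamma shape used here (peak parameter 6, undershoot parameter 16, ratio 1/6) is a commonly used canonical form and is an assumption, not necessarily the function shown on the slide. The 16 sec on/off boxcar is also just an example.

```python
import math

TR_S = 2.0  # seconds per volume


def double_gamma_hrf(t):
    """Assumed canonical-style HRF: gamma-shaped peak minus a smaller undershoot."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def convolve(signal, kernel):
    """Discrete causal convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j in range(min(i + 1, len(kernel))):
            out[i] += signal[i - j] * kernel[j]
    return out


# Sample the HRF every TR from 0 to 30 s and normalise it to unit sum.
hrf = [double_gamma_hrf(k * TR_S) for k in range(16)]
hrf_sum = sum(hrf)
hrf = [h / hrf_sum for h in hrf]

# Example boxcar: 8 volumes off, 8 volumes on, repeated, ending on "off".
boxcar = ([0.0] * 8 + [1.0] * 8) * 8 + [0.0] * 8
predicted = convolve(boxcar, hrf)

for volume in range(8, 24):
    print(f"volume {volume:3d}  raw = {boxcar[volume]:.0f}  "
          f"predicted = {predicted[volume]:+.3f}")
```

The predicted signal lags the raw boxcar by a few volumes, which is why short epochs leave too little time for the response to rise and fall fully.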
Block Design: Short Equal Epochs Alternation every 4 sec (2 volumes) signal amplitude is weakened by the HRF because the signal doesn't have enough time to return to baseline not too far from the range of breathing frequency (every 4-10 sec), which could lead to respiratory artifacts if design is a task manipulation, subject is constantly changing tasks, gets confused HRF-convolved time course raw time course Time (2 s volumes)
Block Design: Short Unequal Epochs 4 sec stimuli (2 volumes) with 8 sec (4 volumes) baseline we've gained back most of the HRF-based amplitude loss but the other problems still remain now we're spending most of our time sampling the baseline HRF-convolved time course raw time course Time (2 s volumes)
Block Design: Long Epochs The other extreme… Alternation Every 68 sec (34 volumes) more noise at low frequencies linear trend confound subject will get bored very few repetitions – hard to do eyeball test of significance HRF- convolved time course raw time course Time (2 s volumes)
Find the Sweet Spots Respiration every 4-10 sec (0.3 Hz) moving chest distorts susceptibility Cardiac Cycle every ~1 sec (0.9 Hz) pulsing motion, blood changes Solutions gating avoiding paradigms at those frequencies You want your paradigm frequency to be in a sweet spot away from the noise
Block Design: Medium Epochs Every 16 sec (8 volumes) allows enough time for signal to oscillate fully not near artifact frequencies enough repetitions to see cycles by eye a reasonable time for subjects to keep doing the same thing HRF- convolved time course raw time course Time (2 s volumes)
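The "sweet spot" argument can be checked with simple arithmetic in Python. For a two-condition alternation, one full cycle lasts two epochs, so the paradigm frequency is 1 / (2 x epoch duration). The respiration band below is derived from the "every 4-10 sec" figure on the earlier slide, and the function names are hypothetical.

```python
def paradigm_frequency_hz(epoch_duration_s, n_conditions=2):
    """Fundamental frequency of a design that cycles through n_conditions epochs."""
    return 1.0 / (epoch_duration_s * n_conditions)


def in_band(frequency_hz, band):
    low, high = band
    return low <= frequency_hz <= high


# One breath every 4-10 s corresponds to 0.1-0.25 Hz.
RESPIRATION_BAND_HZ = (1 / 10, 1 / 4)

frequencies = {epoch: paradigm_frequency_hz(epoch) for epoch in (4, 16, 68)}
for epoch, freq in frequencies.items():
    overlap = in_band(freq, RESPIRATION_BAND_HZ)
    print(f"{epoch:2d} s epochs: {freq:.4f} Hz, in respiration band: {overlap}")
```

The 4 sec alternation falls inside the respiration band. The 16 sec and 68 sec designs are well below it, though the slowest design moves toward the low-frequency noise and linear drift the slide on long epochs warns about.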
Block Design: Other Niceties If you start and end with a baseline condition, you're less likely to lose information with linear trend removal and you can use the last epoch in an event-related average truncated too soon Time (2 s volumes)
Block Design Sequences: Three Conditions Suppose you want to add a third condition to act as a more neutral baseline For example, if you wanted to identify visual areas as well as object-selective areas, you could include resting fixation as the baseline. That would allow two subtractions –scrambled - fixation -> visual areas –intact - scrambled -> object-selective areas That would also help you discriminate differences in activations from differences in deactivations Now the options increase. For simplicity, let's keep the epoch duration at 16 sec.
Block Design: Repeating Sequence We could just order the epochs in a repeating sequence… Problem: There might be order effects Solution: Counterbalance with another order Problem: If you lose a run (e.g., to head motion), you lose counterbalancing
Block Design: Random Sequence We could make multiple runs with the order of conditions randomized… Problem: Randomization can be flukey Problem: To avoid flukiness, you'd want to have different randomization for different runs and different subjects, but then you're going to spend ages defining protocols for analysis
Block Design: Regular Baseline We could have a fixation baseline between all stimulus conditions (either with regular or random order) Benefit: With event-related averaging, this regular baseline design provides nice clear time courses, even for a block design Problem: You're spending half of your scan time collecting the condition you care the least about
But I have 4 conditions to compare! Here are a couple of options. A. Orderly progression Pro: Simple Con: May be some confounds (e.g., linear trend if you predict green&blue > pink&yellow) B. Random order in each run Pro: order effects should average out Con: pain to make various protocols, no possibility to average all data into one time course, many frequencies involved
C. Kanwisher lab clustered design sets of four main condition epochs separated by baseline epochs each main condition appears at each location in sequence of four two counterbalanced orders (1st half of first order same as 2nd half of second order and vice versa) – can even rearrange data from 2nd order to allow averaging with 1st order Pro: spends most of your n on key conditions, provides more repetitions Con: not great for event-related averaging because orders are not balanced (e.g., in top order, blue is preceded by the baseline 1X, by green 2X, by yellow 1X and by pink 0X). As you can imagine, the more conditions you try to shove in a run, the thornier ordering issues are and the fewer n you have for each condition.
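A Python sketch of how one might build a clustered design and check its balance. The cyclic Latin square below is a hypothetical stand-in: it puts each condition at each position within a set of four, but it is NOT the order shown on the slide. The colour labels are borrowed from the slide, and all function names are assumptions.

```python
from collections import Counter

conditions = ["green", "blue", "yellow", "pink"]


def cyclic_latin_square(items):
    """Each item appears once in every row and once in every position."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def run_sequence(sets, baseline="baseline"):
    """Separate the sets of main-condition epochs with baseline epochs."""
    sequence = [baseline]
    for epoch_set in sets:
        sequence.extend(epoch_set)
        sequence.append(baseline)
    return sequence


def predecessor_counts(sequence, target):
    """Count which epoch immediately precedes each occurrence of target."""
    return Counter(prev for prev, cur in zip(sequence, sequence[1:])
                   if cur == target)


sets = cyclic_latin_square(conditions)
sequence = run_sequence(sets)
blue_predecessors = predecessor_counts(sequence, "blue")

print(" ".join(sequence))
print("blue is preceded by:", dict(blue_predecessors))
```

Even though position is balanced in this sketch, the predecessor counts for a single condition are uneven. This is the kind of imbalance the slide flags as a problem for event-related averaging.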
But I have 8 conditions to compare! Just don't. In my experience, any block design experiment with more than four conditions becomes unmanageable and incomprehensible Event-related designs might still be an option… stay tuned…
Prepare Well: Subjects recruit and screen your subjects well in advance –safety screening best to let them read through and self-screen beforehand so you don't get any embarrassing situations (e.g., discussions about IUDs, pregnancy) –eye glasses –handedness make sure your subjects know how to be good subjects –http://www.ssc.uwo.ca/psychology/culhamlab/Jody_web/Subject_Info/firsttime_subjects.htm make sure you and the subjects can contact each other in case of problems or delays if possible, be a subject yourself to see what the pitfalls and strategies might be remember to bring: –subject fees (and receipt book) –consent and screening forms
Prepare Well: Experiments test all equipment in advance test software under realistic circumstances (same computer, timing and duration as fMRI experiments) make sure you know all of the parameters the technician will want (e.g., pulse sequence, timing, slices and orientation) at RRI, prepare a spreadsheet with mouseclicks and stopwatch times check the timing as you go, especially at the beginning of an experiment keep accurate log notes as you go check with the technician regularly to ensure that your log notes record the same run number as the scanner attach your timing spreadsheet to the log notes for that subject write down any problems that arose (e.g., subject missed second last trial; subject drowsy through first ~third of run)
Prepare Well: Postprocessing move data to secure location as soon as possible save one backup in the rawest form possible –if advances in reconstruction occur, you will need unprocessed data to use them save other backups at natural points (e.g., backup and delete 2D data once you've made 3D data) –have redundancy –don't put all backups on the same CD/DVD or you're toast if one is damaged (CDs aren't forever like we once thought) save full projects to one DVD (or HD partition) once you're done so you can reload an entire project if you need to reanalyze keep a subject archive …
Dealing with Frustration Sign that used to be at the 1.5 T at MGH Murphy's law acts with particular vigour in fMR imaging: Number of pieces of equipment required in an fMRI experiment: ~50 Probability of any one piece of equipment working in a session: 95% Probability of everything working in a session: 0.95^50 = 7.7% Solution for a good imaging session = $4 million magnet + $3 roll of duct tape
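The arithmetic on the sign can be reproduced in a couple of lines of Python, assuming the pieces of equipment fail independently (an assumption implicit in multiplying the probabilities).

```python
n_components = 50   # pieces of equipment (from the slide)
p_each = 0.95       # probability that any one piece works (from the slide)

# Probability that all components work, assuming independent failures.
p_all = p_each ** n_components

print(f"Probability of everything working: {p_all:.1%}")
```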
How NOT to do an imaging experiment ask a stupid question –e.g., "I wonder what lights up for nose picking vs. rest" compare poorly-defined conditions that differ in many respects use a paradigm from another technique (e.g., cognitive psychology) without optimizing any of the timing for fMRI, e.g., 1 minute epochs be naively optimistic –go straight for the whipped cream experiment without starting with a sledgehammer experiment never look at raw data, time courses or individual data, just plunk it all into one big stat model and look at what comes out publish a long list of activated foci in every possible comparison don't use any statistical corrections write a long discussion on why your task activates the subcortico-occipito-parieto-temporo-frontal network