Presentation on theme: "Preprocessing for EEG & MEG"— Presentation transcript:
1 Preprocessing for EEG & MEG. Tom Schofield & Ed Roberts.
2 Data acquisition. In this experiment, the subject is presented with letters on a screen: mostly Xs, with the occasional O. The EEG is recorded continuously from the subject by the digitisation computer. It is amplified during recording, and often filtered to remove very high and very low frequency activity. The stimulus PC presents the stimuli; as it presents each one, it sends a pulse to the recording computer through a parallel-port connection, telling it that a particular stimulus has been presented at a particular time.
3 Data acquisition. Using Cogent to generate a marker pulse:

drawpict(2);
outportb(888,2);
tport = time;
waituntil(tport+100);
outportb(888,0);
logstring(['displayed O at time ' num2str(time)]);

This marker pulse is recorded in the data, marking when each stimulus is presented and what type of stimulus it is. This is how you generate it with Cogent: in this example, the 'drawpict' command tells the stimulus PC to display picture 2, and the 'outportb' command tells the recording computer to make a record in the data stream that picture 2 was displayed at this time.
4 Two crucial steps. Activity caused by your stimulus (the ERP) is 'hidden' within the continuous EEG stream: the ERP is your 'signal', and everything else in the EEG is 'noise'. Event-related activity should not be random; we assume everything else is. Epoching: cutting the data into chunks referenced to stimulus presentation. Averaging: calculating the mean value for each time-point across all epochs. The EEG is recorded continuously; if the activity generated in response to the X or the O is the 'signal' you want, then EEG activity unrelated to these stimuli can be characterised as 'noise'. If we assume this noise occurs randomly, then by epoching our data (splitting it into single trials referenced to stimulus presentation) and then averaging these epochs together, point by point, the noise should cancel out and the stimulus-related activity should become visible.
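The epoch-then-average logic above can be sketched in a few lines of Python/NumPy. This is an illustration only, not SPM code: the sampling rate, trial count, noise level and the Gaussian "ERP" shape are all invented for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 250                                      # assumed sampling rate (Hz)
n_trials = 80
onsets = np.arange(n_trials) * fs + fs        # one stimulus per second
eeg = rng.normal(0, 5, onsets[-1] + fs)       # continuous 'noise' EEG

# Add a fixed ERP 'signal' at every stimulus onset
t = np.arange(int(0.6 * fs)) / fs
erp = 3 * np.exp(-((t - 0.3) ** 2) / 0.005)   # Gaussian 'P3-like' bump
for on in onsets:
    eeg[on:on + erp.size] += erp

# Epoching: cut out -100 ms to +600 ms around each onset
pre, post = int(0.1 * fs), int(0.6 * fs)
epochs = np.stack([eeg[on - pre:on + post] for on in onsets])

# Averaging: point-by-point mean across epochs; random noise cancels
avg = epochs.mean(axis=0)
```

After averaging the 80 trials, the bump near 300 ms stands out clearly from the residual noise, even though it is invisible in any single epoch.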
5 Extracting the ERP from the EEG. ERPs emerge from the EEG as you average trials together. Here's an example of this approach. On the left are single EEG epochs: four where an X was presented, two where an O was presented. All the epochs are different, with no obvious pattern. But when we start to average them together, things become clearer. On the top right is the result of averaging 80 X trials together; underneath is the result of averaging 20 O trials. As you can see, stimulus-specific waveforms emerge. The most striking difference between the two conditions here is the magnitude of the P3 component. If this difference is large enough, and there are enough trials, you'd probably find a statistically significant difference between the two trial types.
6 Overview. Preprocessing steps; preprocessing with SPM; what to be careful about; what you need to know about filtering.
7 First step. Raw data from EEG or MEG needs to be put into a format suitable for SPM to read, so in SPM you select 'Convert'. Tell SPM which type of system generated the data by selecting from the list that pops up; the systems used at the FIL are BDF for EEG and CTF for MEG. Select the raw data file, then the data template file, which contains information about the spatial locations of the electrode positions. You then tell SPM whether to read the whole file in: at the high sample rates used for MEG, MATLAB can only handle about 15 minutes' worth of recording, so if converting the whole thing crashes, you can try reading in half of it at a time. This creates a .mat and a .dat file: the .mat file contains information about the structure of the data, the .dat file the data itself. Creates: mydata.mat
8 Epoching. Epoching splits the data into single trials, each referenced to stimulus presentation. For each stimulus onset, the epoched trial starts at some user-specified pre-stimulus time and ends at some post-stimulus time, e.g. from 100 ms before to 600 ms after stimulus presentation. You have to be careful when specifying your epoch timing. The epoched data is usually baseline corrected: the mean value in the pre-stimulus period is subtracted from the whole trial. Essentially, you want your pre-stimulus interval to contain only noise; if any activity related to the previous stimulus is still around, it will be subtracted from the current epoch and will distort your data.
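Baseline correction as just described (subtract the mean of the pre-stimulus interval from the whole trial) is one line of arithmetic; the sketch below is illustrative, with a hypothetical sampling rate and a toy epoch.

```python
import numpy as np

fs = 250                         # assumed sampling rate (Hz)
pre = int(0.1 * fs)              # samples in the 100 ms pre-stimulus interval

def baseline_correct(epoch, n_pre):
    """Subtract the mean of the pre-stimulus interval from the whole trial."""
    return epoch - epoch[:n_pre].mean()

# Toy epoch: a constant 2 uV offset in the baseline, zeros afterwards
epoch = np.r_[np.full(pre, 2.0), np.zeros(150)]
corrected = baseline_correct(epoch, pre)
```

Note how any real activity lingering in the baseline window would be subtracted from the whole epoch, which is exactly the distortion the slide warns about.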
9 Epoching - SPM. In SPM you select 'Epoch'. You tell it how long post-stimulus you want your epoch to run, and how long you want your baseline period to be. Tell it which marker codes to look for in order to segregate the trial types. If you don't have marker codes in your data, you can read in a list of the times at which your events happened. You end up with a new .mat file with an 'e' attached to the front of it. Creates: e_mydata.mat
10 Downsampling. Nyquist theorem: the minimum digital sampling frequency must be greater than twice the maximum frequency in the analogue signal. Select 'Downsample' from the 'Other' menu. Recording MEG/EEG involves converting the analogue signal from the brain into a series of digital values; the EEG/MEG does not really consist of continuous data. Data are sampled at a rate specified before recording, e.g. 400 samples per second, so a typical EEG is a sequence of thousands of discrete values, one after the other. You have to sample at a high frequency to get a good-quality digital conversion of an analogue signal: the minimum sampling frequency needs to be greater than twice the maximum frequency of any analogue signal likely to be present in the EEG (the Nyquist frequency). In this figure, the top picture shows well-sampled data: each dot is a digital recording of the value of the wave at one time point. Underneath is an example of under-sampling: the sampling is too sparse to capture the nature of the analogue wave. With a sampling frequency that is too low, a spurious signal of lower frequency is generated, a phenomenon known as aliasing. This means you will probably end up sampling your data at a higher resolution than you actually need to capture your components of interest. Once your data is safely digitised, you can downsample it so that it takes up less room and is quicker to work with.
11 Downsample. SPM uses the MATLAB function RESAMPLE to downsample your data. You select 'Downsample' from the 'Other' menu and tell it the new sampling rate you want. It creates a new .mat file with a 'd' attached to the front of it. Creates: de_mydata.mat
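SPM's downsampling uses MATLAB's RESAMPLE; a rough Python analogue is scipy's resample_poly, which low-pass filters before decimating so that aliasing is avoided. The rates below are made up for the example.

```python
import numpy as np
from scipy.signal import resample_poly

fs_old, fs_new = 1000, 250            # e.g. acquired at 1 kHz, keep 250 Hz
t = np.arange(0, 2, 1 / fs_old)
x = np.sin(2 * np.pi * 10 * t)        # 10 Hz signal, well below the new Nyquist

# resample_poly applies an anti-aliasing filter, then decimates
y = resample_poly(x, fs_new, fs_old)
```

A 10 Hz component is far below the new Nyquist frequency of 125 Hz, so it survives the rate change with its amplitude intact.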
12 Artefact rejection. Blinks, eye-movements, muscle activity, EKG, skin potentials, alpha waves. Some epochs will be contaminated by activity that does not come from brain processes. The major artefacts are eye-blinks, eye-movements, muscle activity, heart activity, changes in skin impedance and alpha waves; all of these will contaminate your data to some extent. Within each eye there is an electrical gradient, positive at the front. Movement of the eyelid across the eye modulates the conduction of this electrical potential to the surrounding regions. Moving the eye itself changes the electrical gradients observable at the scalp, making them more positive in the direction the eye has moved towards; it also moves the stimulus across the retina, generating unwanted visual ERPs. Muscle activity is characterised by bursts of high-frequency activity in the EEG. The EKG reflects the activity of the heart and can intermittently appear in your data. Skin potentials are only really a problem for EEG; they reflect changes in skin impedance, which could happen if the subject starts to sweat as the experiment progresses, and look like slow drifts. If they drift far enough they can saturate the amplifier, producing a flat 'blocking' line. Alpha waves are actually generated by the brain: slow repetitive waves of about 10 Hz, unlikely to be of interest if you're looking at cognitive processes.
13 Artefact rejection. Are these artefacts going to be a big problem if they appear in our data? The short answer is that we don't really have to worry about muscle activity: a low-pass filter will deal with muscle artefact. Heart artefact is in a similar frequency range to ERP components, so it can't be dealt with by filtering; luckily, it usually isn't a worry, as it shouldn't appear systematically (it will come and go) and will probably just increase overall noise levels. The same holds for alpha waves, unless the individual is particularly sleepy. Skin potentials only appear with EEG; again, they usually aren't a worry, as they are rare, slow and random and will just add a little noise. This is an important point: assuming you have a high number of trials, and your artefact is not systematically stimulus-linked, a simple averaging procedure is surprisingly good at eliminating artefact. The biggest problems are usually eye-blinks and eye-movements; in a visual paradigm, for example, the chances are that these will be stimulus-linked. How do we deal with them? One way is to reject epochs you think are contaminated by artefact. The problem lies in recognising when things like eye-blinks have occurred. Luckily, most artefactual activity differs in pattern from brain-related activity, and is usually of larger amplitude. This means the simple method of thresholding can be used, rejecting epochs in which the EEG shows an amplitude above a certain amount. Alternatively, you can use eye-tracking equipment to record eye events as they occur during the experiment, and then mark those epochs as contaminated.
14 Artefact rejection - SPM. There are two ways to perform artefact rejection with SPM: thresholding, which rejects every trial containing an absolute value exceeding a certain amount, or giving SPM a list of trials you know to be contaminated. You select 'artefact', click yes to supply a list of contaminated trials or no just to threshold all trials, then click yes to 'threshold channels?' and type in the threshold you want. To decide on a threshold, try looking at your epoched data and picking a value that seems sensible. This marks as bad all epochs containing supra-threshold activity and generates a new .mat file with an 'a' at the front of it. Unfortunately, thresholding isn't a very sensitive way of detecting artefact. If it turns out not to be good enough for you, you might have to write your own MATLAB script to run on your data outside SPM; you might have it look for peak-to-peak differences within an epoch above a certain value, for example, then read the resulting list into SPM. Or, if you're really worried about alpha waves contaminating your data, you could have MATLAB look at each epoch in turn, perform a Fourier transform on it to get the frequency profile of the epoch, and tell SPM to reject the whole epoch if the activity at 10 Hz exceeds a certain amount. Creates: ade_mydata.mat
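The three detection rules just mentioned (absolute threshold, peak-to-peak difference, and 10 Hz power from a Fourier transform) could be combined in a small script. The sketch below is a hypothetical stand-in for such a script, not SPM's implementation; the function name and thresholds are invented.

```python
import numpy as np

def find_bad_epochs(epochs, fs, abs_thresh=100.0, p2p_thresh=150.0,
                    alpha_thresh=None):
    """Return indices of epochs flagged as artefactual.

    abs_thresh  : absolute-amplitude threshold (e.g. in uV), as in SPM
    p2p_thresh  : peak-to-peak threshold within the epoch
    alpha_thresh: optional limit on spectral power near 10 Hz (alpha)
    """
    bad = set()
    for i, ep in enumerate(epochs):
        if np.abs(ep).max() > abs_thresh:
            bad.add(i)                          # supra-threshold activity
        if ep.max() - ep.min() > p2p_thresh:
            bad.add(i)                          # large peak-to-peak swing
        if alpha_thresh is not None:
            freqs = np.fft.rfftfreq(ep.size, 1 / fs)
            power = np.abs(np.fft.rfft(ep)) ** 2
            if power[np.abs(freqs - 10).argmin()] > alpha_thresh:
                bad.add(i)                      # too much alpha
    return sorted(bad)

# A list like this could then be read back into SPM as the 'bad trial' list
epochs = np.zeros((3, 250))
epochs[1, 100] = 200.0                          # one epoch with a big spike
bad = find_bad_epochs(epochs, fs=250)
```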
15 Artefact correction. Rejecting 'artefact' epochs costs you data; a simple artefact detection method will give a high rate of false-positive detections; rejecting only trials in which artefact occurs might bias your data; and high levels of artefact are associated with some populations. Alternative methods of 'artefact correction' exist. The main problem with the rejection approach is that if you throw out every trial that might possibly contain artefact, you end up with a smaller dataset, and the data that remains may come from an unrepresentative sample of trials. If you want to study children, patients, etc., you can expect a higher proportion of trials to be contaminated by artefact, and you'll probably be collecting less data than you would with controls anyway; if you reject all artefact epochs in these cases, you'll be lucky to have enough trials left to extract an ERP. In some cases, rather than throwing out epochs that you think contain artefact, it may be a better idea to try to correct for it instead.
16 Artefact correction - SPM. SPM uses a robust averaging procedure that weights each value according to how far it is from the median value for that timepoint: outliers are given less weight, and points close to the median are weighted 1. Artefact correction methods also attempt to detect artefact, but instead of simply rejecting the whole epoch, they estimate the relative size of the artefact and correct for it in the data. This leaves more trials and therefore a better signal-to-noise ratio; the problem is that these corrections themselves can cause significant distortion if they are incorrectly estimated. For each time point, the median value across all epochs is calculated, and each data point at that time point is weighted according to how far it is from the median. Those within a certain range are weighted 1; those further away are weighted lower, down to zero. The acceptable range varies according to how tightly the points are distributed about the median. The procedure is then re-run, using the new weighted values to recalculate the centre, and iterates until the weights become constant. In the picture you can see the result of this process: red points have been weighted 1, and points receive less weight the further they are from the centre. These weights are written into the .mat file, to be used later when you average your data. Because it uses median rather than mean values, the procedure is far more robust to outliers, which can have a disproportionate effect on mean values.
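The iterate-until-stable weighting scheme can be sketched as follows. This is a simplified stand-in for SPM's robust averaging, not its actual algorithm: the specific weighting function, the spread estimate and the fixed iteration count are all assumptions made for illustration.

```python
import numpy as np

def robust_average(epochs, k=3.0, n_iter=5):
    """Down-weight values far from the per-timepoint centre, then average.

    k plays the role of the 'strictness' parameter (3 is SPM's default);
    the clipped-linear weight function here is a simplification.
    """
    center = np.median(epochs, axis=0)              # robust starting centre
    for _ in range(n_iter):
        spread = np.median(np.abs(epochs - center), axis=0) + 1e-12
        d = np.abs(epochs - center) / (k * spread)  # scaled distance
        w = np.clip(2.0 - d, 0.0, 1.0)              # 1 near centre, 0 far away
        center = (w * epochs).sum(axis=0) / (w.sum(axis=0) + 1e-12)
    return center

# One wildly artefactual epoch among nine clean ones: the plain mean is
# dragged to 10, while the robust average stays at 0
epochs = np.vstack([np.zeros((9, 10)), np.full((1, 10), 100.0)])
robust = robust_average(epochs)
plain = epochs.mean(axis=0)
```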
17 Artefact correction - SPM. Normal average vs robust weighted average. Here is an average waveform derived from the data on the previous slide. In red is what we get if we simply average the values together without trying to identify outliers; in blue is what we get if we use the weights for each data point in our average. As you can see, several small artefactual peaks are eliminated by the procedure.
18 Robust averaging - SPM. To perform this, you select 'Artefact' and click 'No' when asked whether you want to read in your own artefact list. Then select robust averaging, choose how strict you want the weighting to be (3 is the default), and select the default smoothing option. This creates a new .mat file with an 'a' in front of it. Creates: ade_mydata.mat
19 Artefact correction. ICA; linear trend detection; electro-oculogram; 'no-stim' trials to correct for overlapping waveforms. These other popular methods of artefact correction aren't used by SPM, so I won't say too much about them. ICA for eye-blinks separates your data into what it estimates are the components driving it; you visually inspect each component, pick the one you think represents eye-blinks, and rebuild your data without it. Linear trend detection looks for slow drifts in your data and tries to remove them. The EOG technique records activity from electrodes placed under the eyes, which is assumed to give a good estimate of eye movements and blinks; a fraction of this value is then subtracted from the EEG to compensate for blinks. Finally, though not strictly artefact: if you're looking at long-latency waves, you might end up with substantial overlap between trials. If your design includes trials with no stimuli, you can average these 'no-stim' trials together; assuming they contain only activity from the preceding stimulus, you get an 'overlap' wave that you can subtract from your other epochs.
20 Artefact avoidance. Blinking: avoid contact lenses; build 'blink breaks' into your paradigm; if the subject is blinking too much, tell them. EMG: ask subjects to relax, shift position, open the mouth slightly. Alpha waves: ask the subject to get a decent night's sleep beforehand; have more runs of shorter length, and talk to the subject in between; jitter the ISI, as alpha waves can become entrained to the stimulus. In practice there are problems with both artefact rejection and artefact correction, so it's always best to try to minimise artefact in the first place. Artefact from eye-movements becomes less important over areas further away from the eyes; with an auditory paradigm you might be justified in ignoring these artefacts.
21 Averaging. If R is the noise on a single trial and N the number of trials, the noise in an average of N trials is (1/√N) × R. More trials means less noise: to double the S/N ratio you need 4 trials; to quadruple it, 16. Once you've rejected or corrected the artefact in your data, you need to extract your ERP. Generally you'll perform a simple average of each point, or use the robust-averaging weights to perform a weighted average. We assume that the ERP 'signal' is the same on all trials and unaffected by the averaging process, and, as we said earlier, that the noise is random. On the left are 8 single-trial EEG epochs; on the right is what happens as we average the trials together. As each trial is added to the average, the resulting waveform becomes more consistent. The S/N ratio increases only as the square root of the number of trials, so in practice you need a huge number of trials to extract your signal. As a general rule, it's always better to try to decrease sources of noise than to increase the number of trials.
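The (1/√N) × R rule is easy to verify numerically. The sketch below simulates pure-noise trials (with an arbitrary single-trial noise level) and measures the residual noise after averaging 4 and 16 trials.

```python
import numpy as np

rng = np.random.default_rng(1)
R = 10.0                                  # assumed single-trial noise std

def noise_after_averaging(n_trials, n_points=2000):
    """Std of the across-trial average of pure-noise epochs."""
    trials = rng.normal(0, R, (n_trials, n_points))
    return trials.mean(axis=0).std()

n4 = noise_after_averaging(4)             # theory predicts R/2
n16 = noise_after_averaging(16)           # theory predicts R/4
```

Averaging 4 trials halves the noise (doubling S/N), and 16 trials quarters it, matching the slide.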
22 Averaging. Averaging is simple: you just click 'average', then select the file you want to average. This gives you a new .mat file with an 'm' at the front of it. Creates: made_mydata.mat
23 Averaging. Assumes that only the EEG noise varies from trial to trial; but amplitude will vary, and latency will vary. Variable latency is usually a bigger problem than variable amplitude. Extracting ERPs from EEG in this manner relies on a few assumptions that aren't strictly true. Perhaps the biggest false assumption is that the ERP remains constant on each trial; we also assume that the noise in the EEG signal varies randomly across the experiment. Neither assumption is completely true. Any two ERPs elicited by the same stimuli will differ both in the peak amplitude of some components and in the latency of those components. Variations in peak amplitude can be quite large, but in the end you'll still get an ERP that accurately reflects the average amplitude of each component. Variations in latency have a more profound effect on the averaged waveform.
24 Averaging: effects of variance. Averaging ERPs that vary in latency can give you a severely unrepresentative waveform: in general, the greater the variation in onset, the flatter and more spread out the resulting waveform. At the bottom left, the latency varies a little between trials, and the average wave is much squatter than any of the single-trial waves. With greater latency differences, as at the top left, the problem gets much worse. Notice also that the onset and offset of the mean waveform are not the average onset and offset times of the underlying events: the onset of the average wave is the earliest onset from any epoch, and the offset is the latest offset. The problem is greatest when dealing with multiphasic waveforms that differ in phase; in the worst case the resulting average may be flat, all information lost in the averaging procedure, as negative phase on some trials occurs at the same peri-stimulus time as positive phase on others. Latency variation can be a significant problem.
25 Latency variation solutions. Don't use a peak-amplitude measure. The simplest solution to this problem is to measure your waveforms differently: peak amplitude is a poor measure in this sort of situation. As an alternative, you could measure the area under the curves; in both examples here, the area under the average curve equals the average area under the single-trial curves. This won't give you a latency measurement, however. If you draw a line that bisects the average curve, with 50% of the area on either side of the line (shown in red in the figures), that line gives you an average peak-latency measurement for the single-trial curves. Neither approach will work for the multiphasic wave we saw earlier, though, as in that case averaging throws away all the information present in the single trials.
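The two measures just described can be sketched numerically: area under the (rectified) curve as an amplitude measure, and the 50%-area bisection point as a latency measure. These function names are illustrative, not from any toolbox.

```python
import numpy as np

def area_amplitude(wave, dt):
    """Area under the rectified curve: robust to latency jitter."""
    return np.abs(wave).sum() * dt

def fractional_area_latency(wave, dt, frac=0.5):
    """Time at which `frac` of the total rectified area lies to the left."""
    cum = np.cumsum(np.abs(wave)) * dt
    return np.searchsorted(cum, frac * cum[-1]) * dt

# A symmetric 1-second triangular 'component' sampled every 10 ms
wave = np.concatenate([np.linspace(0, 1, 50), np.linspace(1, 0, 50)])
area = area_amplitude(wave, 0.01)           # total area under the triangle
lat = fractional_area_latency(wave, 0.01)   # bisection latency near the peak
```

For this symmetric wave, the 50%-area line falls at the midpoint, exactly where the peak is; under latency jitter these measures degrade far more gracefully than peak amplitude.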
26 Time-locked spectral averaging. This is a method of extracting information from waves that vary in latency and phase. Essentially, you use something called a wavelet to generate a map of the frequency activity present in your data (what a wavelet is will be covered next week). This shows information regardless of phase differences. In the plots shown, the lighter parts represent frequencies that are particularly well represented at the timepoints on the x-axis. On the left is a plot generated by looking at each trial individually and then combining all the individual frequency maps: we can see that there's a lot of 40 Hz activity at about 100 ms, and also some at 300 ms. The plot on the right is a time-frequency map of the waveform generated after all epochs have been averaged together. The 40 Hz activity at 100 ms is still there, but the activity at 300 ms has disappeared, suggesting that this activity might reflect a multiphasic ERP component that appears at 300 ms.
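A minimal version of the wavelet idea can be written by hand: convolve the signal with complex Morlet wavelets and take the magnitude, which discards phase. This is an illustrative sketch (the wavelet parameters and normalisation are simplified), not the implementation used to make the slide's plots.

```python
import numpy as np

def morlet_power(signal, fs, freqs, n_cycles=5):
    """Time-frequency power map via convolution with complex Morlet wavelets.

    Taking np.abs() of the complex result discards phase, which is why
    per-trial maps keep activity that phase-cancels in the averaged ERP.
    """
    power = np.empty((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        sigma = n_cycles / (2 * np.pi * f)          # wavelet width in seconds
        t = np.arange(-3 * sigma, 3 * sigma, 1 / fs)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma**2))
        power[i] = np.abs(np.convolve(signal, wavelet, mode='same')) ** 2
    return power

# A pure 40 Hz oscillation shows up at 40 Hz, not at 10 Hz
fs = 250
t = np.arange(fs) / fs
sig = np.sin(2 * np.pi * 40 * t)
p = morlet_power(sig, fs, [10.0, 40.0])
```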
27 Other things you can do, all under 'Other' in the GUI: merge data sessions together; calculate a 'grand mean' across subjects; re-reference to a different electrode; FILTER.
28 Filtering. Why would you want to filter?
29 Potential artefacts before averaging. Remove non-neural voltages: sweating and fidgeting, common with patients and children. Avoid saturating the amplifier. Filter at 0.01 Hz.
30 Potential artefacts after averaging. Filter specific frequency bands; remove persistent artefacts; smooth the data.
31 Types of filter. Low-pass: attenuate high frequencies. High-pass: attenuate low frequencies. Band-pass: attenuate both. Notch: attenuate a narrow band.
32 Properties of filters. The 'transfer function' describes the effect on amplitude at each frequency and the effect on phase at each frequency. The 'half-amplitude cutoff' is the frequency at which amplitude is reduced by 50%. In terms of power, the half-power cutoff is the frequency at which power is reduced to 50%, which corresponds to an amplitude of 71%.
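The relation between the two cutoff conventions follows from power being amplitude squared, as this two-line check shows:

```python
import math

# At the half-amplitude cutoff, gain = 0.5, so power = 0.5^2 = 25%
half_amp_gain = 0.5
power_at_half_amp = half_amp_gain ** 2

# At the half-power cutoff, power = 0.5, so amplitude = sqrt(0.5) ~ 71%
half_power_gain = math.sqrt(0.5)
```

So the half-amplitude cutoff of a filter always sits at a lower frequency response point than its half-power (-3 dB) cutoff; worth checking which convention a given package reports.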
33 High-pass. Diminishes the larger peak by filtering out the lower-frequency components; artefactual peaks are introduced into the higher-frequency components.
34 Low-pass. Removes high-frequency noise (30 Hz and above: gamma, etc.).
35 Band-pass and notch. A band-pass filter is useful for selecting a band of frequencies, e.g. if you want to examine purely beta or theta oscillations. A notch filter is useful for removing a specific frequency, e.g. the 50 Hz mains supply or a local interference source.
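Both filter types are a few lines with scipy. The sketch below designs a 50 Hz notch and a beta-band band-pass, then applies the notch with filtfilt (forward and backward filtering, which gives zero phase shift); the sampling rate and test signal are invented for the example.

```python
import numpy as np
from scipy.signal import iirnotch, butter, filtfilt

fs = 250.0

# Notch at 50 Hz for mains interference; Q controls the notch width
b_notch, a_notch = iirnotch(w0=50.0, Q=30.0, fs=fs)

# Band-pass for the beta band (13-30 Hz): 4th-order Butterworth
b_bp, a_bp = butter(4, [13, 30], btype='bandpass', fs=fs)

# Test signal: 20 Hz 'brain' activity plus 50 Hz 'mains' contamination
t = np.arange(0, 2, 1 / fs)
x = np.sin(2 * np.pi * 20 * t) + np.sin(2 * np.pi * 50 * t)
y = filtfilt(b_notch, a_notch, x)   # zero-phase notch filtering
```

After notching, the 50 Hz component is gone while the 20 Hz component passes through essentially untouched.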
36 Problems with filters. Original waveform: band-pass of 0.01-80 Hz. Then low-pass filtered with half-amplitude cutoffs of roughly 40 Hz, 20 Hz and 10 Hz. Although the waveform at the bottom looks the smoothest, and perhaps the nicest, it now contains very little information and doesn't resemble the original much.
37 Filtering artefacts. 'Precision in the time domain is inversely related to precision in the frequency domain.' A sharp cutoff in the filter leads to distortion of the waveform, a change in the onset time, and extra oscillations that were not previously present. A sharp cutoff would seem ideal and specific, but in reality it causes more problems than it solves.
38 Filtering in the frequency domain. The 60 Hz component is attenuated and the result is then inverse Fourier-transformed to return to the original waveform.
39 Filtering in the time domain. Filtering in the time domain is analogous to smoothing: at a given point, an average is calculated over the point and its two (or more) nearest neighbours. You smooth by averaging each point with its surrounding points.
40 Filtering in the time domain. The waveform is progressively filtered by averaging over the surrounding time points; here x[i] = (x[i-1] + x[i] + x[i+1]) / 3. The data in the bottom-right plot is derived by subtracting the smoothest curve from the original, giving the high-frequency noise.
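The three-point moving average on this slide is a convolution with a short uniform kernel; a minimal sketch:

```python
import numpy as np

def smooth(x, n=3):
    """Replace each point by the mean of itself and its neighbours,
    e.g. n=3 gives x[i] = (x[i-1] + x[i] + x[i+1]) / 3."""
    kernel = np.ones(n) / n
    return np.convolve(x, kernel, mode='same')

x = np.array([0.0, 0.0, 3.0, 0.0, 0.0])
y = smooth(x)      # the isolated spike is spread across its neighbours
```

As on the slide, subtracting the smoothed curve from the original (x - y) would leave just the high-frequency part.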
41 Recipe for preprocessing. Band-pass filter (e.g. 0.1-40 Hz); epoch; check/view; merge; downsample?; artefacts: correction/rejection; filter; average.
42 Recommendations. Prevention is better than cure: clean data first. During amplification and digitisation, minimise filtering and avoid notch filters. Keep offline filtering minimal, perhaps just a low-pass filter to clean the data. Avoid high-pass filters: they are only occasionally useful, and nearly always problematic.
43 Summary. There is no substitute for good data: good data solves all your problems. The recipe is only a guideline; it varies with the experiment, and personal judgement is needed to find an appropriate balance. Over-processed data is not necessarily comparable with other people's datasets. Calibrate: calibration lets you know exactly what you are doing to your data, e.g. what phase shifts you are introducing. Filter sparingly; less is more. Be prepared to get your hands dirty: batch scripts are available to do a lot of the processing, but writing a few lines of MATLAB code will make sure you are in complete control. SPM is not 100% on this yet.
44 References. An Introduction to the Event-Related Potential Technique, S. J. Luck. SPM Manual.