Optimizing Channel Selection for Seizure Detection

Slides:



Advertisements
Similar presentations
M. Tabrizi: Seizure Detection May, A COMPARATIVE ANALYSIS OF NONLINEAR FEATURES FOR AN HMM-BASED SEIZURE DETECTION SYSTEM Masih Tabrizi and Joseph.
Advertisements

CSCI 347 / CS 4206: Data Mining Module 07: Implementations Topic 03: Linear Models.
Abstract Overview Analog Circuit EEG signals have magnitude in the microvolt range. A much larger voltage magnitude is needed to detect changes in the.
Introduction The aim the project is to analyse non real time EEG (Electroencephalogram) signal using different mathematical models in Matlab to predict.
Corpus Development EEG signal files and reports had to be manually paired, de-identified and annotated: Corpus Development EEG signal files and reports.
Standard electrode arrays for recording EEG are placed on the surface of the brain. Detection of High Frequency Oscillations Using Support Vector Machines:
1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.
Abstract EEGs, which record electrical activity on the scalp using an array of electrodes, are routinely used in clinical settings to.
Automatic Labeling of EEGs Using Deep Learning M. Golmohammadi, A. Harati, S. Lopez I. Obeid and J. Picone Neural Engineering Data Consortium College of.
ERP DATA ACQUISITION & PREPROCESSING EEG Acquisition: 256 scalp sites; vertex recording reference (Geodesic Sensor Net)..01 Hz to 100 Hz analogue filter;
Data Processing Machine Learning Algorithm The data is processed by machine algorithms based on hidden Markov models and deep learning. They are then utilized.
Abstract The emergence of big data and deep learning is enabling the ability to automatically learn how to interpret EEGs from a big data archive. The.
1. 2 Abstract - Two experimental paradigms : - EEG-based system that is able to detect high mental workload in drivers operating under real traffic condition.
Views The architecture was specifically changed to accommodate multiple views. The used of the QStackedWidget makes it easy to switch between the different.
THE TUH EEG CORPUS: A Big Data Resource for Automated EEG Interpretation A. Harati, S. López, I. Obeid and J. Picone Neural Engineering Data Consortium.
Experimental Results ■ Observations:  Overall detection accuracy increases as the length of observation window increases.  An observation window of 100.
Data Acquisition An EEG measurement represents a difference between the voltages at two electrodes. The signal is usually displayed using a montage which.
The Goal Use computers to aid physicians in diagnosis of neurological diseases, particularly epilepsy Detect pathological events in real time Currently,
TUH EEG Corpus Data Analysis 38,437 files from the Corpus were analyzed. 3,738 of these EEGs do not contain the proper channel assignments specified in.
Automatic Discovery and Processing of EEG Cohorts from Clinical Records Mission: Enable comparative research by automatically uncovering clinical knowledge.
Automated Interpretation of EEGs: Integrating Temporal and Spectral Modeling Christian Ward, Dr. Iyad Obeid and Dr. Joseph Picone Neural Engineering Data.
Abstract Automatic detection of sleep state is important to enhance the quick diagnostic of sleep conditions. The analysis of EEGs is a difficult time-consuming.
Demonstration A Python-based user interface: Waveform and spectrogram views are supported. User-configurable montages and filtering. Scrolling by time.
EEG processing based on IFAST system and Artificial Neural Networks for early detection of Alzheimer’s disease.
Views The architecture was specifically changed to accommodate multiple views. The used of the QStackedWidget makes it easy to switch between the different.
Abstract Automatic detection of sleep state is an important queue in accurate detection of sleep conditions. The analysis of EEGs is a difficult time-consuming.
Learning A Better Compiler Predicting Unroll Factors using Supervised Classification And Integrating CPU and L2 Cache Voltage Scaling using Machine Learning.
Improved EEG Event Classification Using Differential Energy A.Harati, M. Golmohammadi, S. Lopez, I. Obeid and J. Picone Neural Engineering Data Consortium.
High resolution product by SVM. L’Aquila experience and prospects for the validation site R. Anniballe DIET- Sapienza University of Rome.
The Neural Engineering Data Consortium Mission: To focus the research community on a progression of research questions and to generate massive data sets.
Constructing Multiple Views The architecture of the window was altered to accommodate switching between waveform, spectrogram, energy, and all combinations.
Constructing Multiple Views The architecture of the window was altered to accommodate switching between waveform, spectrogram, energy, and all combinations.
Results from Mean and Variance Calculations The overall mean of the data for all features was for the REF class and for the LE class. The.
Descriptive Statistics The means for all but the C 3 features exhibit a significant difference between both classes. On the other hand, the variances for.
Human Language Technology Research Institute
Scalable EEG interpretation using Deep Learning and Schema Descriptors
CS 388: Natural Language Processing: LSTM Recurrent Neural Networks
CSC2535: Computation in Neural Networks Lecture 11 Extracting coherent properties by maximizing mutual information across space or time Geoffrey Hinton.
Compact Bilinear Pooling
[Ran Manor and Amir B.Geva] Yehu Sapir Outlines Review
Computer Science and Engineering, Seoul National University
CLASSIFICATION OF SLEEP EVENTS IN EEG’S USING HIDDEN MARKOV MODELS
G. Suarez, J. Soares, S. Lopez, I. Obeid and J. Picone
Enhanced Visualizations for Improved Real-Time EEG Monitoring
Epileptic Seizure Prediction
Natural Language Processing of Knee MRI Reports
Schizophrenia Classification Using
Urban Sound Classification with a Convolution Neural Network
Machine Learning Feature Creation and Selection
Enhanced Visualizations for Improved Real-Time EEG Monitoring
N. Capp, E. Krome, I. Obeid and J. Picone
Deep Learning Hierarchical Representations for Image Steganalysis
EEG Recognition Using The Kaldi Speech Recognition Toolkit
GATED RECURRENT NETWORKS FOR SEIZURE DETECTION
Big Data Resources for EEGs: Enabling Deep Learning Research
AN ANALYSIS OF TWO COMMON REFERENCE POINTS FOR EEGS
E. von Weltin, T. Ahsan, V. Shah, D. Jamshed, M. Golmohammadi, I
Vinit Shah, Joseph Picone and Iyad Obeid
Automatic Interpretation of EEGs for Clinical Decision Support
Dynamic Causal Modelling for M/EEG
feature extraction methods for EEG EVENT DETECTION
LECTURE 42: AUTOMATIC INTERPRETATION OF EEGS
LECTURE 41: AUTOMATIC INTERPRETATION OF EEGS
Machine Learning for Visual Scene Classification with EEG Data
EEG Event Classification Using Deep Learning
A Dissertation Proposal by: Vinit Shah
Deep Residual Learning for Automatic Seizure Detection
EEG Event Classification Using Deep Learning
Deep Learning Authors: Yann LeCun, Yoshua Bengio, Geoffrey Hinton
Automatic Handwriting Generation
Presentation transcript:

Optimizing Channel Selection for Seizure Detection V. Shah, M. Golmohammadi, S. Ziyabari, E. Von Weltin, I. Obeid and J. Picone Neural Engineering Data Consortium Temple University

Abstract Clinical scalp EEG records contain many types of artifacts that pose serious challenges for machine learning technology. Spatial information contained in the placement of the electrodes can be exploited to accurately detect artifacts. When fewer electrodes are used, less spatial information is available, making it harder to detect artifacts. In this study, we investigate the performance of a deep learning algorithm, CNN/LSTM, on several channel configurations. Each configuration was designed to minimize the amount of spatial information lost compared to a standard 22-channel EEG. Baseline performance of a system that used all 22 channels was 39% sensitivity with 23 false alarms. Systems using a reduced number of channels ranging from 8 to 20 achieved sensitivities between 33% and 37% with false alarms in the range of [38, 50] per 24 hours. False alarms increased dramatically (e.g., over 300 per 24 hours) when the number of channels was further reduced.

Introduction Electroencephalography (EEG) is a popular tool used to diagnose brain related illnesses. The 10-20 system is a universally accepted method for the placement of electrodes for any EEG test or experiment. Six separate channel configurations were selected to optimize the spatial information specific to the application of seizure detection: The TUH EEG Seizure Corpus (TUSZ - v1.1.1) was used in this study: A TCP montage was applied prior to generating features, and standard linear frequency cepstral coefficient (LFCC) features were used. Seizure Corpus Version 1.1.1 Dataset Training set Evaluation set w/seiz Total Patients 71 196 38 50 Sessions 102 456 89 230 Epochs (sec.) 51,140 (5.51%) 928,962 (100.00%) 53,930 (8.96%) 601,649 (100.00%)

Temporal Central Parasagittal (TCP) Montage A TCP montage is a type of longitudinal and transverse bipolar montage. Differential channels, such as FP1-F7 (longitudinal) and C3-CZ (transverse), help in removing static noise and improving spatial information. Transverse Longitudinal

After Application of a TCP Montage Significance of the Ax Channels Artifacts An electrode’s position on the scalp makes it susceptible to specific type of artifacts: Chewing → T5, T6 Head bobbing → O1, O2 Reference channels Ax are usually noisy because they are attached to the patient’s ears. Sampled Data Artifacts Phase Reversal Bipolar montages (TCP) make it easier to differentiate noisy channels from clean signals. Subtraction of adjacent channels removes noise and makes it easier to determine the locality of an event (e.g., phase reversals) After Application of a TCP Montage

Selecting Channel Configurations There are too many combinations to do an exhaustive search. Choose channel configurations based on domain knowledge. Strategy: maximize spatial information for each channel configuration. 22 Channels 20 Channels 16 Channels 8 Channels 4 Channels 2 Channels

Selecting Channel Configurations Two criteria were used to explore channel selection: Maximize the spatial information: The CZ channel, attached to 6 adjacent electrodes, is used in all configurations to maximize the spatial span of captured events. Only one binding of an occipital channel is used because an event occurring on one side is likely to be observed on the other side of the hemisphere due to the way the occipital lobe functions. Subject matter expertise (e.g., seizure detection): Frontal Polar (FPx) channels are ignored when reducing the number of channels because only 36% of frontal lobe seizures can be observed on scalp EEGs.

Feature Extraction Features are calculated from the signal using a window of 0.2 seconds and a frame of 0.1 seconds. Nine base features comprised of frequency domain energy, 1st through 7th cepstral coefficients, and a differential energy term are computed. Using these base features, first and second derivative features are calculated, forming feature vectors of dimension 26.

A Hybrid DNN Model – CNN/LSTM Each Input tensor contains a 21 sec. long window (210 frames) in our optimized CNN/LSTM model. Convolutional Neural Network (CNN) layers are able to learn the spatial information considering the correlation within the adjacent channels. Long Short Term Memory (LSTM) layers are able to learn the temporal information. A max pooling function is added after each CNN layer to reduce the dimensionality of the input tensor.

Experimental Results The full 22 channels yields the best performance. 2D CNN Layers Sensitivity (%) Specificity (%) FA/24 Hours 22 3 39.15 90.37 22.83 20 34.54 82.07 49.25 16 36.54 80.48 53.99 8 33.44 85.51 38.19 4 33.11 39.32 325.54 2 30.66 88.79 28.57 1 34.09 39.00 332.15 31.15 40.82 308.74 The full 22 channels yields the best performance. 20-, 16- and 8-channel configurations have similar levels of performance. 4- and 2- channel configurations perform poorly due to the lack of spatial information. A max pooling function in the CNN layers reduces the dimensionality of the previous layer to half. This makes it impossible to implement 3 similar CNN layers for 8-, 4- and 2-channel configurations. Alternative methods are to either remove a CNN layer or keep the dimensions of channels intact.

Experimental Results No. Chan. Sensitivity (%) FA/24 Hours w/ Ax w/o Ax 22 20 39.15 34.54 22.83 49.25 18 16 36.65 36.54 37.33 53.99 10 8 30.94 33.44 283.18 38.19 6 4 34.36 34.09 58.15 332.15 2 33.06 31.15 47.53 308.74 Performance of the systems with channels spatially near the reference channels (Ax) is improved. The 4- and 6-channel configurations including Ax perform better because the electrodes near the Ax channels collect additional temporal information. ROC curve depicted shows that system trained on 18 channels (w/ Ax) performs marginally better than system trained on 16 channels (wo/ Ax). Poor performance of the system trained on 10 channels is an example of a bad random initialization seed. DL systems are very vulnerable to such issues.

Summary Maximization of spatial information is an important factor during channel selection: Systems trained on all (22) channel configurations gave the best performance: 39.15% sensitivity and 90.37% specificity with 22.83 Fas per 24 hours. Systems trained and evaluated with referential channels perform better than without referential channels. Network architectures needed to change for the low-order systems (i.e. 2, 4, and 8 channels) because Max pooling layers wouldn’t allow reduction in channel’s dimensions. Future work: Random initialization and shuffling of data play an important role in DL systems. We expect to find better generalization methods which hold less dependence on such parameters. Variation in number of channels required changes in the baseline model. We expect to design a unique model, which will be independent of number of channels. Discovering the best montage for EEG event classification or eliminating the need for a montage using deep learning.