Deep Learning Approaches to Automate Seizure Detection


1 Deep Learning Approaches to Automate Seizure Detection
Preliminary Exam, Department of Electrical and Computer Engineering
Submitted to: Dr. Joseph Picone, Dept. of Electrical and Computer Engineering; Dr. Iyad Obeid, Dept. of Electrical and Computer Engineering; Dr. Mercedes Jacobson, MD, Professor, Dept. of Neurology, Lewis Katz School of Medicine; Dr. Yimin Zhang, Dept. of Electrical and Computer Engineering
Prepared by: Vinit Shah, PhD candidate
Academic advisor: Dr. Joseph Picone
Department of Electrical and Computer Engineering, Temple University

2 Outline
Advances in the neurology community and qEEG tools
Intra-patient variation in seizure morphology
Support Vector Machines (SVM)
Convolutional Neural Networks (CNN)
Doubly Convolutional Neural Networks (DCNN)

3 A common issue in the neurology community
Epileptic seizures affect approximately 1% of the world's population (Annegers, 1997). Scalp electroencephalogram (EEG) monitoring is a non-invasive and convenient method to assess the electrical activity of the brain. Assessment of long-term monitoring (LTM) EEGs is tedious, time consuming, and susceptible to missing events of interest such as seizures.

4 qEEG tools
Quantitative EEG (qEEG) is the analysis of the digitized EEG. Various DSP-based transformation techniques have been applied to the visual interpretation of EEGs to assist, and even augment, our understanding of the EEG and brain function. DSP algorithms such as Fourier and wavelet analysis have been used to develop qEEG/cEEG displays such as the envelope trend (aEEG), color density spectral array (CDSA), rhythmicity spectrogram, and asymmetry index.

5 Background

6 Seizure

7 Seizure offset

8 What if? Patterns during periodic/rhythmic discharges or low-amplitude status epilepticus can appear confusing.

9 Example of a 1-hour qEEG panel

10 Experiment at Emory University
A total of 18 ACNS-certified neurophysiologists participated. 15 epochs containing 126 seizures were selected. 9 neurologists created the gold-standard seizure annotations by reviewing the raw EEGs. The remaining 9 reviewed the data using two methods: qEEG + raw EEG (QR) and qEEG only (Q). An automatic seizure detection algorithm (SzD, Persyst Inc.) was also run on this data. Variations of 1 min. and 2.5 min. were allowed for detection of the seizure onset.

Allowed onset margin     | Q (sensitivity, FA) | QR (sensitivity, FA) | SzD (sensitivity, FA)
1 min. onset variation   | 51%, 1/hour         | 63%, 0.5/hour        | 25%, 0.07/hour
2.5 min. onset variation | 67%                 | 68%                  | 27%

11 Results: prolonged low-amplitude seizures and brief low-amplitude seizures (example figures)

12 qEEG tools vs. SzD
Comparing the use of qEEG tools against SzD (the automatic seizure detection algorithm), qEEG appears more reliable, and it has recently become common practice in hospitals. On the other hand, reviewing qEEG displays offers lower temporal resolution and is prone to missing brief and/or slowly evolving seizures. SzD's sensitivity is only 26.2% to 26.7%.

13 Intra-patient varying seizure morphologies
Example: 1

14 Intra-patient varying seizure morphologies
Example: 2

15 Preprocessing EEG signals
The EEG signal for P channels can be defined as $X[n] = [x_1[n], \ldots, x_P[n]]$. The nonlinear energy operator (NLEO) is $\psi_i[n] = x_i[n-1]\,x_i[n-2] - x_i[n]\,x_i[n-3]$. The frequency-weighted energy in the left half of the window is subtracted from that of the right half:
$$G_\psi[n] = \sum_{i=1}^{P} \left| \sum_{m=n-N+1}^{n} \psi_i[m] \;-\; \sum_{m=n+1}^{n+N} \psi_i[m] \right|$$
The threshold value for segment boundaries is defined as:
$$T[n] = \begin{cases} \max\!\left(G_\psi\!\left[\,n-\tfrac{L}{2} : n+\tfrac{L}{2}\,\right]\right), & n \ge \tfrac{L}{2} \\[4pt] 0, & n < \tfrac{L}{2} \end{cases}$$
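A minimal NumPy sketch of this segmentation step, assuming a P × samples array and the definitions above; the window lengths (N, L) and the boundary rule (flag samples where $G_\psi$ attains the local maximum $T[n]$) are illustrative choices, not parameters taken from the original work.

```python
import numpy as np

def nleo(x):
    """Nonlinear energy operator per channel: psi_i[n] = x_i[n-1]*x_i[n-2] - x_i[n]*x_i[n-3]."""
    psi = np.zeros_like(x)
    psi[:, 3:] = x[:, 2:-1] * x[:, 1:-2] - x[:, 3:] * x[:, :-3]
    return psi

def boundary_function(x, N):
    """G_psi[n]: channel-summed absolute difference between NLEO energy in the
    left half-window (n-N+1..n) and the right half-window (n+1..n+N)."""
    psi = nleo(x)
    T = x.shape[1]
    g = np.zeros(T)
    for n in range(N, T - N):
        left = psi[:, n - N + 1:n + 1].sum(axis=1)
        right = psi[:, n + 1:n + N + 1].sum(axis=1)
        g[n] = np.abs(left - right).sum()
    return g

def candidate_boundaries(x, N=64, L=256):
    """Flag samples where G_psi attains the running maximum T[n] over a window of length L."""
    g = boundary_function(x, N)
    T_len = g.shape[0]
    thr = np.zeros(T_len)
    half = L // 2
    for n in range(half, T_len - half):
        thr[n] = g[n - half:n + half].max()
    return np.where((thr > 0) & (g >= thr))[0]

# Example: 22-channel EEG, 10 s at 256 Hz (synthetic noise just to exercise the code).
eeg = np.random.randn(22, 2560)
print(candidate_boundaries(eeg)[:10])
```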

16 Pictorially….

17 Support Vector Machines
An SVM is a discriminative classifier that constructs a hyperplane (or set of hyperplanes) in a high-dimensional space. To classify data points, the SVM uses the sign of the decision function $f(x) = w^{\top} x$. For standard two-class SVM classification, the following optimization problem must be solved:
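For reference, the textbook two-class soft-margin SVM primal (bias omitted to match the decision function above; the exact formula on the original slide may differ) is:
$$\min_{w,\,\xi}\ \ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{m}\xi_i \quad \text{s.t.}\quad y_i\, w^{\top}x_i \ge 1-\xi_i,\ \ \xi_i \ge 0,\ \ i=1,\dots,m$$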

18 Multi-task SVM classification
In the multi-task setting, the task-specific separating hyperplane is defined as $w_t = w_0 + v_t$, and the optimization problem changes accordingly. Once optimized, the multi-task $w_0$ and $v_t$ can be obtained from the $w$ of the standard SVM: $w = (\mu\, w_0, v_1, \ldots, v_T)$.
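A standard regularized multi-task SVM objective consistent with $w_t = w_0 + v_t$ and $w = (\mu\, w_0, v_1, \ldots, v_T)$ (in the style of Evgeniou and Pontil; the constants on the original slide may differ) is:
$$\min_{w_0,\, v_t,\, \xi_{ti}}\ \ \sum_{t=1}^{T}\sum_{i=1}^{m_t}\xi_{ti} + \frac{\lambda_1}{T}\sum_{t=1}^{T}\lVert v_t\rVert^{2} + \lambda_2 \lVert w_0\rVert^{2} \quad \text{s.t.}\quad y_{ti}\,(w_0 + v_t)^{\top}x_{ti} \ge 1-\xi_{ti},\ \ \xi_{ti}\ge 0$$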

19 Binary-class hyperplane vs. multi-task hyperplane
If we discard the task-specific component, the classification function reduces to $f(x) = w_0^{\top} x$.
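A minimal scikit-learn sketch of this shared (task-independent) linear baseline; the windowed-EEG feature matrix, its dimensions, and the labels are hypothetical stand-ins, not values from the original work.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical data: 1000 EEG windows, 120-dimensional feature vectors,
# labels y in {0, 1} for background / seizure.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 120))
y = rng.integers(0, 2, size=1000)

clf = LinearSVC(C=1.0)               # one shared hyperplane w_0, no task-specific v_t
clf.fit(X, y)
scores = clf.decision_function(X)    # f(x) = w_0^T x (+ intercept)
print((scores > 0)[:5])
```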

20 Results on the CHB-MIT seizure database

21 Let's move on to deep learning (CNN)
CNNs are a special kind of neural network for processing data that has a grid-like topology. A typical CNN layer:
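A minimal PyTorch sketch of one such layer (convolution stage, detector stage, pooling stage); the channel counts, kernel sizes, and input shape are illustrative, not taken from the original slides.

```python
import torch
import torch.nn as nn

# One "typical CNN layer": convolution stage -> detector stage (ReLU) -> pooling stage.
layer = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3),  # convolution stage
    nn.ReLU(),                                                # detector stage
    nn.MaxPool2d(kernel_size=2, stride=2),                    # pooling stage
)

x = torch.randn(1, 1, 28, 28)   # batch of one single-channel image
print(layer(x).shape)           # torch.Size([1, 8, 13, 13])
```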

22 Is it X-shaped or not?
What are the best features for recognizing the target image? Intuitively, we can see three of them here.

23 Convolution step to generate feature maps
The convolution is carried out first for one feature patch, and then similarly for all three features.
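A NumPy sketch of this matching step in the spirit of the X/O example, where each feature-map value is the mean element-wise product between the feature patch and the image patch beneath it; the toy image and feature below are made up for illustration.

```python
import numpy as np

def feature_map(image, feature):
    """Slide a small feature patch over the image and record the mean
    element-wise product (the 'match score') at each position."""
    fh, fw = feature.shape
    ih, iw = image.shape
    out = np.zeros((ih - fh + 1, iw - fw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[r:r + fh, c:c + fw]
            out[r, c] = np.mean(patch * feature)
    return out

# Toy 'X' image with pixels in {-1, +1} and a 3x3 diagonal feature.
image = -np.ones((9, 9))
for i in range(9):
    image[i, i] = image[i, 8 - i] = 1.0
feature = np.array([[ 1., -1., -1.],
                    [-1.,  1., -1.],
                    [-1., -1.,  1.]])
print(feature_map(image, feature).round(2))
```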

24 Pooling
A 2×2 filter with the stride set to 2. Pooling layers reduce the spatial size and the number of parameters associated with the network.
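A NumPy sketch of 2×2 max pooling with stride 2 (the input feature map is arbitrary):

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Downsample a feature map by taking the maximum over size x size windows."""
    h, w = fmap.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = fmap[r * stride:r * stride + size,
                             c * stride:c * stride + size].max()
    return out

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(fmap))   # [[ 5.  7.] [13. 15.]] -- 4x4 reduced to 2x2
```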

25 Detector stage
A non-linear detector function such as ReLU (in our case), sigmoid, etc. is applied. Note that the output array becomes sparse.
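A tiny NumPy illustration of the detector stage; the values in the feature map are made up.

```python
import numpy as np

# ReLU zeroes out negative responses, leaving a sparse feature map.
fmap = np.array([[ 0.33, -0.55,  0.11],
                 [-0.11,  1.00, -0.11],
                 [ 0.11, -0.55,  0.33]])
print(np.maximum(0.0, fmap))
```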

26 Complex layer
Note that stacking multiple layers helps reduce the dimensionality. Eventually the data are flattened and fully connected to the following layers (the fully connected layers).
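A short PyTorch sketch of the flattening step feeding a fully connected layer; the tensor shape and the two output classes are hypothetical.

```python
import torch
import torch.nn as nn

# Stacked conv/pool layers shrink the spatial size; the final maps are flattened
# and passed to fully connected layers.
features = torch.randn(1, 8, 4, 4)       # hypothetical output of the conv stack
flat = features.flatten(start_dim=1)     # shape (1, 128)
fc = nn.Linear(128, 2)                   # e.g. seizure vs. background scores
print(fc(flat).shape)                    # torch.Size([1, 2])
```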

27 How did we get the features in the first place?
Backpropagation with stochastic gradient descent finds suitable weights for each layer, and therefore the learned features, by reducing the total error.
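A minimal PyTorch sketch of that training loop; the model architecture, mini-batch, and labels are illustrative placeholders, not the network used in this work.

```python
import torch
import torch.nn as nn

# Backpropagation + SGD adjusts the convolutional weights (i.e. the learned
# features) to reduce the loss on each mini-batch.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(8 * 13 * 13, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(16, 1, 28, 28)     # hypothetical mini-batch of images
y = torch.randint(0, 2, (16,))     # hypothetical class labels

for step in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                 # backpropagation computes the gradients
    optimizer.step()                # SGD update of the weights/features
```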

28 CNN continued
The input image $\mathcal{L} \in \mathbb{R}^{c \times w \times h}$ is a real-valued 3D tensor, where $c$ is the number of channels and $w$, $h$ are the width and height. Define the convolution operation as $\mathcal{L}^{l+1} = \mathcal{L}^{l} * W^{l}$, where $\mathcal{L}^{l+1} \in \mathbb{R}^{c_{l+1} \times w_{l+1} \times h_{l+1}}$ is the output image after convolution. Without zero padding, the spatial dimensions of the output image are $w_{l+1} = w_l - z + 1$ and $h_{l+1} = h_l - z + 1$, where $z$ is the filter size.
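A quick PyTorch check of the valid (no-padding) output size; the channel counts and sizes are arbitrary.

```python
import torch
import torch.nn as nn

w = h = 32
z = 5
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=z, padding=0)  # no zero padding
out = conv(torch.randn(1, 3, h, w))
print(out.shape)   # torch.Size([1, 16, 28, 28]); 28 = 32 - 5 + 1
```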

29 Doubly Convolutional Neural Network (DCNN)
$\mathcal{L}^{l} \in \mathbb{R}^{c_l \times w_l \times h_l}$ and $\mathcal{L}^{l+1} \in \mathbb{R}^{n c_{l+1} \times w_{l+1} \times h_{l+1}}$ are the input and output images, respectively. $W^{l} \in \mathbb{R}^{c_{l+1} \times c_l \times z' \times z'}$ is a set of $c_{l+1}$ meta-filters with filter size $z' \times z'$, where $z' > z$. Zero padding is necessary in a DCNN to preserve the shape of the filter for further convolution.
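A rough PyTorch sketch of the double-convolution idea: every $z \times z$ sub-filter is sliced out of each $z' \times z'$ meta-filter, and the input is then convolved with all of them. The pooling over sub-filter outputs used in the DCNN paper is omitted, and all sizes below are illustrative.

```python
import torch
import torch.nn.functional as F

def doubly_convolve(image, meta_filters, z):
    """Slice every z x z sub-filter out of each z' x z' meta-filter,
    then convolve the image with all resulting filters."""
    c_out, c_in, zp, _ = meta_filters.shape                # zp = z' > z
    sub = meta_filters.unfold(2, z, 1).unfold(3, z, 1)     # (c_out, c_in, k, k, z, z), k = z'-z+1
    k = sub.shape[2]
    sub = sub.permute(0, 2, 3, 1, 4, 5).reshape(c_out * k * k, c_in, z, z)
    return F.conv2d(image, sub)                            # (batch, n*c_out, w-z+1, h-z+1), n = k*k

image = torch.randn(1, 3, 32, 32)                          # hypothetical input
meta = torch.randn(16, 3, 6, 6)                            # 16 meta-filters, z' = 6
out = doubly_convolve(image, meta, z=4)                    # n = (6-4+1)^2 = 9 sub-filters each
print(out.shape)                                           # torch.Size([1, 144, 29, 29])
```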

30 Main ingredient of DCNN
The k-translation correlation between two convolutional filters within the same layer. In CNNs, many learned filters are slightly translated versions of each other; measuring the correlation between the filters inside a meta-filter identifies maximally correlated filters.
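A NumPy sketch of the k-translation correlation, following the definition in Zhai et al. (the maximum normalized inner product between one filter and all non-zero translations of the other, with zero padding); the filter sizes here are arbitrary.

```python
import numpy as np

def translate(f, dx, dy):
    """Shift a 2-D filter by (dx, dy) positions, zero-filling the exposed border."""
    h, w = f.shape
    out = np.zeros_like(f)
    rows_dst = slice(max(dx, 0), h + min(dx, 0))
    cols_dst = slice(max(dy, 0), w + min(dy, 0))
    rows_src = slice(max(-dx, 0), h + min(-dx, 0))
    cols_src = slice(max(-dy, 0), w + min(-dy, 0))
    out[rows_dst, cols_dst] = f[rows_src, cols_src]
    return out

def k_translation_correlation(wi, wj, k=1):
    """Maximum normalized inner product between wi and all non-zero translations of wj."""
    denom = np.linalg.norm(wi) * np.linalg.norm(wj)
    best = -np.inf
    for dx in range(-k, k + 1):
        for dy in range(-k, k + 1):
            if dx == 0 and dy == 0:
                continue
            best = max(best, np.dot(wi.ravel(), translate(wj, dx, dy).ravel()) / denom)
    return best

wi = np.random.randn(5, 5)
wj = translate(wi, 0, 1)                        # wj is a shifted copy of wi
print(k_translation_correlation(wi, wj, k=1))   # noticeably higher than for unrelated filters
```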

31 Results of CNN variants vs. DCNN
Result tables (from Zhai et al.): DCNN on CIFAR-10, CIFAR-100, and ImageNet; CNN variants vs. DCNN on the CIFAR databases with and without data augmentation; CNN variants vs. DCNN on ImageNet.

32 How to implement a DCNN for seizure detection
NLEO segmentation can help us define the meta-filter size of interest for the DCNN operation. A DCNN should be able to learn in more detail the dependency (correlation) between channels for artifact identification.

33 Summary and future work
Although qEEG tools are reliable and becoming pervasive for detecting most seizures, they are still not effective at detecting brief, slowly evolving, or low-amplitude seizures. qEEG tools used in conjunction with raw EEG are preferable for detecting accurate onsets and reducing false positives.
In neuro-ICU and EMU environments, there is a strong need for an automated seizure detection tool with a significantly lower false-positive rate.
A multi-task learning approach to training classifiers for various seizure types is a good way to avoid over-training the SVM on the most common seizure types.
A DCNN generates meta-filters that look for highly correlated filters inside the primary patch; convolution then takes place again with these output filters. This additional layer yields more highly correlated feature maps than the regular CNN approach.
The useful techniques to be extracted from these papers (for artifact reduction) are: the NLEO operator for adaptive segmentation, and using the resulting segment lengths to set the meta-filter size in the DCNN approach.
However, the question of detecting seizures with slow evolution and low amplitude remains unanswered.

34 Brief Bibliography
Annegers, J. F. (1997). The treatment of epilepsy: Principles and practice. Baltimore: Williams and Wilkins.
Van Esbroeck, A., Smith, L., Syed, Z., Singh, S., & Karam, Z. (2016). Multi-task seizure detection: Addressing intra-patient variation in seizure morphologies. Machine Learning, 102(3), 309–321.
Shoeb, A., Kharbouch, A., Soegaard, J., Schachter, S., & Guttag, J. (2011). An algorithm for detecting seizure termination in scalp EEG. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS (pp. 1443–1446).
Goodfellow, I., Bengio, Y., & Courville, A. (2017). Deep Learning (1st ed.). Cambridge, MA, USA: MIT Press.
Zhai, S., Cheng, Y., Zhang, Z., & Lu, W. (2016). Doubly Convolutional Neural Networks. In NIPS (pp. 1082–1090).
Haider, H. A., Esteller, R., Hahn, C. D., Westover, M. B., Halford, J. J., Lee, J. W., … Pargeon, K. (2016). Sensitivity of quantitative EEG for seizure identification in the intensive care unit. Neurology, 87(9), 935–944.
Rohrer, B. (August 18, 2016). How do Convolutional Neural Networks work? Retrieved from:

35 Thank You !

