Functional Brain Signal Processing: EEG & fMRI Lesson 7 Kaushik Majumdar Indian Statistical Institute Bangalore Center M.Tech.

Functional Brain Signal Processing: EEG & fMRI Lesson 7 Kaushik Majumdar Indian Statistical Institute Bangalore Center kmajumdar@isibang.ac.in M.Tech. (CS), Semester III, Course B50

EEG Coherence Measures Cross-correlation. Covariance:
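
The formulas on this slide did not survive in the transcript. As a rough sketch, assuming the standard textbook definitions (unbiased sample covariance, and cross-correlation normalized to lie in [-1, 1]), the two measures could be computed for a pair of EEG channels as follows. The function names and the lag convention are illustrative choices, not taken from the lecture.

```python
import numpy as np

def covariance(x, y):
    """Sample covariance of two equal-length signals (divides by N - 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

def cross_correlation(x, y, lag=0):
    """Normalized cross-correlation at an integer lag (in samples).

    Assumed convention: a positive lag compares x[n] with y[n + lag].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag < 0:
        # A negative lag is the same as swapping the roles of x and y.
        return cross_correlation(y, x, -lag)
    n = len(x) - lag
    xs = x[:n] - x[:n].mean()
    ys = y[lag:lag + n] - y[lag:lag + n].mean()
    return np.sum(xs * ys) / np.sqrt(np.sum(xs ** 2) * np.sum(ys ** 2))
```

Scanning `cross_correlation` over a range of lags and taking the lag with the largest value is one common way to estimate the delay between two channels.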

EEG Feature Extraction Features of EEG signals can take many different forms, such as amplitude, phase, Fourier coefficients, wavelet coefficients, etc.
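
As an illustration of three of these feature types, the sketch below extracts Fourier coefficients from a single-channel segment and derives amplitude and phase from them. It assumes a uniformly sampled signal; the function name and the sampling rate `fs` are hypothetical, and wavelet coefficients are left out because they would need an extra library.

```python
import numpy as np

def fourier_features(signal, fs):
    """Return frequencies, Fourier coefficients, amplitude and phase.

    signal: 1-D array of samples; fs: sampling rate in Hz (assumed known).
    """
    signal = np.asarray(signal, dtype=float)
    coeffs = np.fft.rfft(signal)                      # one-sided spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)  # bin frequencies in Hz
    amplitude = np.abs(coeffs)
    phase = np.angle(coeffs)
    return freqs, coeffs, amplitude, phase
```

Any of the returned arrays, or a subset of their entries (for example the amplitudes in a frequency band of interest), can then serve as the feature vector passed to a classifier.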

Two Most Fundamental Aspects of Machine Learning Differentiation: decomposing the data into features, and Integration: classification of those features.

Fisher’s Discriminant Duda, Hart & Stork, 2006

Fisher’s Discriminant (cont.) There are $n$ $d$-dimensional data vectors $x_1, \ldots, x_n$, of which $n_1$ vectors belong to a set $D_1$ and $n_2$ vectors belong to another set $D_2$, with $n_1 + n_2 = n$. $w$ is a $d$-dimensional weight vector such that $\|w\| = 1$; that is, $w$ can apply a rotation only. The rotation has to be such that $D_1$ and $D_2$ are optimally separable by projection onto a straight line in the $d$-dimensional space.

Fisher’s Discriminant (cont.) The sample mean $m_i = \frac{1}{n_i} \sum_{x \in D_i} x$ is an unbiased estimate of the population mean. So a difference between the sample means indicates a difference between the populations.

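Fisher’s Discriminant (cont.) Fisher’s discriminant employs the particular $w$ for which the criterion function $J(w) = \frac{|\tilde{m}_1 - \tilde{m}_2|^2}{\tilde{s}_1^2 + \tilde{s}_2^2}$ is maximized, where $\tilde{m}_i$ and $\tilde{s}_i^2$ are the mean and the scatter of the projections $w^T x$ of the samples in $D_i$. [Figure: the two sets $D_1$ and $D_2$.]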

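Fisher’s Discriminant (cont.) $\tilde{m}_i = w^T m_i$ and $\tilde{s}_i^2 = \sum_{x \in D_i} (w^T x - \tilde{m}_i)^2$. Since $w^T x - \tilde{m}_i = w^T (x - m_i)$ and $(w^T a)^2 = w^T a a^T w$ for any vector $a$, we get $\tilde{s}_i^2 = w^T S_i w$. Let us define $S_i = \sum_{x \in D_i} (x - m_i)(x - m_i)^T$ and $S_w = S_1 + S_2$, so that $\tilde{s}_1^2 + \tilde{s}_2^2 = w^T S_w w$.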

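Fisher’s Discriminant (cont.) Similarly, $(\tilde{m}_1 - \tilde{m}_2)^2 = w^T (m_1 - m_2)(m_1 - m_2)^T w = w^T S_B w$, where $S_B = (m_1 - m_2)(m_1 - m_2)^T$. $S_w$ is called the within-class scatter matrix and $S_B$ is called the between-class scatter matrix.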

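Fisher’s Discriminant (cont.) $J(w) = \frac{w^T S_B w}{w^T S_w w}$ is always a scalar quantity. Writing $f(w) = J(w)$, we have $w^T (S_B - f(w) S_w) w = 0$ for every $w$, and at a stationary point of $J$ (where its gradient vanishes) the stronger condition $S_B w = f(w) S_w w$ must hold for the scalar-valued function $f$ of the vector variable $w$. Clearly, maximum $f(w)$ will make $J(w)$ maximum. Let the maximum of $f(w)$ be $\lambda$. Then we can write $S_B w = \lambda S_w w$, where $w$ is the vector for which $J(w)$ is maximum. $S_B w$ is in the direction of $m_1 - m_2$ (elaborated in the next slide). Also the scale of $w$ does not matter, only its direction does. So we can write $S_w w = m_1 - m_2$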

Fisher’s Discriminant (cont.) or, equivalently, $w = S_w^{-1} (m_1 - m_2)$. Note that $S_B w = (m_1 - m_2)\,[(m_1 - m_2)^T w]$. Here all vectors are column vectors by default, unless stated otherwise, so all transpose operations give row vectors. $(m_1 - m_2)^T$ is a row vector and $w$ is a column vector. Therefore the value within the square bracket above is a scalar. That is, $S_B w = (m_1 - m_2) s$, where $s$ is a scalar. This implies that $S_B w$ is in the direction of $m_1 - m_2$.
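
The result of this derivation, a weight vector proportional to $S_w^{-1}(m_1 - m_2)$, can be sketched in a few lines of NumPy. This is a minimal illustration under the assumption that each class is given as an array with one sample per row and that $S_w$ is invertible; the function name is hypothetical.

```python
import numpy as np

def fisher_direction(D1, D2):
    """Unit-norm Fisher direction for two classes (rows are samples)."""
    D1 = np.asarray(D1, dtype=float)
    D2 = np.asarray(D2, dtype=float)
    m1, m2 = D1.mean(axis=0), D2.mean(axis=0)
    # Within-class scatter: S_w = S_1 + S_2
    S1 = (D1 - m1).T @ (D1 - m1)
    S2 = (D2 - m2).T @ (D2 - m2)
    Sw = S1 + S2
    # Solve S_w w = m1 - m2 rather than forming the inverse explicitly.
    w = np.linalg.solve(Sw, m1 - m2)
    return w / np.linalg.norm(w)  # only the direction matters
```

Projecting the data with `D1 @ w` and `D2 @ w` gives the one-dimensional values on which a threshold can be placed to separate the two classes.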

Dimensionality Reduction by Fisher’s Discriminant From $S_B w = \lambda S_w w$ we get $(S_w^{-1} S_B - \lambda I) w = 0$, where $I$ is the $d$-dimensional identity matrix. $S_B$ and $S_w$ are $d$-dimensional square matrices. For the purpose of classification (or pattern recognition) we only need those eigenvectors of $S_w^{-1} S_B$ whose associated eigenvalues are large enough. The rest of the eigenvectors (and therefore dimensions) can be ignored.
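
One possible way to carry out this eigenvector selection numerically is sketched below. It assumes $S_w$ is symmetric positive definite, so a Cholesky factor can turn the problem into a symmetric one that `np.linalg.eigh` handles with real eigenvalues; the function name and the parameter `k` (number of directions kept) are illustrative.

```python
import numpy as np

def top_discriminant_directions(Sw, SB, k):
    """Return the k largest eigenvalues of Sw^{-1} SB and their eigenvectors.

    Assumes Sw is symmetric positive definite and SB is symmetric.
    """
    L = np.linalg.cholesky(Sw)            # Sw = L L^T
    Linv = np.linalg.inv(L)
    M = Linv @ SB @ Linv.T                # symmetric, same eigenvalues
    vals, vecs = np.linalg.eigh(M)        # ascending order
    order = np.argsort(vals)[::-1][:k]    # keep the k largest
    W = Linv.T @ vecs[:, order]           # map back: Sw^{-1} SB w = lambda w
    W = W / np.linalg.norm(W, axis=0)     # unit-length columns
    return vals[order], W
```

The columns of the returned matrix are the retained directions, so `X @ W` reduces data with one sample per row from `d` to `k` dimensions.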

Logistic Regression

Logistic Regression (cont.) [Figure: the curves p(y) and 1 - p(y).] Parra et al., NeuroImage, 22: 342 – 452, 2005

Logistic Regression vs. Fisher’s Discriminant Theoretically, logistic regression has been shown to be between one half and two thirds as effective as normal discrimination for statistically interesting values of the parameters (B. Efron, The efficiency of logistic regression compared to normal discriminant analysis, JASA (1975) 892-898).

Logistic Regression (cont.) to be maximized, N is number of data points

Logistic Regression (cont.) Note that is a monotonically increasing function and so any set which increases will lead us closer to the optimal value of. Even if we take and the end result for EEG signal separation for target and non-target or for different targets will almost be similar to the case when a convergence technique for as described is followed. The two classes of data will be separated by the hyperplane normal to and the perpendicular distance of the hyperplane from origin is. In other words the equation of the hyperplane is.
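
The symbols on these slides were lost in the transcript, so the sketch below uses assumed notation: a weight vector `w`, an offset `b`, and the logistic function p = 1 / (1 + exp(-(w·x + b))). It maximizes the log-likelihood over N data points by plain gradient ascent, which is one possible convergence technique and not necessarily the one used in the lecture; the learning rate and iteration count are arbitrary choices.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, monotonically increasing in z."""
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(X, labels, w, b):
    """Log-likelihood of 0/1 labels under the logistic model."""
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1.0 - 1e-12)  # avoid log(0)
    return np.sum(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))

def fit_logistic(X, labels, lr=0.1, n_iter=2000):
    """Gradient ascent on the mean log-likelihood; returns (w, b)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        err = labels - sigmoid(X @ w + b)   # gradient factor per sample
        w += lr * (X.T @ err) / n
        b += lr * err.mean()
    return w, b
```

With this notation the separating hyperplane is w·x + b = 0, it is normal to `w`, and its perpendicular distance from the origin is `abs(b) / np.linalg.norm(w)`.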

Logistic Regression vs. Fisher’s Discriminant FD projects the multidimensional data onto a line whose orientation maximizes the separation of the projected data. LR assigns a probability distribution to the two data sets in such a way that the distribution approaches 1 on one class and 0 on the other, exponentially fast. This makes LR a better separator or classifier than FD.

References
R. Q. Quiroga, A. Kraskov, T. Kreuz and P. Grassberger, On performance of different synchronization measures in real data: a case study on EEG signals, Phys. Rev. E, 65(4): 041903, 2002.
R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, 4e, John Wiley & Sons, New York, 2007, p. 117 – 121.

THANK YOU This lecture is available at http://www.isibang.ac.in/~kaushik
