Classifiers


The following classifiers are available (depending on your NPXLab Edition) to all the BCI-related tools:


FLDA

FLDA (Fisher Linear Discriminant Analysis), or simply LDA, makes use of a linear mapping (discriminant) function to discriminate among classes. It tries to find a linear hyperplane that separates the samples in the feature space according to the class they belong to, under the assumptions that the classes are Gaussian-distributed and share the same covariance matrix. The equation of the hyperplane is:


wᵀx + b = 0,



where w represents the classification weights, x is the feature vector and b is the bias term. In the binary case, an incoming feature vector is assigned to class 1 if wᵀx + b > 0, and to class 2 if wᵀx + b < 0.

In particular, Fisher's criterion [1] seeks the w that maximizes the ratio of the between-class variance to the within-class variance. FLDA generally achieves good performance, unless the number of features to be estimated is much higher than the number of training samples. For this reason, different versions of LDA have been implemented that introduce regularization parameters [2] to overcome this dimensionality problem. These classifiers will be included in the next release.
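
For illustration, a minimal NumPy sketch of the binary FLDA training and decision rule could look as follows. This is a generic textbook formulation, not the NPXLab implementation; the arrays X1 and X2 are placeholders for the training samples of the two classes:

    import numpy as np

    def train_flda(X1, X2):
        # X1, X2: (n_samples, n_features) training samples of class 1 and 2.
        m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
        # Pooled within-class covariance (both classes are assumed to share it).
        Sw = np.cov(X1, rowvar=False) + np.cov(X2, rowvar=False)
        w = np.linalg.solve(Sw, m1 - m2)   # Fisher direction w
        b = -0.5 * w @ (m1 + m2)           # bias: threshold midway between the means
        return w, b

    def classify(x, w, b):
        # Assign class 1 if w'x + b > 0, class 2 otherwise.
        return 1 if w @ x + b > 0 else 2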


SWLDA

Stepwise Linear Discriminant Analysis (SWLDA) belongs to the family of LDAs and is the most widely used classifier in the literature for P300 protocols [3-6], as it is simple to implement and manage and achieves quite good performance. It still tries to find a linear separating hyperplane, but the approach differs from FLDA: the parameters to be inserted in the model are chosen by combining forward and backward steps, that is, features are inserted into or removed from the model according to their statistical significance in the prediction of the class labels. For example, features with a p-value < 0.05 could be inserted into the discriminant function and features with a p-value > 0.10 could be removed. Obviously, different p-values can be chosen (creating different versions of the same classifier), and the insert/remove process can be iterated until a predefined maximum number of iterations is reached or until no features satisfy the requirements.
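
A hedged sketch of this stepwise selection loop is shown below. It uses ordinary least-squares p-values from the statsmodels package and the example thresholds mentioned above (0.05 to enter, 0.10 to remove); it illustrates the idea only and is not the NPXLab implementation:

    import numpy as np
    import statsmodels.api as sm

    def swlda(X, y, p_enter=0.05, p_remove=0.10, max_iter=60):
        # X: (n_samples, n_features) feature matrix; y: +/-1 class labels.
        selected = []
        for _ in range(max_iter):
            changed = False
            # Forward step: try to add the most significant excluded feature.
            pvals = {}
            for j in range(X.shape[1]):
                if j in selected:
                    continue
                fit = sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit()
                pvals[j] = fit.pvalues[-1]      # p-value of the candidate feature
            if pvals:
                best = min(pvals, key=pvals.get)
                if pvals[best] < p_enter:
                    selected.append(best)
                    changed = True
            # Backward step: drop the least significant included feature.
            if selected:
                fit = sm.OLS(y, sm.add_constant(X[:, selected])).fit()
                p = np.asarray(fit.pvalues)[1:]   # skip the intercept term
                worst = int(np.argmax(p))
                if p[worst] > p_remove:
                    selected.pop(worst)
                    changed = True
            if not changed:
                break
        return selected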

 

SVM

Support Vector Machines have been successfully used in BCI research [7-9], as they are less sensitive to the dimensionality of the data than LDAs and are better able to avoid over-fitting. An SVM [1] tries to find a separating hyperplane that maximizes the distance to two parallel hyperplanes, one on each side of the separating hyperplane itself. These two hyperplanes contain the support vectors, that is, the training samples of each class that lie closest to the separating hyperplane. An SVM usually includes a penalty parameter C for regularization, which accounts for misclassifications due to outliers.

SVMs can be used both for linearly separable data (with a linear decision function) and for non-linearly separable data (with non-linear decision functions); in the second case, the data must be mapped into a higher-dimensional space by means of a kernel function, which can be polynomial, based on Radial Basis Functions, Gaussian or sigmoid. Usually, the best way to find the parameter C and the best kernel to adopt is a cross-validation procedure.

The SVM implementation used in the software released in this deliverable is LIBSVM [10], available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
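
As an example of the cross-validation procedure described above, scikit-learn's SVC class (itself built on LIBSVM) can be combined with a grid search to select C and the kernel. The grid values below are arbitrary placeholders, and X_train/y_train stand for the training features and labels:

    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    # Search for the penalty C and the kernel via 5-fold cross-validation.
    param_grid = {
        "C": [0.01, 0.1, 1, 10, 100],
        "kernel": ["linear", "rbf", "poly", "sigmoid"],
    }
    search = GridSearchCV(SVC(), param_grid, cv=5)
    search.fit(X_train, y_train)
    print(search.best_params_)     # e.g. {'C': 1, 'kernel': 'rbf'}
    clf = search.best_estimator_   # classifier refit with the best parameters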


ANN

Artificial Neural Networks [11] have been widely used for the classification of BCI data [12-15] as, thanks to their approximation properties, they can approximate virtually any function and thus solve complex classification problems.

An NN generally consists of several layers of neurons interconnected by weighted connections; the simplest NN consists of an input layer, an intermediate (hidden) layer and an output layer. In the BCI case, the inputs are the samples to be classified and the outputs are the predicted class labels. The idea behind an NN is that its parameters can be adjusted so that the NN exhibits the desired behaviour; the amount of adjustment at each step is determined by a factor called the learning rate. Each neuron is characterized by a transfer (activation) function, which determines the relationship between the neuron's weighted input activity and its output and can be linear, threshold, sigmoid, etc. Other parameters can be set in an NN, such as the maximum number of training epochs or the desired error threshold.

The ANN implementation used in the software released in this deliverable makes use of the free FANN library, available at http://leenissen.dk/fann/.
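
The sketch below shows how the parameters discussed above (hidden layer, activation function, learning rate, maximum number of epochs, error threshold) map onto a simple feed-forward network. It uses scikit-learn's MLPClassifier purely for illustration, whereas the released software relies on FANN, and all numeric values are arbitrary:

    from sklearn.neural_network import MLPClassifier

    # Input layer -> one hidden layer of 20 neurons -> output layer.
    net = MLPClassifier(hidden_layer_sizes=(20,),
                        activation="logistic",    # sigmoid transfer function
                        learning_rate_init=0.01,  # learning rate
                        max_iter=500,             # maximum training epochs
                        tol=1e-4)                 # desired error threshold
    net.fit(X_train, y_train)      # X_train: samples, y_train: class labels
    labels = net.predict(X_test)   # predicted class labels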


Bayesian Linear Discriminant Analysis (BLDA)

BLDA is a regularized version of Fisher's Linear Discriminant Analysis and has already been used for P300-based BCIs [16-17]. It can handle high-dimensional and noisy datasets, and is therefore less sensitive to the poor dimensionality of the data and to outliers. The main goal of Bayesian classification is to assign a feature vector to the class it belongs to with the highest probability.
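
As a rough illustration of the idea (not the exact algorithm of [16]), BLDA can be approximated as Bayesian linear regression onto ±1 class targets, with the regularization hyperparameters estimated from the training data by evidence maximization, so that no cross-validation is required. The sketch below uses scikit-learn's BayesianRidge and placeholder arrays X_train, y_train, X_test:

    import numpy as np
    from sklearn.linear_model import BayesianRidge

    # Regress the features onto +/-1 class targets; the weight prior and
    # noise precision are estimated from the data (evidence maximization).
    reg = BayesianRidge()
    reg.fit(X_train, np.where(y_train == 1, 1.0, -1.0))
    # Classify by the sign of the regression output.
    pred = np.where(reg.predict(X_test) > 0, 1, 2)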


Regularized Linear Discriminant Analysis (RLDA)

RLDA is a regularized version of FLDA. It regularizes the covariance matrices Σ of each class of data, which are needed for the calculation of the weights of the hyperplane and whose estimation is not reliable when the number of features is high with respect to the number of observations. A regularization parameter γ, belonging to the range [0, 1], needs to be found so that the new covariance matrices Σ(γ) take the following form:

Σ(γ) = (1 − γ)Σ + γνI,

where ν is defined as ν = trace(Σ)/d, d is the dimensionality of the feature space and I is the identity matrix. Note that for γ = 0 the problem turns back into an FLDA problem, while for γ = 1 the problem assumes that the covariance matrices are multiples of the identity matrix. Usually, a cross-validation procedure is needed to find the parameter γ.
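
In code, the regularization of a single class covariance matrix could be written as follows (a minimal NumPy sketch of the formula above, not the NPXLab implementation):

    import numpy as np

    def regularize_cov(S, gamma):
        # Shrink the class covariance S towards a multiple of the identity:
        # gamma = 0 leaves S unchanged (FLDA), gamma = 1 yields nu * I.
        d = S.shape[0]           # dimensionality of the feature space
        nu = np.trace(S) / d     # nu = trace(S) / d
        return (1 - gamma) * S + gamma * nu * np.eye(d)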


Shrinkage-Regularized Linear Discriminant Analysis (sRLDA)

The regularization implemented in this classifier is the one described by Blankertz et al. in [2]. The regularization parameter is computed with an analytic formula, so no cross-validation procedure is needed. This makes the method less time-demanding than a simple RLDA, even though the two achieve similar performance in terms of classification accuracy.
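
A similar analytic shrinkage (the Ledoit-Wolf estimate, in the same spirit as the formula-based regularization of [2]) is available off the shelf in scikit-learn and can serve as a hedged illustration; X_train, y_train and X_test are placeholders:

    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # shrinkage="auto" computes the regularization parameter analytically
    # (Ledoit-Wolf estimate), so no cross-validation loop is needed.
    clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
    clf.fit(X_train, y_train)
    labels = clf.predict(X_test)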


K-Nearest Neighbour (kNN)

kNN classifies a sample in the validation set on the basis of the k closest samples in the training set. Unlike the other classifiers described here, there is no real "training phase" for this classifier: it only requires storing the training samples and their class labels. Then, in the validation phase, a sample is assigned the most frequent label among its k closest training samples. The value of k can be chosen by means of a cross-validation procedure. Usually the Euclidean or the Mahalanobis distance is used as the distance metric.
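
For illustration, a kNN classifier with a Euclidean metric can be set up as follows (a scikit-learn sketch with placeholder data; k = 5 is arbitrary and would normally be selected by cross-validation, and metric="mahalanobis" could be used instead):

    from sklearn.neighbors import KNeighborsClassifier

    # "Training" only stores the samples and their class labels.
    knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
    knn.fit(X_train, y_train)
    labels = knn.predict(X_test)   # majority vote among the 5 closest samples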



REFERENCES


[1]        A. Webb, Statistical pattern recognition, Newnes, 1999.

[2]        B. Blankertz, S. Lemm, M. Treder, S. Haufe, and K. Müller, “Single-trial analysis and classification of ERP components - A tutorial,” NeuroImage, Jun. 2010.

[3]        E.W. Sellers, D.J. Krusienski, D.J. McFarland, T.M. Vaughan, and J.R. Wolpaw, “A P300 event-related potential brain-computer interface (BCI): the effects of matrix size and inter stimulus interval on performance,” Biological Psychology,  vol. 73, Oct. 2006, pp. 242-252.

[4]        F. Nijboer, E.W. Sellers, J. Mellinger, M.A. Jordan, T. Matuz, A. Furdea, S. Halder, U. Mochty, D.J. Krusienski, T.M. Vaughan, J.R. Wolpaw, N. Birbaumer, and A. Kübler, “A P300-based brain-computer interface for people with amyotrophic lateral sclerosis,” Clinical Neurophysiology: Official Journal of the International Federation of Clinical Neurophysiology,  vol. 119, Aug. 2008, pp. 1909-1916.

[5]        A. Furdea, S. Halder, D.J. Krusienski, D. Bross, F. Nijboer, N. Birbaumer, and A. Kübler, “An auditory oddball (P300) spelling system for brain-computer interfaces,” Psychophysiology,  vol. 46, May. 2009, pp. 617-625.

[6]        D.J. Krusienski, E.W. Sellers, D.J. McFarland, T.M. Vaughan, and J.R. Wolpaw, “Toward enhanced P300 speller performance,” Journal of Neuroscience Methods,  vol. 167, Jan. 2008, pp. 15-21.

[7]        M. Kaper, P. Meinicke, U. Grossekathoefer, T. Lingner, and H. Ritter, “BCI Competition 2003--Data set IIb: support vector machines for the P300 speller paradigm,” IEEE Transactions on Bio-Medical Engineering,  vol. 51, Jun. 2004, pp. 1073-1076.

[8]        A. Rakotomamonjy and V. Guigue, “BCI competition III: dataset II- ensemble of SVMs for BCI P300 speller,” IEEE Transactions on Bio-Medical Engineering,  vol. 55, Mar. 2008, pp. 1147-1154.

[9]        M. Thulasidas, C. Guan, and J. Wu, “Robust classification of EEG signal for brain–computer interface,” IEEE Transactions on Neural Systems and Rehabilitation Engineering,  vol. 14, 2006.

[10]        C.C. Chang and C.J. Lin, LIBSVM: a library for support vector machines, Citeseer, 2001.

[11]        J.C. Príncipe, N.R. Euliano, and W.C. Lefebvre, Neural and adaptive systems: fundamentals through simulations, Wiley, 2000.

[12]        D. Coyle, G. Prasad, and T. McGinnity, “Faster Self-Organizing Fuzzy Neural Network Training and a Hyperparameter Analysis for a Brain–Computer Interface,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics),  vol. 39, 2009, pp. 1458-1471.

[13]        D. Coyle, “Neural network based auto association and time-series prediction for biosignal processing in brain-computer interfaces,” IEEE Computational Intelligence Magazine,  vol. 4, 2009, pp. 47-59.

[14]        E. Haselsteiner and G. Pfurtscheller, “Using time-dependent neural networks for EEG classification,” IEEE Transactions on Rehabilitation Engineering,  vol. 8, Dec. 2000, pp. 457-463.

[15]        H. Cecotti and A. Graser, “Convolutional Neural Networks for P300 Detection with Application to Brain-Computer Interfaces,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Jun. 2010.

[16]        U. Hoffmann, J. Vesin, T. Ebrahimi, and K. Diserens, “An efficient P300-based brain-computer interface for disabled subjects,” Journal of Neuroscience Methods,  vol. 167, 2008, pp. 115-125.

[17]        J. Jin, B.Z. Allison, C. Brunner, B. Wang, X. Wang, J. Zhang, C. Neuper, and G. Pfurtscheller, “P300 Chinese input system based on Bayesian LDA,” Biomedizinische Technik (Berlin),  vol. 55, 2010, pp. 5-18.
