Rami Khushaba, PhD

Tutorial-1: EMG/EEG Channel and Feature Selection with DEFS.

2/24/2015

 
[Figure: organization of the feature matrix]
Suppose we have a certain number of EMG or EEG sensors, or simply channels; we refer to the total number of channels as Nc. From each channel we usually extract a number of statistical descriptors that we call features, and we refer to the total number of features as Nf. Channel selection is the process of looking for the combination of channels that works best together. For example, in a problem with 8 channels you may find that channels 1, 5, 7, and 8 are the 4 channels that work best together (just a naive example). Be aware, though, that when you perform channel selection you are using all of the features extracted from the selected channels. In our example, with 5 features per channel and 4 selected channels, we end up with 20 features (5 features/channel x 4 channels).
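As a small sketch of the arithmetic above, the lines below turn a list of selected channels into the feature columns they bring along. The variable names are made up for illustration, and the layout is an assumption: the features of channel c are taken to sit in consecutive columns (c-1)*NFPC+1 to c*NFPC, which may not match the layout used by the DEFS code.

```matlab
% Assumed layout (not confirmed by the post): the NFPC features of channel c
% occupy columns (c-1)*NFPC+1 ... c*NFPC of the feature matrix.
NFPC = 5;                          % features per channel
selectedChannels = [1 5 7 8];      % the naive example from the text

% First column of each selected channel, minus one
offsets = (selectedChannels - 1) * NFPC;

% NFPC-by-4 grid of column indices, flattened channel by channel
featIdx = reshape(bsxfun(@plus, (1:NFPC)', offsets), 1, []);

numSelectedFeatures = numel(featIdx);   % 5 features/channel x 4 channels
```

Under that assumption, a reduced matrix would be obtained with something like X(:, featIdx), where X is a hypothetical samples-by-features matrix.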

Differential Evolution Feature Selection (DEFS) is one candidate method that can be used for this kind of selection. It is a population-based evolutionary search, closely related to genetic algorithms (GA), that looks for the combination of channels that works best together. This is unlike greedy methods such as sequential forward selection and sequential backward selection (SFS and SBS), which build the subset one channel at a time. SFS, for example, selects the single best channel first, then the channel that works best with the first one, then the channel that works best with the first two, and so on. This is a problem because in some cases, say, channels 1 and 5 perform poorly when assessed individually but perform really well together because they complement each other. SFS might find that channel 5 is not the best first channel, select channel 8 instead, and build the entire solution around channel 8, at which point the optimal solution has already been missed. DEFS, on the other hand, looks for the best combination rather than assessing channels individually.
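The failure case described above can be reproduced on a toy problem. The sketch below is not DEFS and uses no real data: toyErr is a made-up error function in which channels 1 and 5 are the two worst channels individually but give a very low error together. A greedy forward selection of 2 channels is compared with an exhaustive search over all pairs, which stands in here for any method that evaluates combinations.

```matlab
% Made-up individual error rates for 8 channels (channel 8 is best alone)
indivErr = [0.40 0.30 0.32 0.35 0.42 0.33 0.31 0.25];

% Toy error of a subset s: the best individual error in s, except that
% channels 1 and 5 together reach an error of 0.05
toyErr = @(s) min([indivErr(s), 0.05 + ~all(ismember([1 5], s))]);

% Greedy forward selection of 2 channels
selected = [];
for k = 1:2
    remaining = setdiff(1:8, selected);
    errs = arrayfun(@(c) toyErr([selected c]), remaining);
    [~, best] = min(errs);
    selected = [selected remaining(best)];
end
sfsErr = toyErr(selected);

% Exhaustive search over all pairs of channels
pairs = nchoosek(1:8, 2);
pairErr = arrayfun(@(i) toyErr(pairs(i, :)), (1:size(pairs, 1))');
[bestErr, iBest] = min(pairErr);
bestPair = pairs(iBest, :);
```

The greedy search commits to channel 8 in its first step and never reaches the pair [1 5], while the search over combinations finds it. Exhaustive search is only feasible for small problems, which is the motivation for a guided search such as DEFS.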

Be aware that optimization algorithms like DEFS are made of two parts: a search engine and a fitness function. DEFS employs a specific variation of differential evolution (DE), which we developed in our paper, as the search engine, and uses the classification error rate as the fitness function. You are free to use any other fitness function you like, e.g., mutual information or Fisher score. DEFS is also a population-based method, i.e., you need to define the population size before using it.
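To make the idea of a fitness function concrete, here is a minimal sketch that scores one candidate feature subset by its classification error rate. It is not the fitness code shipped with DEFS: a simple nearest-centroid classifier stands in for LDA, and the tiny data set and all variable names (Xtr, ytr, Xts, yts, subset) are invented for illustration.

```matlab
% Invented toy data: rows are samples, columns are 4 features
Xtr = [0 5 0 1; 1 2 1 7; 0 9 1 3; 5 4 6 2; 6 8 5 9; 5 1 6 4];
ytr = [1; 1; 1; 2; 2; 2];
Xts = [1 7 0 5; 6 3 5 1];
yts = [1; 2];

subset = [1 3];                    % candidate feature subset to score

% Train: one centroid per class, using only the candidate features
classes = unique(ytr);
centroids = zeros(numel(classes), numel(subset));
for k = 1:numel(classes)
    centroids(k, :) = mean(Xtr(ytr == classes(k), subset), 1);
end

% Test: assign each test sample to the nearest centroid
pred = zeros(size(yts));
for i = 1:size(Xts, 1)
    d = sum(bsxfun(@minus, centroids, Xts(i, subset)).^2, 2);
    [~, j] = min(d);
    pred(i) = classes(j);
end

% Fitness value: fraction of misclassified test samples (lower is better)
errRate = mean(pred ~= yts);
```

A search engine would call a routine like this once per candidate subset and keep the subsets with the lowest error. Swapping in another criterion, such as mutual information, only changes how the score is computed.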
 
Now, to run the DEFS code (available in the Matlab Code section), you need to define a few variables as follows:

% How many channels do you want to select out of the total number of channels? Let's say 5
DNC = 5;                % desired number of channels

% Population size; 50 is usually enough. A larger population tends to find the
% optimal solution in fewer iterations, but each iteration takes longer to run
PSIZE = 50;

% Set to 1 if you are loading stored data, 0 otherwise
Ld = 0;

% Select the classifier used by the fitness function (LDA, for example).
% There are 3 classifiers, stored as 0, 1, and 2
classif = 0;

% Number of iterations allowed; increase this for huge problems
GEN = 50;

% Number of features per channel (NFPC), 5 as per the example above
NFPC = 5;

% run the DEFS_chs
[ErrCH,SubsetCH] = DEFS_chs(data_tr,data_ts,DNC,PSIZE,Ld,classif,GEN,NFPC);

When the algorithm stops, the indices of the selected channels will be stored in SubsetCH and the associated error rate will be stored in ErrCH.

If you are interested in feature selection instead, we have made a small change to the code so that it selects the best combination of features from all channels. Does this mean that feature selection could perform better than channel selection? The answer is yes, it can.

DNF = 10; % This is the desired number of features to be selected out of the total Nf.

[ErrFS,SubsetFS] = DEFS(data_tr,data_ts,DNF,PSIZE,Ld,'LDA',GEN);

When the algorithm stops, the indices of the selected features will be stored in SubsetFS and the associated error rate will be stored in ErrFS.

Here data_tr is the training feature matrix, organized as per the figure above, and data_ts is the testing feature set, organized the same way. For formal testing, we usually have a training set, a validation set, and a testing set: you train and validate while keeping the test set aside, and only at the end test the selected feature subset on the test set.
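One possible way to set up the three sets is sketched below. The 60/20/20 proportions, the fixed random seed, and the names nTrials, trIdx, valIdx, and tsIdx are assumptions made for illustration, not values prescribed by DEFS.

```matlab
rng(1);                            % fixed seed so the split is repeatable
nTrials = 100;                     % hypothetical number of rows (samples)
perm = randperm(nTrials);          % shuffle the row indices

nTr  = round(0.6 * nTrials);       % assumed 60% for training
nVal = round(0.2 * nTrials);       % assumed 20% for validation

trIdx  = perm(1:nTr);              % rows used to train
valIdx = perm(nTr+1 : nTr+nVal);   % rows used to score subsets during the search
tsIdx  = perm(nTr+nVal+1 : end);   % rows held out until the very end
```

With a split like this, the rows in trIdx and valIdx would play the roles of data_tr and data_ts during the search, and the rows in tsIdx would only be used once, to evaluate the subset that the search returns.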

Code here
