By Ahmed Mohamed Hemaly Mohamed Nour El-Din Ismail Osama Mohamed Mohamed Raghda Hassan Mohamed Rania Nabil Refaat Walid Ezzat Shouman
Supervisor Dr. Seif Eldawlatly
Submitted in Partial Fulfillment of the Requirements for the B.Sc. Degree,
Computer and Systems Engineering Department, Faculty of Engineering, Ain Shams University, Cairo, Egypt
June 2014
Student Statement Members of the Braingizer team are committed to ethical standards in designing and implementing the project. We are committed to: 1. The Open-Source Community: Braingizer is an open-source project that is fully available for developers and researchers to use. 2. Serving Society: Braingizer targets people with motor disabilities whose brains function well but who cannot move or express their needs.
Project Abstract Recent advances in brain activity recording technologies have enabled a multitude of applications that could help people with motor disabilities. This project aims at controlling the movement of a wheelchair by learning the corresponding electroencephalogram (EEG) features recorded from the scalp. The project capitalizes on the extensive research that has been carried out on using spectral analysis of EEG activity to estimate different physiological and psychological states. Spectral signatures that correspond to different movement directions are extracted from the recorded EEG. Such signatures are used to train different machine learning algorithms that are subsequently used to decode the user's EEG activity and infer the intended movement. The inferred movement directions are then used to control the movement of a wheelchair through a single-board computer (Radxa board) mounted on the wheelchair. The project uses the Emotiv EEG recording headset to record brain activity. The wheelchair also has ultrasonic sensors, which protect the user from hitting obstacles and help the motion algorithm make intelligent decisions based on the surrounding obstacles and the decoded EEG signals. Our results indicate an acceptable performance in controlling the wheelchair using brain activity.
Acknowledgement First and foremost, we would like to thank our project supervisor, Dr. Seif Eldawlatly, for his helpful assistance, support and guidance. His willingness to motivate us contributed tremendously to our project. We would also like to thank him for his huge effort in staying in continuous contact and providing us with everything we needed: information, recent updates relevant to the graduation project, and his beneficial lectures. Besides, we would like to thank the Information Technology Industry Development Agency (ITIDA) for believing in us and funding the Braingizer project. We would also like to thank the open-source community for their help, especially Ozan Çağlayan, the python-emotiv developer. Finally, an honorable mention goes to our families and friends for supporting us.
Contents Preface .................................................................................................................................. 1 Chapter 1: Biomedical Introduction ....................................................................................... 3 1.1 Introduction...................................................................................................................... 4 1.2 Target segment ................................................................................................................ 4 1.3 Brain Structure [3] ........................................................................................................... 5 1.3.1 Cerebrum ........................................................................................................................ 5 1.4 BCI..................................................................................................................................... 7 1.4.1 Neural Interfaces ............................................................................................................ 8 1.4.2 Mental Strategies and Brain Patterns ............................................................................ 9 Chapter 2: Signal processing and Feature extraction ............................................................ 13 2.1 Introduction.................................................................................................................... 14 2.2 Signal processing: ........................................................................................................... 14 2.2.1 Noise Removal: Common Average Rejection (CAR) ..................................................... 14 2.2.2 Filtering ......................................................................................................................... 15 2.3 Preprocessing methods .................................................................................................. 
18 2.3.1 Principal Component Analysis (PCA): ........................................................................... 18 2.3.2 Linear Discriminant Analysis (LDA) ............................................................................... 19 2.3.3 Independent Component Analysis (ICA): ..................................................................... 21 2.3.4 Common Spatial Pattern (CSP): .................................................................................... 22 2.3.5 PCA vs ICA vs LDA: ........................................................................................................ 23 2.4 Feature extraction: ......................................................................................................... 24 2.4.1 Discrete Fourier Transform (DFT) vs Welch's Method: ................................................ 24 2.4.2 Applying some features with DFT [14]: ........................................................................ 25 2.4.3 Wavelet Packet Decomposition (WPD): ....................................................................... 26 2.4.4 Auto-Regressive model (AR): ........................................................................................ 28 Chapter 3: Machine Learning ............................................................................................... 29 3.1 Motivation ...................................................................................................................... 30 3.2 Machine Learning Types ................................................................................................ 31 3.2.1 Supervised Learning ..................................................................................................... 31 3.2.2 Unsupervised Learning ................................................................................................. 
31 3.2.3 Reinforcement learning ................................................................................................ 32 3.3 Classifiers Types ............................................................................................................. 33 3.3.1 Linear Classifiers ........................................................................................................... 33 3.3.2 Non-linear Classifiers .................................................................................................... 34 3.3.3 Which to Use? ............................................................................................................... 35 3.4 Linear Classifiers: ............................................................................................................ 36 3.4.1 Least-squares classifier: ................................................................................................ 36 3.4.2 Fisher's Linear Discriminant Classifier: ......................................................................... 36 3.4.3 Perceptron Classifier:.................................................................................................... 37 3.5 Non-linear Classifiers:..................................................................................................... 39 3.5.1 Maximum Likelihood Discriminant Classifier: .............................................................. 39 3.5.2 KNN: .............................................................................................................................. 40 Chapter 4: Training, Detection and Analysis ......................................................................... 43 4.1 Introduction.................................................................................................................... 44 4.2 Braingizer Trainer ........................................................................................................... 
45 4.2.1 Introduction .................................................................................................................. 45 4.2.2 How to use? .................................................................................................................. 46 4.2.3 Application Architecture ............................................................................................... 48 4.2.4 Design Aspects .............................................................................................................. 48 4.2.5 Training file format ....................................................................................................... 49 4.3 Braingizer Detector ........................................................................................................ 50 4.3.1 Introduction .................................................................................................................. 50 4.3.2 How to use? .................................................................................................................. 51 4.3.3 Connect to the wheelchair ........................................................................................... 53 4.3.4 Run on the Radxa board ............................................................................................... 54 4.3.5 Application Architecture ............................................................................................... 54 4.3.6 Design Aspects .............................................................................................................. 55 4.3.7 Detection file format .................................................................................................... 56 4.4 Braingizer OnlineDetector .............................................................................................. 57 4.4.1 Introduction .................................................................................................................. 
57 4.4.2 Run on the Radxa board ............................................................................................... 57 4.5 BAnalyzer ........................................................................................................................ 58 4.5.1 Motivation .................................................................................................................... 58 4.5.2 Installing........................................................................................................................ 58 4.5.3 Usage ............................................................................................................................ 59 4.5.4 Statistical Results .......................................................................................................... 66 4.6 Linux SDK ........................................................................................................................ 66 Chapter 5: Wheelchair System ............................................................................................. 67 5.1 Embedded Linux ............................................................................................................. 68 5.1.1 Introduction .................................................................................................................. 68 5.1.2 Raspberry-Pi Board ....................................................................................................... 68 5.1.3 Radxa-Rock Board ......................................................................................................... 69 5.2 Control ............................................................................................................................ 70 5.2.1 Arduino with servo motor ............................................................................................ 
70 5.2.2 Arduino with Ultrasonic ................................................................................................ 70 5.2.3 Arduino with Drivers ..................................................................................................... 71 5.3 Power Supply .................................................................................................................. 73 5.3.1 Power supply & charging circuit and connections ....................................................... 73 Chapter 6: Project Results .................................................................................................... 75 6.1 Introduction.................................................................................................................... 76 6.2 Methodologies ............................................................................................................... 76 6.2.1 Default Setup ................................................................................................................ 76 6.2.2 Limb Movement ........................................................................................................... 77 6.2.3 Motor Imagery .............................................................................................................. 77 6.2.4 Mathematical Operations............................................................................................. 78 6.2.5 Continuous Muscle Movement .................................................................................... 79 6.2.6 Facial Expression ........................................................................................................... 80 6.2.7 Headset Tilting .............................................................................................................. 81 6.2.8 Varying Detection Duration .......................................................................................... 
82 6.2.9 Summary ....................................................................................................................... 82 6.3 Conclusion ...................................................................................................................... 83 References .......................................................................................................................... 85
List of Figures and Tables Figure 1-1: Brain Structure .............................................................................................................. 5 Figure 1-2: Left and Right brain functions. ..................................................................................... 5 Figure 1-3: Cerebrum lobes ............................................................................................................ 6 Figure 1-4 cerebral cortex ............................................................................................................... 7 Figure 1-5 Brain Homunculus.......................................................................................................... 7 Figure 1-8 Invasive BCI .................................................................................................................... 8 Figure 1-9 Non-invasive BCI ............................................................................................................ 8 Figure 1-10 resolution versus recording method ........................................................................... 9 Figure 1-12 ERD/ERS ..................................................................................................................... 11 Figure 1-13 electrodes location .................................................................................................... 11 Figure 2-1: Common Average Rejection ....................................................................................... 14 Figure 2-2: Filter types .................................................................................................................. 16 Figure 2-3: Filter order effect ........................................................................................................ 16 Figure 2-4: IIR Filter ....................................................................................................................... 
17 Figure 2-7: Data after LDA transform (two components) for the first session of first subject. ... 21 Figure 2-5: ICA estimation example. ............................................................................................. 22 Figure 2-8: ERD and ERS on C3 and C4 in band power 11-13 Hz. ................................................. 26 Figure 2-9: The structures of wavelet decomposition. ................................................................. 26 Figure 2-10: The structures of WPD.............................................................................................. 27 Figure 3-1: Features separability .................................................................................................. 30 Figure 3-2 Unsupervised learning example. ................................................................................. 31 Figure 3-3: Linear classifiers example ........................................................................................... 33 Figure 3-4: Non-linear classification ............................................................................................. 34 Figure 3-5: Non-linear classifier of high order with separable data. ............................................ 34 Figure 3-6: Feature dimensionality ............................................................................................... 35 Figure 4-1: Braingizer's main programs ........................................................................................ 44 Figure 4-2: A volunteer is performing a training session. ............................................................. 45 Figure 4-3: The training program. ................................................................................................. 45 Figure 4-4: Shell-script run.sh output. .......................................................................................... 46 Figure 4-5: Connect button. 
.......................................................................................................... 46 Figure 4-6: Trainer Main Window. ................................................................................................ 46 Figure 4-7: Welcome Screen. ........................................................................................................ 47 Figure 4-8: Random blue arrow. ................................................................................................... 47 Figure 4-9: Application Structure. ................................................................................................. 48 Figure 4-10: The CSV file for a training session. ........................................................................... 49 Figure 4-11: The Detector Program. ............................................................................................. 50 Figure 4-12: Detector Main-Window. ........................................................................................... 51 Figure 4-13: Welcome Screen. ...................................................................................................... 52 Figure 4-14: The user is asked to perform a right action.............................................................. 52 Figure 4-15: Accuracy bar. ............................................................................................................ 53 Figure 4-16: The program is searching for the motors over ports. .............................................. 53 Figure 4-17: Application Structure. ............................................................................................... 55 Figure 4-18: Selecting Classifier Script. ......................................................................................... 55 Figure 4-19: Detection CSV file. .................................................................................................... 
56 Figure 4-20 install git and python ................................................................................................. 58 Figure 4-21 install dependencies .................................................................................................. 58 Figure 4-22 run bAnalyzer ............................................................................................................. 59 Figure 4-23 Main windows GUI .................................................................................................... 59 Figure 4-24 SelectionTraining path ............................................................................................... 60 Figure 4-25 generating bAnalyzer GUI .......................................................................................... 60 Figure 4-26 Selection detection mode .......................................................................................... 61 Figure 4-27 single mode result ..................................................................................................... 61 Figure 4-28 Bulk mode GUI ........................................................................................................... 62 Figure 4-29 select bulk paths ........................................................................................................ 62 Figure 4-30 check whether to update Google spreadsheet or not .............................................. 62 Figure 4-31 bulk mode result ........................................................................................................ 63 Figure 4-32 features preview screen ............................................................................................ 63 Figure 4-33 features plot .............................................................................................................. 
64 Figure 4-35 features navigation GUI ............................................................................................. 64 Figure 4-34 features main UI ........................................................................................................ 64 Figure 4-36 select same file data for detection ............................................................................ 65 Figure 4-37 select ALL ................................................................................................................... 65 Figure 4-38 select certain offset ................................................................................................... 65 Figure 4-39 cross validation .......................................................................................................... 66 Figure 4-40 configuration files ...................................................................................................... 66 Figure 5-1: Raspberry-Pi Board. .................................................................................................... 68 Figure 5-2: Radxa-Rock Board. ...................................................................................................... 69 Figure 5-3: Servo motor connections. .......................................................................................... 70 Figure 5-4: Ultrasonic connections. .............................................................................................. 71 Figure 5-5: Fixing the ultrasonic to the servo-motor. ................................................................... 71 Figure 5-6: PWM ........................................................................................................................... 72 Figure 5-7: 12-volts Lead-Acid Battery.......................................................................................... 73 Figure 5-8: Solar Charge Control. 
.................................................................................................. 73
Preface This documentation introduces the use of machine learning in creating a Brain-Computer Interface (BCI) system, represented in a brain-controlled wheelchair, together with some important concepts that we think are mandatory for anyone who wants to tackle this field and, more importantly, some practices, tooling and techniques we have heavily relied on through the journey of creating our BCI-based wheelchair. We have tried to make this documentation as clear and as easy to understand as possible. However, a reader who seeks to recreate the whole project may need some advanced skills in biomedical engineering, embedded Linux, vector-matrix manipulation, electrical circuits, software engineering and microcontroller programming. The reader is assumed to have basic knowledge of electrical engineering. Throughout this book there is introductory material about biomedical engineering, signal processing, feature extraction and machine learning; these are covered in Chapters 1, 2 and 3. Chapter 4 illustrates the software tools we developed to train on, detect and analyze brain signals. Chapter 5 covers the details of the wheelchair system. Chapter 6 presents the results we acquired by analyzing different subject actions. Chapter 7 discusses some future work. Special effort has been made to provide many project tutorials and how-tos, so that the reader has a clear understanding of what it takes to create a brain-controlled wheelchair. The project has three main parts, shown in the following block diagram.
Figure 0-1: Project's Block Diagram
One of these parts is the Radxa board, which is a single-board computer. This board implements the machine learning algorithms that classify the detected data and send commands for the wheelchair to move in the direction intended by the user. We believe in the power of open source and used free software during the development of the project, so we hacked some tools and created others. We now have our own set of software tools that were used for training, detection and analysis [1].
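As an illustration of the board's last step, the sketch below maps a decoded class label to a wheelchair command while letting the ultrasonic sensors override forward motion. The command names, labels and override rule here are hypothetical stand-ins, not the project's actual protocol:

```python
# Hypothetical mapping from a decoded class label to a wheelchair command.
COMMANDS = {0: "FORWARD", 1: "LEFT", 2: "RIGHT", 3: "STOP"}

def to_command(label, obstacle_ahead=False):
    """Translate a classifier output into a motor command.

    An obstacle reported by the ultrasonic sensors overrides forward
    motion; unknown labels fall back to a safe STOP.
    """
    if obstacle_ahead and COMMANDS.get(label) == "FORWARD":
        return "STOP"
    return COMMANDS.get(label, "STOP")

print(to_command(0))                       # -> FORWARD
print(to_command(0, obstacle_ahead=True))  # sensor override -> STOP
```

In the real system the resulting command would be sent to the motor drivers; here the point is only that the sensor reading and the decoded EEG label are combined before any motion is issued.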
Figure 0-2: Braingizer's main programs The algorithms used in this project were first tested on the BCI Competition dataset (IIIa) [2], which is a motor imagery dataset. After making sure that the algorithms worked correctly, they were tested on real subjects with different methods. Some methods, with specific actions from the user, reached 100% accuracy; others reached 60%. More details about each method used and the resulting accuracy are given later in this book.
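Offline testing of this kind amounts to estimating classification accuracy on a labeled dataset. The following is a minimal sketch using k-fold cross-validation with a nearest-centroid classifier on synthetic two-class features; the classifier, the synthetic data and all names are illustrative stand-ins, not the project's actual pipeline:

```python
import numpy as np

def kfold_accuracy(X, y, k=5, seed=0):
    """Estimate classification accuracy with k-fold cross-validation,
    using a simple nearest-centroid classifier as a stand-in model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        classes = np.unique(y[train])
        # one centroid per class, computed from the training folds only
        centroids = np.array(
            [X[train][y[train] == c].mean(axis=0) for c in classes])
        # predict the class whose centroid is nearest (Euclidean distance)
        dists = np.linalg.norm(X[test, None, :] - centroids[None, :, :], axis=2)
        pred = classes[dists.argmin(axis=1)]
        accuracies.append((pred == y[test]).mean())
    return float(np.mean(accuracies))

# toy stand-in for spectral features of two well-separated classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 4)), rng.normal(3.0, 1.0, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
print("cross-validated accuracy:", kfold_accuracy(X, y))
```

The same evaluation loop applies whether the features come from a public competition dataset or from recordings of real subjects; only the feature extraction in front of it changes.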
1.1 Introduction Biomedical engineering (BME) is the application of engineering principles and design concepts to medicine and biology for healthcare purposes. It combines the design and problem-solving skills of engineering with medical and biological sciences to advance health care treatment, including diagnosis, monitoring, and therapy. Biomedical engineering has recently gained great momentum compared to many other engineering fields: the rate of research publications has increased, and many applications are available nowadays. Much of the work in biomedical engineering consists of research and development, spanning a broad array of subfields. Prominent biomedical engineering applications include the development of bionics, various diagnostic and therapeutic medical devices ranging from clinical equipment to micro-implants, common imaging equipment such as MRI and EEG, and therapeutic biologicals. 1.2 Target segment One of our biggest fears is being trapped in a small, enclosed space. Imagine being trapped inside your own body. Most wheelchair users have completely functional cognitive ability; their only problem is the inability to move. Wheelchairs are the most common assistive or mobility devices for enhancing mobility with dignity, and enhancing the quality of life of their users has gained broad interest over time. This segment consists of about 700 million users around the globe (1% of the total population) [1-1], and this percentage increases with time, which could become a real issue in the future. Some of those users have severe spinal cord damage that may result in a total inability to control even their own wheelchair, even though the user's brain is most probably well-functioning, and we believe that their disability should not be the obstacle preventing them from traveling even long distances.
This kind of damage has had a great effect in increasing the use of brain-machine interfaces (BMI) in research and development to give those people a more independent and dignified life. The positive thing is that the ongoing research trends are moving swiftly. However, there is no current solution that gives wheelchair users total independence throughout their whole lives.
1.3 Brain Structure [3] The brain is a large soft mass of nervous tissue and has three major parts (Figure 1-1):
Figure 1-1: Brain Structure 1.3.1 Cerebrum The cerebrum fills up most of your skull. It is involved in remembering, problem solving, thinking, and feeling. It also controls movement. The cerebrum is divided into two hemispheres and four lobes, each of which specializes in different functions. The two hemispheres are (Figure 1-2): a- Left hemisphere b- Right hemisphere
Figure 1-2: Left and Right brain functions.
The brain is divided into two halves called hemispheres. There is evidence that each brain hemisphere has its own distinct functions.
The left side of the brain is considered to be adept at tasks that involve logic, language and analytical thinking. The left brain is often described as being better at: language, logic, critical thinking, numbers and reasoning. According to the left-brain/right-brain dominance theory, the right side of the brain is best at expressive and creative tasks. Some of the abilities that are popularly associated with the right side of the brain include: recognizing faces, expressing emotions, music, reading emotions, color, images, intuition and creativity. The left hemisphere is responsible for the right side of the body and vice versa.
The cerebrum lobes (Figure 1-3):
Figure 1-3: Cerebrum lobes The frontal lobe is responsible for initiating voluntary movement, analyzing sensory experiences, providing responses relating to personality, and mediating responses related to memory, emotions, reasoning, judgment, planning, and speaking. The parietal lobes respond to stimuli from cutaneous (skin) and muscle receptors throughout the body. The temporal lobes interpret some sensory experiences, store memories of auditory and visual experiences, and contain auditory centers that receive sensory neurons from the cochlea of the ear. The occipital lobes integrate eye movements by directing and focusing the eye and are responsible for correlating visual images with previous visual experiences and other sensory stimuli.
The outer layer of the cerebrum is called the cerebral cortex. It has four major areas (Figure 1-4): a- Motor Cortex b- Somatosensory Cortex
c- Auditory Cortex d- Visual Cortex
Figure 1-4 cerebral cortex - Motor Cortex: Plans and executes voluntary movements - Somatosensory Cortex: Receives and processes information related to touch - Auditory Cortex: Receives and processes information coming from the ear - Visual Cortex: Receives and processes information coming from the eyes
The motor cortex and the somatosensory cortex are somatotopically organized: each part of them corresponds to a certain part of the body (the Homunculus) (Figure 1-5).
Electrical stimulation of a certain area in the motor cortex causes movements of the corresponding body part.
Figure 1-5: Brain Homunculus
1.4 BCI
A brain-computer interface (BCI) [4], sometimes called a mind-machine interface (MMI), is an artificial system that bypasses the body's normal efferent pathways, which are the neuromuscular output channels.
BCI systems directly measure brain activity associated with the user's intent and translate the recorded brain activity into corresponding control signals required for applications. This translation involves signal processing and pattern recognition, which is typically done by a computer. Since the measured activity originates directly from the brain and not from the peripheral systems or muscles, the system is called a Brain-Computer Interface. BCIs measure brain activity, process it, and produce control signals that reflect the user's intent. A BCI can enable a person suffering from paralysis to write a book or control a motorized wheelchair or prosthetic limb through thought alone. Current brain-interface devices require deliberate conscious thought. One of the biggest challenges in developing BCI technology has been the development of electrode devices and/or surgical methods that are minimally invasive.
1.4.1 Neural Interfaces
A neural interface is how we read and record brain signals. There are two ways to do that: invasive and non-invasive.
1. Invasive: measuring brain activity with surgery. Invasive BCIs are directly implanted into the grey matter of the brain during neurosurgery. They produce the highest quality signals among BCIs.
2. Non-invasive: measuring brain activity without surgery, using neuroimaging technologies as the interface. Signals recorded this way have poorer resolution, but non-invasive recording is better for development and easier to use since it requires no surgery.
Figure 1-6: Invasive BCI
Figure 1-7: Non-invasive BCI
The figure below shows the resolution versus the recording method used.
Figure 1-8: Resolution versus recording method
1.4.2 Mental Strategies and Brain Patterns
The mental strategy is the main required foundation for BCI systems. The mental strategy determines what the user has to do to volitionally produce brain patterns that the BCI can interpret. It also sets certain constraints on the hardware and software of a BCI, such as the signal processing techniques to be employed later. The amount of training required to successfully use a BCI also depends on the mental strategy. The most common mental strategies are selective (focused) attention and motor imagery.
1. Selective attention [5]
BCIs based on selective attention require external stimuli provided by the BCI system. The stimuli could be different tones, different tactile stimulations, or flashing lights with different frequencies; most such BCIs are based on visual stimuli. Each stimulus is associated with a command that controls the BCI application. Visual attention can be implemented with two different BCI approaches, named after the brain patterns they produce: P300 potentials and steady-state visual evoked potentials (SSVEP).
a- P300
P300 refers to a positive wave occurring about 300 ms after a stimulus is presented. The P300 is also associated with surprise. For instance, a grid of flashing letters is presented to the user, who fixes his/her attention on the specific letter he/she wants to spell. Each time that letter flashes, a P300 is recorded, and thus the computer can recognize which letter the user was attending to. This technique has allowed locked-in patients to communicate with the world, and it is considered the fastest brain-computer interface approach in terms of bit rate.
b- SSVEP
SSVEP systems are very simple, as they depend on data recorded from the occipital electrodes, which cover the areas responsible for visual information. If the screen is flickering at a certain rate (e.g., 30 Hz) and the user is staring at this screen, the recorded EEG signal will contain that same 30 Hz frequency or its multiples. This method is widely used in many applications because it is very easy to detect different flickering frequencies.
2. Motor imagery
Moving a limb or even contracting a single muscle changes brain activity in the cortex. In fact, the preparation of movement or the imagination of movement also causes changes in the sensorimotor rhythms (SMR). Brain oscillations are categorized according to specific frequency bands (delta: < 4 Hz, theta: 4-7 Hz, alpha: 8-12 Hz, beta: 12-30 Hz, gamma: > 30 Hz).
Alpha wave
Range: between 8 and 15 Hz.
Function: waves originating from the occipital lobe during wakeful relaxation with closed eyes. Alpha waves are reduced with open eyes, drowsiness, and sleep.
Mu wave
Range: between 7.5 and 12.5 Hz.
Function: Mu wave patterns appear when the user performs a motor action or, with practice, when he or she visualizes performing a motor action.
Beta wave
Range: between 12.5 and 30 Hz.
Function: this frequency band represents active, busy, or anxious thinking and active concentration.
ERD/ERS patterns can be volitionally produced by motor imagery, which is the imagination of movement without actually performing the movement. ERD (event-related desynchronization) is a decrease of activity in a specific band; ERS (event-related synchronization) is an increase of activity in a specific band.
Figure 1-9: ERD/ERS. The C3 electrode represents activity produced by right-hand movement imagery, C4 represents left-hand movement imagery, and CZ represents foot imagery activity.
Figure 1-10: Electrode locations
In contrast to BCIs based on selective attention, BCIs based on motor imagery do not depend on external stimuli. However, motor imagery is a skill that has to be learned. BCIs based on motor imagery usually do not work very well during the first session; unlike BCIs based on selective attention, some training is necessary.
Chapter 2: Signal Processing and Feature Extraction
2.1 Introduction
This chapter summarizes the phase before classification, where noise is removed and the needed spectrum range is extracted, going through the pre-processing step where the data is enhanced in terms of dimensionality, artifact removal, and separability.
2.2 Signal processing
2.2.1 Noise Removal: Common Average Rejection (CAR)
CAR stands for Common Average Rejection (also commonly called Common Average Reference). When external noise occurs, it is applied to all channels of the EEG, so when taking the average of all channels, the channel-specific signals tend to cancel each other while the common noise remains; the average therefore approximates the noise. Having this noise estimate (the average of all channels), you can subtract it from every channel. The noise may come from external electromagnetic waves (the 220 V AC line), or from a sudden movement of the head that generates a signal across all the channels.
Figure 2-1: Common Average Rejection
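As a concrete illustration, the CAR step described above can be sketched in a few lines of NumPy. This is a minimal sketch, not the project's actual pipeline; the channel count, noise frequency, and sampling rate below are made-up test values.

```python
import numpy as np

def common_average_rejection(eeg):
    """Subtract the across-channel average from every channel.

    eeg: array of shape (n_channels, n_samples). The average over
    channels approximates noise common to all electrodes."""
    return eeg - eeg.mean(axis=0, keepdims=True)

# Toy demo: 3 channels sharing the same 50 Hz interference
fs = 128
t = np.arange(256) / fs
common_noise = 0.8 * np.sin(2 * np.pi * 50 * t)
eeg = np.random.default_rng(0).standard_normal((3, t.size)) + common_noise
clean = common_average_rejection(eeg)
```

After CAR, the across-channel mean is exactly zero at every sample, so any component that is identical on all channels (such as the shared interference above) is removed.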
2.2.2 Filtering
1. Introduction
Filters are used to remove or extract certain frequency bands. Why do we need filters? Because we may want to remove noise (especially the 50 Hz AC line frequency), and we may need to extract a frequency band relevant to the study, such as the alpha, mu, beta, and gamma bands (stated in Chapter 1). Other types of filters, as mentioned in [6] and [7], are explained in the next section. We use digital filters since we are working on digital hardware; digital filters can be Finite Impulse Response (FIR), Infinite Impulse Response (IIR), or ideal filters.
2. Filter types
a) Low pass: passes the low frequencies (lower than the cutoff frequency) and blocks the higher frequencies.
b) High pass: passes the high frequencies (higher than the cutoff frequency) and blocks the lower frequencies.
c) Band pass: passes the frequencies of a certain band (higher than Fc1 and lower than Fc2) and blocks the frequencies outside this band.
d) Band stop: blocks the frequencies of a certain band (higher than Fc1 and lower than Fc2) and passes the frequencies outside this band.
Figure 2-2: Filter types
3. Implementing filters
Filters can be implemented in FIR, IIR, or ideal form. In our project we used a Butterworth band-pass filter and an ideal filter. The main difference between them is that the ideal filter has a sharp edge, while the IIR Butterworth filter is smoother and gets sharper as you increase its order, as illustrated in Figure 2-3.
Figure 2-3: Filter order effect
In general, the IIR filter's input is time-domain samples and its output is filtered time-domain samples, so there is no need for a frequency transform. The main block diagram of the IIR filter, as illustrated in chapter 4 of [7], is as follows:
Figure 2-4: IIR Filter
All you need to construct your filter is to choose the right values for the coefficient vectors A and B, which you can obtain easily using the butter function in Octave.
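In Python, the counterpart of Octave's butter is scipy.signal.butter. The sketch below builds a Butterworth band-pass for the 8-30 Hz mu/beta range; the filter order, band edges, and toy signal are illustrative choices, not the project's exact settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 128                                            # sampling rate in Hz
b, a = butter(4, [8, 30], btype="bandpass", fs=fs)  # B and A coefficients

# Toy signal: 10 Hz (inside the band) plus 50 Hz line noise (outside it)
t = np.arange(4 * fs) / fs
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
y = filtfilt(b, a, x)                               # zero-phase filtering

spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(y.size, 1 / fs)
```

filtfilt runs the filter forward and backward, which cancels the IIR phase distortion at the price of applying the magnitude response twice.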
The ideal filter is implemented in the project by converting the time window to the frequency domain using the Fast Fourier Transform (FFT) algorithm, picking the needed frequencies, and then (if needed) returning the signal to the time domain using the Inverse Fast Fourier Transform (IFFT) algorithm; these algorithms are described in detail in [8]. Now we can extract specific bands to obtain the needed features.
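A minimal sketch of such an ideal filter using NumPy's FFT routines follows; the band edges and the synthetic test signal are example values only.

```python
import numpy as np

def ideal_bandpass(x, fs, f_lo, f_hi):
    """Ideal filter: FFT, zero every bin outside [f_lo, f_hi], then IFFT."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0
    return np.fft.irfft(spectrum, n=x.size)

fs = 128
t = np.arange(512) / fs
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 40 * t)
alpha_like = ideal_bandpass(x, fs, 8, 12)   # keeps only the 10 Hz component
```

Because both test frequencies fall exactly on FFT bins here, the 10 Hz component is recovered essentially unchanged; for real, non-bin-aligned signals the sharp cut introduces some ringing, which is the usual trade-off of ideal filters.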
2.3 Preprocessing methods
The goal of the preprocessing step is to apply methods to the data that ease selecting the most effective features and ease the subsequent analysis, by removing artifacts and unwanted data that may have a negative effect on the analysis results. Some of these methods project the data onto virtual directions in several ways to remove artifacts. Moreover, there are methods widely used for enhancing data before extracting features, such as linear discriminant analysis (LDA) and principal component analysis (PCA).
2.3.1 Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a classical technique in statistical data analysis, feature extraction, and data reduction. Given a set of multivariate measurements, the purpose is to find a smaller set of variables with less redundancy that gives as good a representation as possible. The redundancy is measured by correlations between data elements; therefore, the analysis can be based on second-order statistics only. PCA is useful for summarizing variables whose relationships are approximately linear or at least monotonic. Data reduction means summarizing data with many (p) variables by a smaller set of (k) derived (synthetic, composite) variables. Residual variation is the information in the original data that is not retained in the reduced representation, so there is a balancing act between clarity of representation and ease of understanding on one side, and oversimplification (loss of important or relevant information) on the other. PCA takes a data matrix of n objects by p variables, which may be correlated, and summarizes it by uncorrelated axes (principal components or principal axes) that are linear combinations of the original p variables; the first k components display as much as possible of the variation among objects.
PCA algorithm
Let D be the dimensionality of the data and M be the dimensionality of the projected space, where M < D. The M principal components are orthonormal. Consider the case M = 1, and let the direction of the projected space be u1, where u1^T u1 = 1. Each data point xn is then projected onto the new principal component space as u1^T xn.
Let the mean of the data points be x_bar = (1/N) sum_n xn. The mean of the projected data is then u1^T x_bar. The covariance of the data points is S = (1/N) sum_n (xn - x_bar)(xn - x_bar)^T, and the variance of the projected data is then u1^T S u1. The objective of PCA is to maximize the projected variance u1^T S u1 subject to the constraint u1^T u1 = 1. PCA can then be formulated as maximizing the following objective function:
u1^T S u1 + lambda1 (1 - u1^T u1)
Taking the derivative with respect to u1 and equating it with zero gives S u1 = lambda1 u1. Multiplying by u1^T from the left gives u1^T S u1 = lambda1. Therefore, maximizing the objective function is equivalent to finding the eigenvector corresponding to the largest eigenvalue.
2.3.2 Linear Discriminant Analysis (LDA)
Linear discriminant analysis (LDA) [9] is a well-known feature reduction technique. LDA is used to find a linear combination of features that can better separate two or more classes. LDA finds the directions that provide maximum linear separation of the classes. There are many possibilities for finding directions, but only some are optimal for data discrimination. A measure of data separation can be expressed as the maximum of the separation coefficient F (1):
F = Sm / Sw    (1)
where Sm denotes the between-class scatter and Sw the within-class scatter. The bigger the value of F in (1), the greater the probability of class separation. Let us assume that we have C classes, each containing N observations xi. The measure of within-class scatter Sc for class c can be estimated as (2):
Sc = sum over xi in c of (xi - mu_c)(xi - mu_c)^T    (2)
where mu_c indicates the mean of all observations xi for the c-th class. The generalization Sw of the within-class scatter for all C classes can be calculated as:
Sw = sum over c = 1..C of (nc / N) Sc    (3)
where nc is the number of observations xi in each class and N is the total number of all observations. The value of the between-class scatter for class c can be calculated as:
Sm,c = (mu_c - mu)(mu_c - mu)^T    (4)
where mu_c indicates the mean of all observations xi for the c-th class and mu indicates the mean of all observations xi over all classes. The generalization of the between-class scatter Sm for all C classes can be calculated as:
Sm = sum over c = 1..C of (nc / N) (mu_c - mu)(mu_c - mu)^T    (5)
where nc is the number of observations xi in each class and N is the total number of all observations. It can be proved that the directions providing the best class separation are the eigenvectors with the highest eigenvalues of the matrix S [4]:
S = Sw^-1 Sm    (6)
Generally the matrix S is not symmetric and the calculation of its eigenvectors can be difficult. This problem can be solved by using the generalized eigenvalue problem [5]. A transformed data set can be obtained by:
y = W^T x    (7)
where W = [w1, w2, .., wM] is a matrix built from the M eigenvectors of matrix S associated with the highest eigenvalues. LDA reduces the original feature space dimension to M. A new data set y is created as a linear combination of all input features x with weights W. In the example, the total number of features is 184; for further analysis only two LDA components were taken. The data after the LDA transformation can be seen in Figure 2-5. The data are easily separable.
Figure 2-5: Data after LDA transform (two components) for the first session of the first subject.
2.3.3 Independent Component Analysis (ICA)
The most important artifacts in BCIs are generated by muscles and eye blinks. Classical automatic methods for removing such artifacts can be classified into rejection methods and subtraction methods [10]. Rejection methods consist of discarding contaminated EEG, based on either automatic or visual detection, and can be used in the BCI applications framework. Their success crucially depends on the quality of the detection. Subtraction methods are based on the assumption that the contaminated EEG is a linear combination of an original EEG and other independent artifact signals generated by the muscles and eye blinks. The original EEG is hence recovered by either subtracting
separately recorded artifact-related signals from the measured EEG, using appropriate weights, or by applying recent approaches for artifact rejection. ICA (Comon, 1994; Hyvärinen & Oja, 2000) is the most used technique. It is a computational method for separating a multivariate signal into additive subcomponents, supposing the mutual statistical independence of the non-Gaussian source signals. It is a special case of blind source separation (BSS). ICA is particularly efficient when the EEG and the artifacts have comparable amplitudes. For more details about their advantages, limitations, and applications for the removal of eye-activity artifacts, refer to (Jung et al., 1998; 2000). While PCA seeks directions that represent the data best in a least-squares sense, ICA seeks directions that are as independent from each other as possible.
ICA mathematical approach
Given a set of observations of random variables x1(t), x2(t), ..., xn(t), where t is the time or sample index, assume that they are generated as a linear mixture of independent components; ICA seeks an unmixing transform y = Wx that recovers those components, where W is some unknown matrix. Independent component analysis consists of estimating both the matrix W and the yi(t), when we only observe the xi(t).
Figure 2-6: ICA estimation example.
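To make the estimation concrete, here is a small self-contained FastICA-style sketch in NumPy. The square-wave and sine sources and the 2x2 mixing matrix are made-up test data, and a real pipeline would typically use a library implementation; this only illustrates the whiten-then-rotate idea.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 4000)
s1 = np.sign(np.sin(2 * np.pi * 1.3 * t))        # square-wave source
s2 = np.sin(2 * np.pi * 2.7 * t)                 # sinusoidal source
S = np.vstack([s1, s2])
X = np.array([[1.0, 0.6], [0.5, 1.0]]) @ S       # observed mixtures

# Centre and whiten the observations
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Xw = E @ np.diag(d ** -0.5) @ E.T @ X

# FastICA iteration (tanh nonlinearity, symmetric decorrelation)
W = rng.standard_normal((2, 2))
for _ in range(200):
    G = np.tanh(W @ Xw)
    W_new = G @ Xw.T / Xw.shape[1] - np.diag((1 - G ** 2).mean(axis=1)) @ W
    U, _, Vt = np.linalg.svd(W_new)
    W = U @ Vt                                   # enforce (W W^T)^(-1/2) W
Y = W @ Xw                                       # estimated independent components
```

The recovered components match the original sources up to permutation, sign, and scale, which is the well-known inherent ambiguity of ICA.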
2.3.4 Common Spatial Patterns (CSP)
The common spatial patterns (CSP) method [11] [12] has been proven powerful for extracting motor-imagery-related features. The CSP algorithm aims to find directions (i.e., spatial filters) that maximize the variance for one class and, at the same time, minimize the variance for the other class. Despite its popularity and efficiency, CSP is also known to be highly sensitive to outliers, which widely exist in practical BCI applications due to ocular movement, head motion, or loose contact of the electrodes.
2.3.5 PCA vs ICA vs LDA
PCA: proper for dimensionality reduction.
LDA: proper for pattern classification if the number of training samples of each class is large.
ICA: proper for blind source separation, or for classification using ICs when the class labels of the training data are not available.
2.4 Feature extraction:
Feature extraction is a very important part of this system. It is the result of the data analysis that the system sends to the machine learning algorithms, which decide which action should be taken by the system [13].
2.4.1 Discrete Fourier Transform (DFT) vs Welch's Method
The Fourier transform is the oldest method for analyzing cycles, i.e., obtaining frequency information. Normally, the Fourier method is used to obtain the magnitude of the frequency components. Since we are dealing with discrete-time signals, we move to the discrete Fourier transform (DFT), which can be used to perform Fourier analysis of discrete-time signals. For a discrete-time signal x[n],
X[m] = sum over n = 0..N-1 of x[n] e^(-j omega n)
Since omega = 2 pi f / Fs and f = m Fs / N, we have
X[m] = sum over n = 0..N-1 of x[n] e^(-j 2 pi m n / N)
while the inverse DFT (IDFT) is defined as
x[n] = (1/N) sum over m = 0..N-1 of X[m] e^(j 2 pi m n / N)
Using Euler's relation, e^(-j theta) = cos(theta) - j sin(theta), the DFT can now be expressed as
X[m] = sum over n = 0..N-1 of x[n] [cos(2 pi m n / N) - j sin(2 pi m n / N)]
Welch's method is used for estimating the power of a signal at different frequencies: that is, it is an approach to spectral density estimation. The method is based on periodogram spectrum estimates, which are the result of converting a signal from the time domain to the frequency domain. Welch's method is an improvement on the standard periodogram spectrum estimation method and on Bartlett's method, in that it reduces the noise in the estimated power spectra in exchange for reduced frequency resolution. Due to the noise caused by imperfect and finite data, the noise reduction of Welch's method is often desired.
2.4.2 Applying some features with the DFT [14]
1. Mean
After filtering the data in the two frequency ranges mainly affected by imagined motor actions, mu and beta, the means of the mu and beta bands are taken separately for each of the 14 EEG channels. The mean here checks the ERD/ERS activity of the motor imagery electrodes, e.g., C3 and C4: a left-movement action shows ERS behavior in one channel (C3) while ERD appears in the other channel (C4).
2. Standard deviation
The standard deviation is a statistic that tells you how tightly the various examples are clustered around the mean in a set of data. When the examples are tightly bunched together and the bell-shaped curve is steep, the standard deviation is small. When the examples are spread apart and the bell curve is relatively flat, you have a relatively large standard deviation. By making the standard deviations of mu and beta for each channel the targeted features, we may be able to detect the differences between signals due to the ERD/ERS phenomena of right and left actions.
3. Minimum and maximum of the mu and beta frequency bands
Depending on the same ERD/ERS phenomena of neuronal activity, and according to Figure 2-7:
- A left action causes ERS in C3 and ERD in C4.
- A right action causes ERD in C3 and ERS in C4.
There was an experiment to find the minimum and maximum points in the frequency domain for the mu and beta frequency bands, detecting certain points of occurrence of ERD and ERS in each band and taking the composite of the results as a feature. However, this method suffers from an over-fitting problem after applying the classifiers, as it yields very high accuracies on the training data but cannot detect new trials properly.
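The mean band-power features above can be sketched with Welch's method from SciPy. The mu and beta band edges follow the text; the two-channel toy signal, sampling rate, and window length are made-up test values rather than the project's recording parameters.

```python
import numpy as np
from scipy.signal import welch

def band_power_features(eeg, fs, bands=((8, 12), (12, 30))):
    """Mean Welch power in each band (e.g. mu, beta) for every channel.
    eeg: (n_channels, n_samples); returns n_channels * len(bands) features."""
    feats = []
    for channel in eeg:
        f, pxx = welch(channel, fs=fs, nperseg=min(256, channel.size))
        for lo, hi in bands:
            feats.append(pxx[(f >= lo) & (f <= hi)].mean())
    return np.array(feats)

fs = 128
t = np.arange(512) / fs
rng = np.random.default_rng(0)
ch_mu = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)   # mu-heavy
ch_beta = np.sin(2 * np.pi * 20 * t) + 0.1 * rng.standard_normal(t.size) # beta-heavy
features = band_power_features(np.vstack([ch_mu, ch_beta]), fs)
```

With 14 channels and two bands this yields a 28-dimensional feature vector per trial, which is what the classifier stage consumes.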
Figure 2-7: ERD and ERS on C3 and C4 in the 11-13 Hz band power.
2.4.3 Wavelet Packet Decomposition (WPD)
Wavelet packet decomposition (WPD) [15] [16] is an extension of the wavelet decomposition (WD). It includes multiple bases, different bases result in different classification performance, and it covers the shortage of the fixed time-frequency decomposition in the DWT. The wavelet decomposition splits the original signal into two subspaces, V and W, which are orthogonally complementary to each other, with V being the space that includes the low-frequency information about the original signal and W the high-frequency information. As shown in Figure 2-8, the decomposition of the low-frequency subspace V is repeated. WD only partitions the frequency axis finely toward low frequencies; WPD is a generalized version, which also decomposes the high-frequency bands that are kept intact in wavelet decomposition. WPD leads to a complete wavelet packet tree.
Figure 2-8: The structure of wavelet decomposition.
Figure 2-9: The structure of WPD.
The decomposition coefficients of the j-th level can be obtained from the (j-1)-th level; finally, we can get the coefficients of all levels by sequential analogy. After the signal is decomposed into j levels, the frequency ranges of the subspaces U_j^n at the j-th level are
[n fs / 2^(j+1), (n+1) fs / 2^(j+1)],  n = 0, 1, ..., 2^j - 1
where fs is the sampling frequency.
Extracting features from WPD coefficients
* Average coefficients: for the EEG signal in each channel, the mean of the WPD coefficients of each selected sub-band is calculated (where j is the decomposition level and n indexes the low- or high-frequency sub-band). The feature vector formed by all channels can then be written as M = {m1, m2, m3, . . .}.
* Sub-band energy: WPD decomposes the signal energy over different regions of the time-frequency plane, and the integral of the squared amplitude of the WPD coefficients is proportional to the signal power. Like the selection rule for the sub-band mean, the sub-band energies E_j,n at the j-th level whose frequency range lies within 0-50 Hz are selected as initial features; the feature vector formed by all channels can then be written as N = {n1, n2, n3, . . .}.
2.4.4 Auto-Regressive model (AR)
The auto-regressive (AR) model [14] is a popular linear feature extraction method for biological signals. A real-valued, zero-mean, stationary, non-deterministic, auto-regressive process of order p is given by
x[n] = sum over k = 1..p of a_k x[n-k] + e[n]
where p is the model order, x[n] is the signal value at sample point n, the a_k are the real-valued AR coefficients, and e[n] represents the white-noise error term, independent of past samples. AR modeling can be seen as a process of obtaining an equation that fits the signal (as in curve fitting).
Choosing the auto-regressive model order
A model order which is too high will over-fit the data and represent too much noise, but a model order which is too small will not sufficiently represent the signal, so a compromise has to be made for the model order. There are many methods to compute the model order, such as the Akaike Information Criterion (AIC), Final Prediction Error, Criterion Auto-regressive Transfer, and Minimum Description Length. After choosing a proper model order p and calculating the AR model coefficients and parameters, the AR coefficients can be used as features.
Chapter 3: Machine Learning
3.1 Motivation
Why machine learning algorithms? Because you need the system to accept new, different inputs and deal with them based on past experience; instead of programming the machine for each specific case, the machine can learn how to act correctly by being trained on pre-known data. We may need to clarify some terms: the data the machine has to know at first is called the training data, and the new input data is called the test data. If you are using machine learning for classification (as in our project), then you need a classifier: given test data, it decides which class the test data belongs to. A classification example may help. Suppose a fruit factory needs a machine to classify apples into two quality classes, say class A and class B. The machine needs features that really characterize each class (the main goal of Chapter 2), and we need these features to be as separable as possible to get realistic decisions from the machine. For example, if we take the features (color, weight), we can represent each single apple (a single trial) with a two-dimensional vector, apple = [color, weight]; to visualize the apples' data we can plot it in a graph.
Part (a) of the figure can be classified with good accuracy, since the features really represent each class and can separate them (separable data may take other forms as well). In part (b) these features do not fairly represent each class (both classes share the same feature values), so the features need to be changed (e.g., volume, height, ...) until the data becomes separable and classification can start.
Figure 3-1: a) Class A apples are the filled circles and Class B are the hollow circles; Class A and Class B are separable. b) Apples of Class A and Class B are totally not separable.
3.2 Machine Learning Types
3.2.1 Supervised Learning
As in the example in section 3.1, you provide the class data, and all data are labeled, which means that for every trial (single point) you know its class; based on this information you start constructing your classifier. This is useful when you know the classes of all the data and just need to get the class label of new test data. In our project, we used supervised learning to find the class label of a new action within the known set of action classes (right, left, forward, neutral).
3.2.2 Unsupervised Learning
This type of learning knows nothing about the classes; it starts from the data and then finds all possible classes. The following figure may help.
This type is used when we do not know all possible classes. One example is grouping news stories, putting all news related to one topic in a class; another is grouping individuals' genes.
Figure 3-2: a) All training data without any class labels. b) Data classified into 3 classes.
3.2.3 Reinforcement Learning
Reinforcement learning is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. The problem, due to its generality, is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, statistics, and genetic algorithms. In the operations research and control literature, the field where reinforcement learning methods are studied is called approximate dynamic programming. The problem has been studied in the theory of optimal control, though most studies there are concerned with the existence of optimal solutions and their characterization, not with the learning or approximation aspects. In economics and game theory, reinforcement learning may be used to explain how equilibrium may arise under bounded rationality.
3.3 Classifier Types
3.3.1 Linear Classifiers
This type of classifier separates two classes by a linear function (a line in 2D data), such that the data on one side belongs to one class and the data on the other side belongs to the other class. As shown in Figure 3-3, you may get more than one possible solution; for example, lines H1 and H2 both classify correctly, while H3 is a bad classifier since it misclassifies a lot of data. So it is important to use a good method for your data: some methods are simple but may fail in some cases (if your data contains outliers), as will be explained in further sections of this chapter.
Figure 3-3: In this case, the solid and empty dots can be correctly classified by any number of linear classifiers. H1 (blue) classifies them correctly, as does H2 (red). H2 could be considered "better" in the sense that it is also furthest from both groups. H3 (green) fails to correctly classify the dots.
3.3.2 Non-linear Classifiers
Here the classification depends on a non-linear function, such as curves or even probability distributions of each class. These methods are preferred when your data is separable but has a boundary too complicated for a linear classifier; an example follows.
Higher-order functions can be used for more complex boundaries, like this:
Figure 3-4: Non-linear classifier with separable data; this data cannot be fairly classified using a linear classifier.
Figure 3-5: Non-linear classifier of high order with separable data.
This type will be explained in further sections.
3.3.3 Which to Use?
Linear classifiers are less complex and may cost less time, but they fail in some cases, where you will need non-linear classifiers with higher time and processing costs but better accuracy. Features are very important and must be chosen wisely; for example, adding an extra feature (one more dimension in the feature vector) may allow you to use a linear classifier instead of a non-linear one, which may raise accuracy and save some processing time.
Figure 3-6: By adding one more feature, the data can be separated with a linear classifier (a plane).
3.4 Linear Classifiers
3.4.1 Least-squares classifier
Analogous to regression, a simple closed-form solution exists for the parameters. Each class Ck, k = 1, .., K, is described by its own linear model yk(x) = wk^T x + wk0. Create the augmented vectors x = (1, x^T)^T and wk = (wk0, wk^T)^T; grouping into vector notation, y(x) = W^T x, where W is the parameter matrix whose k-th column is the (D+1)-dimensional vector wk (including the bias). A new input vector x is assigned to the class for which the output yk = wk^T x is largest. W is determined by minimizing the squared error, so least squares fits the model predictions closely to the target values.
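The closed-form solution described above can be sketched compactly in NumPy with one-hot targets. The two-blob data set is synthetic test data, not features from the project.

```python
import numpy as np

def fit_least_squares(X, T):
    """Closed-form W minimizing ||X_aug W - T||^2.
    X: (N, D) inputs, T: (N, K) one-hot targets."""
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])   # augment with bias term
    W, *_ = np.linalg.lstsq(Xa, T, rcond=None)
    return W

def predict(W, X):
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.argmax(Xa @ W, axis=1)                # largest output y_k wins

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels = np.repeat([0, 1], 50)
T = np.eye(2)[labels]                               # one-hot target coding
W = fit_least_squares(X, T)
accuracy = np.mean(predict(W, X) == labels)
```

On well-separated data like this the closed-form fit classifies the training set perfectly; its weaknesses appear once outliers are added, as noted next.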
Disadvantages of least squares
- It lacks robustness to outliers.
- Certain datasets are unsuitable for least-squares classification.
- The decision boundary corresponds to the maximum-likelihood solution under a Gaussian conditional distribution, but binary target values have a distribution far from Gaussian.
3.4.2 Fisher's Linear Discriminant Classifier
We can view classification in terms of dimensionality reduction: project the D-dimensional input vector x onto one dimension using y = w^T x, and place a threshold on y to classify y >= -w0 as C1 and otherwise C2; we get a standard linear classifier. Classes well separated in D-dimensional space may strongly overlap in one dimension, so we adjust the components of the weight vector w, selecting the projection that maximizes the class separation. Fisher's criterion: maximize the class separation.
Fisher algorithm
Class means:
m1 = (1/N1) sum over n in C1 of xn,  m2 = (1/N2) sum over n in C2 of xn
Class variance (within-class variance of the projected class k):
sk^2 = sum over n in Ck of (w^T xn - w^T mk)^2
Goal: maximize the separation while minimizing the within-class variance. The simplest measure of separation is the separation between the projected means, (w^T m2 - w^T m1)^2, and the within-class variance can be approximated as the sum of the variances of both classes, s1^2 + s2^2. So Fisher's criterion is
J(w) = (w^T m2 - w^T m1)^2 / (s1^2 + s2^2)
and Fisher's solution is
w proportional to Sw^-1 (m2 - m1)
where Sw is the within-class covariance matrix.
3.4.3 Perceptron Classifier
A two-class model. The input vector x is transformed by a fixed nonlinear function to give a feature vector phi(x), and y(x) = f(w^T phi(x)), where the non-linear activation f(.) is a step function:
f(a) = +1 if a >= 0, and -1 if a < 0
We use a target coding scheme of t = +1 for class C1 and t = -1 for C2, matching the activation function.
Perceptron error function
The natural error function, the number of misclassifications, is a piecewise constant function of w with discontinuities (unlike regression); hence there is no closed-form solution (no derivatives exist for non-smooth functions).
The algorithm
Cycle through the training patterns in turn:
o If a pattern of class C1 is incorrectly classified, add its feature vector to the weight vector.
o If a pattern of class C2 is incorrectly classified, subtract its feature vector from the weight vector.
Disadvantages of Perceptrons
Does not converge if the classes are not linearly separable. Does not provide probabilistic output. Not readily generalized to K > 2 classes.
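The cyclic update described above (add φ_n for misclassified C1 patterns, subtract for C2, equivalently w += η φ_n t_n) can be sketched in a few lines of pure Python (illustrative, not the project's code):

```python
def train_perceptron(phi, t, eta=1.0, epochs=100):
    """Cycle through patterns; on a misclassification apply w += eta * t_n * phi_n."""
    w = [0.0] * len(phi[0])
    for _ in range(epochs):
        errors = 0
        for x, tn in zip(phi, t):
            a = sum(wi * xi for wi, xi in zip(w, x))
            if (1 if a >= 0 else -1) != tn:          # misclassified pattern
                w = [wi + eta * tn * xi for wi, xi in zip(w, x)]
                errors += 1
        if errors == 0:                              # converged (linearly separable)
            break
    return w

# Feature vectors with a bias component phi_0 = 1; targets in {-1, +1}
phi = [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
t = [-1, -1, -1, +1]                                 # AND-like, linearly separable
w = train_perceptron(phi, t)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1 for x in phi]
print(preds)  # → [-1, -1, -1, 1]
```

Because this toy problem is linearly separable, the perceptron convergence theorem guarantees the loop terminates with zero errors; on non-separable data it would cycle forever, which is the first disadvantage listed above.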
Perceptron algorithm
φ(x): feature vector (with a bias component φ0(x) = 1)
f(.): activation function
t_n ∈ {-1, +1}
Goal: find w such that w^T φ(x_n) > 0 if t_n = +1 and w^T φ(x_n) < 0 if t_n = -1; equivalently, w^T φ(x_n) t_n > 0 for all n.
Perceptron Criterion
For correctly classified patterns, the error is 0. For misclassified patterns, minimize the quantity
E_P(w) = - Σ_{n∈M} w^T φ_n t_n
which is always positive (M being the set of misclassified patterns). Considering a one-dimensional w shows that each update moves w so as to reduce the error of the misclassified pattern. Stochastic gradient descent on E_P gives the update rule
w^(τ+1) = w^(τ) + η φ_n t_n
where η is the learning rate parameter.
3.5 Non-linear Classifiers
3.5.1 Maximum Likelihood Discriminant Classifier
A maximum likelihood (ML) classifier chooses the class that makes the probability of the observations the highest.
Maximum Likelihood Approach for Two Classes
Given a data set {φ_n, t_n}, where t_n ∈ {0, 1} and φ_n = φ(x_n), n = 1,...,N. Since t_n is binary we can use a Bernoulli distribution: let y_n be the probability that t_n = 1. The likelihood function associated with the N observations is
p(t|w) = Π_{n=1}^{N} y_n^{t_n} (1 - y_n)^{1 - t_n}
where t = (t1,...,tN)^T and y_n = p(C1|φ_n) = σ(w^T φ_n).
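Maximizing this Bernoulli likelihood (equivalently its logarithm) by gradient ascent gives a small working sketch; this is illustrative only, not the project's Octave implementation:

```python
import math

def sigmoid(a):
    # numerically stable logistic function
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    ea = math.exp(a)
    return ea / (1.0 + ea)

def fit_ml(phi, t, eta=0.5, steps=2000):
    """Gradient ascent on log p(t|w) = sum t_n*log y_n + (1-t_n)*log(1-y_n)."""
    w = [0.0] * len(phi[0])
    for _ in range(steps):
        for x, tn in zip(phi, t):
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            # d(log-likelihood)/dw = (t_n - y_n) * phi_n
            w = [wi + eta * (tn - y) * xi for wi, xi in zip(w, x)]
    return w

phi = [(1, 0.0), (1, 1.0), (1, 3.0), (1, 4.0)]   # bias + one feature
t = [0, 0, 1, 1]
w = fit_ml(phi, t)
probs = [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for x in phi]
print([round(p) for p in probs])  # → [0, 0, 1, 1]
```

The gradient (t_n - y_n) φ_n follows directly from differentiating the log of the likelihood above with y_n = σ(w^T φ_n).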
3.5.2 KNN
Based on a measure of distance between observations (e.g. Euclidean distance, or one minus correlation). The k-nearest-neighbor rule (Fix and Hodges, 1951) classifies an observation x as follows: find the k closest observations in the training data and predict the class by majority vote, i.e. choose the class that is most common among those k neighbors. k is a parameter whose value will be determined later by minimizing the cross-validation error.
Advantages of the NN Algorithm
o It can estimate complex target concepts locally and differently for each new instance to be classified.
o It provides good generalization accuracy on many domains.
o It learns very quickly.
o It is robust to noisy training data.
o It is intuitive and easy to understand, which facilitates implementation and modification.
Disadvantages of the NN Algorithm
o Large storage requirements, because all the data must be stored.
o Slow instance classification, because all the training instances have to be visited.
o Accuracy degrades as noise in the training data increases.
o Accuracy degrades as the number of irrelevant attributes increases.
Algorithm steps (note that the following iterative procedure is the k-means clustering algorithm, used here to find the cluster centers):
Step 1: Randomly choose the cluster centers μk.
Step 2: Compute the assignments r_nk that minimize
J = Σ_n Σ_k r_nk ||x_n - μ_k||^2
(assign each x_n to the cluster with the closest center).
Step 3: Update μk. Taking the derivative of J with respect to μk and equating it to zero gives
μ_k = (Σ_n r_nk x_n) / (Σ_n r_nk)
Go back to Step 2 until convergence.
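The k-nearest-neighbor rule itself is short enough to sketch directly (illustrative, pure Python, with toy 2-D points standing in for feature vectors):

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))  # squared Euclidean
    neighbors = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ['left', 'left', 'left', 'right', 'right', 'right']
print(knn_predict(train_X, train_y, (0.5, 0.5)))  # → 'left'
print(knn_predict(train_X, train_y, (5.5, 5.5)))  # → 'right'
```

An odd k avoids ties in the two-class vote; the cross-validation mentioned above would be used to pick its value.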
Chapter 4: Training, Detection and Analysis
4.1 Introduction In this chapter we will talk about the main programs developed for the Braingizer project: the trainer, the detector and the analyzer. The trainer program is used to record training data from the user and store it in a training file. This training file is then used by the detector program to detect the user's actions and move the wheelchair in the corresponding direction; each detection session is then stored in a detection file. Both the detection and the training files are used by the analyzer to select the best learning technique, the one that provides the highest accuracy, to be used later in the online detection as the user drives the wheelchair.
Figure 4-1: Braingizer's main programs
4.2 Braingizer Trainer 4.2.1 Introduction Braingizer-Trainer is the program used to collect raw data from the user during a training session while performing certain actions. The collected data are then used to train the machine learning classifiers in the detector programs, as will be discussed later in this chapter.
Figure 4-2: A volunteer is performing a training session.
Figure 4-3: The training program.
4.2.2 How to use? 1. In the trainer folder, run the shell-script file named run.sh.
Figure 4-4: Shell-script run.sh output. 2. Connect the Emotiv EPOC headset to the PC and press the connect button.
Figure 4-5: Connect button. 3. Write the training-session details and the subject's details, then press the Start Training button.
Figure 4-6: Trainer Main Window.
4. A welcome screen will appear. The user must sit in a comfortable position and be ready for the training session before pressing the start button.
Figure 4-7: Welcome Screen. 5. The session consists of a number of trials; in each trial a blue arrow appears in a random direction and the user is asked to perform an action in that direction.
Figure 4-8: Random blue arrow. 6. After the training session is finished, it will be saved as a CSV file to be analyzed later.
4.2.3 Application Architecture 1. Components The program was developed in Python on Ubuntu and uses Open-Source components in its structure. The two main components are: 1. PyQt: for designing and building the GUI. 2. Python-emotiv: for connecting to the Emotiv-EPOC headset without needing the SDK; this was important because the SDK was not released for the ARM architecture and wouldn't have worked on the Raspberry-Pi or Radxa boards. 2. Structure The program collects the data from the Emotiv-EPOC headset using Python-emotiv while showing arrows in random directions to the user. These data are then saved to a file to be used later with the detection programs.
Figure 4-9: Application Structure. 4.2.4 Design Aspects 1. Configurability The training program is configurable using the AppConfig.py file: 1. The user can choose a default profile and a default session to ease the recording process. 2. The user can change the paths of the classifiers, training and detection folders by setting their paths in the AppConfig.py file. 3. The programmer can enable the debug mode by setting the Debug flags to True. 2. Multi-Threading The program is designed to run in multiple threads; the main threads are: 1. The threads that read and store the Emotiv-EPOC data. 2. The GUI thread. All the threads are connected to each other using event handlers and queues.
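The flags described above might look like the following in AppConfig.py. This is an illustrative sketch: the actual variable names in the project's file may differ.

```python
# AppConfig.py -- illustrative sketch of the trainer configuration
DefaultProfile  = "subject1"           # default profile to ease the recording process
DefaultSession  = "session1"           # default session name
ClassifiersPath = "../classifiers/"    # folder paths used by the programs
TrainingPath    = "../training/"
DetectionPath   = "../detection/"
Debug           = False                # set to True to enable debug output
```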
4.2.5 Training file format The training session is saved to a CSV file to be analyzed later. The CSV file consists of two parts: 1. The header: which has all the session details, including the class names and the class of each trial. 2. The Raw-Data: which contains the data recorded from the headset during the session.
Figure 4-10: The CSV file for a training session.
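Writing such a two-part file with Python's csv module might look like the following sketch; the exact header fields here are an assumption based on the description above, not the project's actual format:

```python
import csv
import io

# Hypothetical session details and raw data (14 EEG channels per sample)
header = {"subject": "subject1", "classes": "right,left,neutral"}
raw_rows = [[4200 + ch for ch in range(14)] for _ in range(3)]

buf = io.StringIO()
writer = csv.writer(buf)
for key, value in header.items():   # part 1: the header
    writer.writerow([key, value])
writer.writerow([])                 # blank separator line
writer.writerows(raw_rows)          # part 2: the raw data

rows = list(csv.reader(io.StringIO(buf.getvalue())))
print(rows[0])  # → ['subject', 'subject1']
```

In the real program the same structure would be written to a file on disk instead of an in-memory buffer.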
4.3 Braingizer Detector 4.3.1 Introduction Braingizer-Detector is the program used to decode the user's brain signals into actions. The program starts by training one of the machine learning classifiers using a training file recorded from the user. This trained classifier is then used to detect the user's actions and send them to the wheelchair's motors or to the arrows on the screen. The detection session is saved into a CSV file to be analyzed later. Although this program can work on the wheelchair, its main function was to collect detection data from the users to be analyzed later and to measure the detection accuracy, rather than actually moving the wheelchair.
Figure 4-11: The Detector Program.
4.3.2 How to use? 1. In the detector folder, run the shell-script file named run.sh (or radxa_run.sh if you are running it on the Radxa board over SSH). 2. Connect the Emotiv EPOC headset to the PC (or to the Radxa board) and press the connect button. 3. Select the training file, the classifier name and the detection-session parameters, then press the Start Detection button.
Figure 4-12: Detector Main-Window.
4. A welcome screen will appear. The user must sit in a comfortable position and be ready for the detection session before pressing the start button.
Figure 4-13: Welcome Screen. 5. When the session is started, a blue arrow will appear on the screen and the user is asked to perform an action in this direction. This action is then detected and will appear as a red arrow if it is wrong or a green arrow if it is correct.
Figure 4-14: The user is asked to perform a right action.
6. The accuracy is automatically updated after each trial and will appear as a progress-bar at the top of the window.
Figure 4-15: Accuracy bar. 7. After the detection session is finished, it will be saved as a CSV file to be analyzed later. 4.3.3 Connect to the wheelchair To enable control of the wheelchair's motors: 1. Open the AppConfig.py file in the detector folder and change the EnableMotors flag to True. 2. Connect the PC to the motor driver using a USB cable. 3. Start the detector program; at startup the program will search for the motors.
Figure 4-16: The program is searching for the motors over ports.
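A port scan like the one in Figure 4-16 might look like the following simplified sketch. This is an assumption about the mechanism: the project likely uses a serial library to open each candidate port, while this stand-in only lists the usual Linux USB-serial device paths.

```python
import glob

def find_motor_ports():
    """Scan the usual Linux USB-serial device paths for the motor driver (sketch only)."""
    candidates = []
    for pattern in ("/dev/ttyUSB*", "/dev/ttyACM*"):
        candidates.extend(glob.glob(pattern))
    return sorted(candidates)

ports = find_motor_ports()
if ports:
    print("Motor driver candidates:", ports)
else:
    print("No motor driver found; motors stay disabled.")
```

A real implementation would additionally try opening each candidate port and handshaking with the driver before selecting it.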
4. Follow the startup steps discussed in the previous section; each time the program detects an action, the wheelchair will move in that direction. 4.3.4 Run on the Radxa board 1. Connect to the board over SSH by typing ssh rock@<ip-address>. 2. Navigate to the detector folder on the board and enable the motors if needed by following the steps in the previous section. 3. It is recommended to use the auto-run mode on the board, by setting the Autorun flag in AppConfig.py to True and writing the session details, the classifier name and the training-file path. 4. Run the radxa_run.sh shell-script to start the program. 5. The program will run on the wheelchair's screen and, if the motors are enabled, the chair will move in the detected direction. 4.3.5 Application Architecture 1. Components The program was developed in Python on Ubuntu and uses Open-Source components in its structure. The three main components are: 1. PyQt: for designing and building the GUI. 2. Python-emotiv: for connecting to the Emotiv-EPOC headset without needing the SDK; this was important because the SDK was not released for the ARM architecture and wouldn't have worked on the Raspberry-Pi or Radxa boards. 3. Oct2Py: used to connect the detector program to Octave (an Open-Source alternative to Matlab) to run the machine learning classifiers implemented in Octave, as discussed in the previous chapter. 2. Structure The program collects the data from the Emotiv-EPOC headset using Python-emotiv. These data are then passed to the machine learning classifiers to detect the user's action. The detected action is then passed to the wheelchair's motors using the motor class and appears on the screen as an arrow in the detected direction.
Figure 4-17: Application Structure. 4.3.6 Design Aspects 1. Extendibility The detection programs are designed to be extendable: 1. The user can add any new machine learning classifiers in the classifiers folder and just select it using a dropdown menu.
Figure 4-18: Selecting Classifier Script. 2. The programs can be ported to any Linux-based machine, from the PC to embedded Linux boards like the Raspberry-Pi and Radxa. 2. Configurability The detection program is configurable using the AppConfig.py file: 1. The user can enable the motors by setting the EnableMotors flag to True. 2. The user can enable the auto-run mode by setting the Autorun flag to True and setting the auto-run parameters (classifier name, training file name, etc.).
3. The user can change the paths of the classifiers, training and detection folders by setting their paths in the AppConfig.py file. 4. The programmer can enable the debug mode by setting the Debug flags to True. 3. Multi-Threading The program is designed to run in multiple threads; the main threads are: 1. The threads that read and store the Emotiv-EPOC data. 2. The detection thread, which connects to the machine learning classifiers over Oct2Py. 3. The GUI thread. All the threads are connected to each other using event handlers and queues. 4.3.7 Detection file format The detection session is saved to a CSV file to be analyzed later. The CSV file consists of two parts: 1. The header: which has all the session details, including the training file, the classifier name, the class names and the class of each trial. 2. The Raw-Data: which contains the data recorded from the headset during the session.
Figure 4-19: Detection CSV file.
4.4 Braingizer OnlineDetector 4.4.1 Introduction Braingizer-OnlineDetector is the asynchronous version of the Braingizer-Detector; it is used to decode the user's actions continuously, without triggers (asynchronous detection). The program uses a sliding window to continuously detect the user's actions; this sliding window is delayed by the detection time, which is around 0.15 seconds on the PC and 0.3 seconds on the Radxa board. The program is used mainly on the wheelchair with the Radxa board, although it works normally on the PC with or without the wheelchair. 4.4.2 Run on the Radxa board 1. Connect to the board over SSH by typing ssh rock@<ip-address>. 2. Navigate to the OnlineDetector folder on the board. 3. Enable the motors by changing the EnableMotors flag to True in the AppConfig.py file. 4. Enable the auto-run mode by setting the Autorun flag in AppConfig.py to True and writing the session details, the classifier name and the training-file path. 5. Connect the motor driver to the board. 6. Connect the Emotiv-EPOC to a USB port on the board. 7. Run the radxa_run.sh shell-script to start the program. 8. The program will run on the wheelchair's screen and, if the motors are enabled, the chair will move in the detected direction.
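The sliding-window loop described above can be sketched as follows. This is illustrative only: `classify` stands in for the trained classifier, and the 4-second window length is an assumption (the text specifies only the per-detection delay).

```python
SAMPLE_RATE = 128               # Emotiv EPOC sampling rate, samples/second
WINDOW_SIZE = 4 * SAMPLE_RATE   # assumed 4-second analysis window

def online_detect(stream, classify, window=WINDOW_SIZE):
    """Slide a window over the incoming samples; emit one action per new sample
    once the window is full. In the real program each step is additionally
    delayed by the detection time (~0.15 s on a PC, ~0.3 s on the Radxa)."""
    buffer, actions = [], []
    for sample in stream:
        buffer.append(sample)
        if len(buffer) >= window:
            actions.append(classify(buffer[-window:]))  # newest full window
            del buffer[0]                               # slide forward
    return actions

# Toy stream and toy classifier: positive-mean windows mean "right"
stream = [1.0] * 600
actions = online_detect(stream, lambda w: "right" if sum(w) > 0 else "left")
print(set(actions))  # → {'right'}
```

The key difference from the triggered detector is that no arrow cue starts a trial; the window simply advances over the continuous EEG stream.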
4.5 BAnalyzer 4.5.1 Motivation BAnalyzer (Braingizer Analyzer) is software developed to provide useful statistical analysis of the different BCI training techniques and training sessions. Development of bAnalyzer started in order to solve the problem of having 240 different ways, and increasing, by which we could train and later detect the user's signal. With it, our target became more than just learning and detecting the user's input: rather, it is to select the best learning technique that provides the highest accuracy, to be used later in the online detection as the user drives the wheelchair. 4.5.2 Installing 1. Install Mandatory Programs Install git and Python.
Figure 4-20: Install git and Python. 2. Checkout bAnalyzer and install its dependencies Install all the dependencies by running ./Install.sh.
Figure 4-21: Install dependencies. 3. Setup online logging into a Google spreadsheet Create a file named "GooglespreadSheetConfig.py" and place it in the "/Configurations" folder. The file should look like this: email = 'YourEmail' password = 'YourPassword' title = 'Spreadsheet name' url = 'Spreadsheet URL' sheet1_title = 'working sheet name'
4.5.3 Usage BAnalyzer is currently used in training-method investigation, feature investigation and training-session cross-validation; in this section we'll go through using each of them. When started, bAnalyzer compiles the Python code and generates the GUI.
Figure 4-22: Run bAnalyzer. 1. Training Method Investigation In this mode we supply the program with all the different methods by which we want the data to be classified, alongside the detection and training sessions, and eventually get statistics about the most efficient preprocessing, feature-extraction and classification methods. We can do that on a single file at a time or in bulk mode. a) Single Mode:
Figure 4-23: Main window GUI.
Procedure Train the classifier according to the selected options.
Figure 4-24: Selecting training path. Training output when training is done:
Figure 4-25: Generating bAnalyzer GUI.
Detect user decision using the selected classifier.
Figure 4-26: Selecting detection mode. Detection output after classification for the single mode:
Figure 4-27: Single mode result.
b) Bulk Mode:
Figure 4-28: Bulk mode GUI. You can select the desired methods
Figure 4-29: Select bulk paths. And select whether to update the Google spreadsheet or not
Figure 4-30: Choose whether to update the Google spreadsheet or not.
Results from the Google spreadsheet
Figure 4-31: Bulk mode result. 2. Feature Investigation This mode is for investigating certain features, like the mu and beta bands, in the training or detection sessions.
Figure 4-32: Features preview screen.
Result 64 | P a g e
Figure 4-33: Features plot. Procedure: from the main window select Preview Features, then select the frequencies to be plotted and the start/end times.
Figure 4-35: Features navigation GUI. Figure 4-34: Features main UI.
3. Training Session Cross Validation Here we determine the validity of the training sessions: we either train and detect using the same file, or use the 20-80 test. The 20-80 test is done by splitting the data, 80% for training and 20% for testing, then comparing the predictions with the actual labels.
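The 20-80 split described above can be sketched in a couple of lines (illustrative; the real tool may shuffle or offset the split, as the next figures suggest):

```python
def split_20_80(trials, labels):
    """First 80% of the trials for training, the remaining 20% for testing."""
    cut = int(0.8 * len(trials))
    return (trials[:cut], labels[:cut]), (trials[cut:], labels[cut:])

trials = list(range(10))          # stand-ins for recorded trials
labels = ['R', 'L'] * 5
(train_X, train_y), (test_X, test_y) = split_20_80(trials, labels)
print(len(train_X), len(test_X))  # → 8 2
```

Training and testing on the same file gives an optimistic accuracy estimate; the held-out 20% gives a fairer one.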
Figure 4-36: Select same-file data for detection. You can select the same file, or any offset into this file's data, to be used as the detection data
Figure 4-37: Select ALL. Figure 4-38: Select certain offset.
Results
Figure 4-39: Cross-validation. 4. Configuring BAnalyzer BAnalyzer options are also configurable through multiple configuration files used as user preferences.
Figure 4-40: Configuration files. 4.5.4 Statistical Results By using bAnalyzer with the different detection and training sessions, we were able to capture a large number of statistics for each action; those results are stated in detail in Chapter 6. 4.6 Linux SDK In order to set up the headset we need to ensure each electrode has good signal quality, so we use the Emotiv SDK for that. In case of a poor signal we should either add more saline for connectivity or adjust the electrode placement.
Chapter 5: Wheelchair System
5.1 Embedded Linux 5.1.1 Introduction The Linux kernel has been ported to a variety of CPUs beyond those primarily used in desktop or server computers. Due to its high customizability and low cost, Linux is used in many everyday devices, from smartphones to spacecraft flight software. Braingizer used embedded Linux to eliminate the need for a PC on the wheelchair. We started with a Raspberry-Pi board, which didn't satisfy our needs, and ended up using the Radxa-Rock board, which gave much better speed in detecting the user's actions. 5.1.2 Raspberry-Pi Board The Raspberry Pi is a credit-card-sized single-board computer developed in the UK by the Raspberry Pi Foundation with the intention of promoting the teaching of basic computer science in schools. 1. Specifications CPU: 700 MHz ARM1176JZF-S core (ARM11 family, ARMv6 instruction set). Memory: 512 MB (shared with the GPU).
Figure 5-1: Raspberry-Pi Board. 2. Speed Test The Raspberry-Pi failed to satisfy our need to detect the user's actions in real time: it took about 11 minutes for training and around a minute for each new trial detection, which is inapplicable for our solution, so it had to be replaced with a more powerful board, the Radxa-Rock.
5.1.3 Radxa-Rock Board Figure 5-2: Radxa-Rock Board. 2. Speed Test Thanks to its quad-core ARM Cortex-A9 processor, the Radxa-Rock board gave much better results: training ran in 11 seconds on average and the detection of each new trial took around 0.30 seconds, a very good result even compared to the PC, which takes around 0.15 seconds per detection.
5.2 Control 5.2.1 Arduino with servo motor A servo motor is a rotary actuator that allows precise control of the angular position using an internal angle-controller circuit; sending the angle value through a single signal wire makes the servo motor go to that angle and stay there.
Figure 5-3: Servo motor connections. First we connect the Arduino to the servo as shown in Figure 5-3 to control the servo motor, giving us the ability to cover the front, right and left directions by swinging the motor from 0 to 180 degrees and back to 0 in a loop. 5.2.2 Arduino with Ultrasonic Ultrasonic sensors generate high-frequency sound waves and evaluate the echo received back by the sensor. The sensor measures the time interval between sending the signal and receiving the echo to determine the distance to an object. We connect the sensor to the Arduino board as shown in Figure 5-4, and we manipulate the Vcc and ground to be able to connect the ultrasonic sensor and the servo motor to the Arduino at the same time.
Figure 5-4: Ultrasonic connections. Then we fix the ultrasonic sensor to the servo motor as shown in Figure 5-5.
Figure 5-5: Fixing the ultrasonic to the servo-motor. The ultrasonic sensor can measure distances in a range of 3 m with good accuracy. To find obstacles we do the following: define each direction by a range of angles (for example, left from angle 0 to angle 30); get the reading from the sensor at each angle in that direction; and calculate the average of these values, which gives a sense of the obstacles in that direction. If the average is less than a certain threshold, we can tell there is an obstacle and disable the chair's motor to avoid it. 5.2.3 Arduino with Drivers On the other hand, we use a driver (H-bridge) to control the speed and direction of each motor. The driver receives the speed and direction from the Arduino, which takes orders from the Radxa board through the UART module.
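The per-direction averaging described above can be sketched as follows. The sketch is written in Python for readability (the actual logic runs in the Arduino's C code), and the angle ranges and the 50 cm threshold are illustrative assumptions:

```python
# Angle ranges per direction (degrees) -- illustrative values
DIRECTIONS = {"left": range(0, 31), "front": range(75, 106), "right": range(150, 181)}
OBSTACLE_CM = 50   # an average distance below this means "obstacle"

def obstacle_map(read_distance_cm):
    """Average the sensor readings over each direction's angle range."""
    result = {}
    for name, angles in DIRECTIONS.items():
        readings = [read_distance_cm(a) for a in angles]
        result[name] = sum(readings) / len(readings) < OBSTACLE_CM
    return result

# Simulated sweep: something close on the left, clear elsewhere
fake_sensor = lambda angle: 20 if angle <= 30 else 200
print(obstacle_map(fake_sensor))  # → {'left': True, 'front': False, 'right': False}
```

Averaging over a whole angle range rather than trusting a single reading makes the decision robust to occasional noisy echoes.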
Let's have a look at what the driver does. An H-bridge is an electronic circuit that enables a voltage to be applied across a load in either direction (both polarities). By defining the two signals Input1 and Input2, the following truth table shows the resulting direction when the Enable signal is high. Table 5-1: H-Bridge truth table:
The Enable pin is used to apply a PWM signal, which controls the motor's speed. In our project we used Timer2 of the ATmega328P to generate a PWM suitable for the motor response; the value of the duty cycle (represented in the OCR2A register) controls the output voltage, which in turn controls the motor speed.
Figure 5-6: PWM. As the duty cycle increases, the power delivered to the motor increases, allowing it to go faster. The Arduino takes the reading of the ultrasonic sensor, which tells us there is no obstacle, and then gives the signal to the driver, which makes the motor move right.
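The relationship between the duty cycle, the OCR2A value and the delivered voltage can be illustrated with a small calculation (assuming 8-bit PWM, so OCR2A ranges over 0..255, and the 12 V motor supply described in Section 5.3):

```python
V_SUPPLY = 12.0   # motor supply voltage (the 12 V battery)
TOP = 255         # 8-bit Timer2 counts 0..255

def ocr2a_for_duty(duty):
    """OCR2A value producing the given duty cycle (0.0 .. 1.0)."""
    return round(duty * TOP)

def average_voltage(ocr2a):
    """Average voltage delivered to the motor for a given OCR2A value."""
    return V_SUPPLY * ocr2a / TOP

ocr = ocr2a_for_duty(0.5)   # 50% duty cycle
print(ocr, round(average_voltage(ocr), 2))  # → 128 6.02
```

So a 50% duty cycle delivers roughly half the supply voltage on average, which is why increasing the duty cycle makes the motor go faster.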
5.3 Power Supply 5.3.1 Power supply, charging circuit and connections To be able to move the chair we used a 12 V, 12 Ah lead-acid battery.
Figure 5-7: 12-volt Lead-Acid Battery. The battery is fixed to the chair and connected to the supply voltage of the driver (H-bridge) which controls the motors. The battery is charged using a solar charging unit, which takes the terminals of the solar panel and the terminals of the battery as inputs, and provides two output terminals for the load. Its function is to regulate the charging and protect the battery in a low-voltage state by preventing further current draw; it is like the one in the picture.
Figure 5-8: Solar Charge Control. This system is modified to charge the battery from the AC line, using a simple circuit made of relays and transformers: the 220 V RMS AC is transformed to 12 V RMS AC, then passed to a full-wave rectifier and a smoothing capacitor to get a roughly uniform 12√2 ≈ 17 V DC. This is used as the input to the solar charge controller, so it behaves as if a solar panel directed at the sun with a continuous and uniform flux were connected to it, generating that DC voltage. A simple relay circuit connects the charging input to the panel as long as the AC line is disconnected, and connects the AC line to the charge controller when it is plugged in.
Chapter 6: Project Results
6.1 Introduction
Training the wheelchair could be achieved in an enormous number of ways. The ways we used throughout the course include: varying the duration of the training samples, tilting the headset to probe signals nearer the brain's C3 and C4 areas, introducing muscle movement, using a continuous muscle movement instead of a momentary one, facial expressions, increasing/decreasing the number of classes, using imagination only, or using mathematical operations. In this chapter we introduce those methods, with the accuracies acquired from each using the bAnalyzer software introduced in Section 4.5. By the end of the chapter we draw some conclusions about those results.
6.2 Methodologies
In this section some of the methods used are stated with the accuracy results obtained from each.
6.2.1 Default Setup
In each training trial the subject is told to perform some action for 7 seconds, of which, by default, only 4 seconds were processed. The environment of the training and detection sessions was kept, as much as possible, the same, with a minimal amount of distraction. In the upcoming sections, decisions are stated with the user action triggering them. All the training methods included removing noise in preprocessing and using the mean value of the Mu and Beta bands as features for the 14 electrode channels. The different training paths are stated in Table 6-1.
Table 6-1: Filter / feature-enhancement method path map
Path name | Filter | Feature enhancement method
Path 0 | Butter | LDA
Path 1 | Butter | PCA
Path 2 | Butter | None
Path 3 | Ideal | LDA
Path 4 | Ideal | PCA
Path 5 | Ideal | None
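The mean Mu/Beta band-power features described above can be sketched with NumPy. The sketch is illustrative: band edges of 8-12 Hz for Mu and 13-30 Hz for Beta, and the 128 Hz sampling rate, are standard values assumed here rather than taken from the project's code.

```python
import numpy as np

FS = 128                                 # assumed Emotiv EPOC sampling rate (Hz)
BANDS = {"mu": (8, 12), "beta": (13, 30)}  # assumed band edges (Hz)

def band_features(trial):
    """Mean spectral power in the Mu and Beta bands for each channel.
    `trial` has shape (n_samples, n_channels); returns (2 * n_channels,)."""
    freqs = np.fft.rfftfreq(trial.shape[0], d=1.0 / FS)
    power = np.abs(np.fft.rfft(trial, axis=0)) ** 2
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs <= hi)
        feats.append(power[mask].mean(axis=0))   # mean power per channel
    return np.concatenate(feats)

# A 4-second, 14-channel toy trial with a strong 10 Hz (Mu-band) component
t = np.arange(4 * FS) / FS
trial = np.tile(np.sin(2 * np.pi * 10 * t), (14, 1)).T
f = band_features(trial)
print(f.shape)  # → (28,)
```

With 14 channels and 2 bands, each trial reduces to a 28-dimensional feature vector, which is what the Table 6-1 paths then pass through LDA, PCA or no feature enhancement.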
6.2.2 Limb Movement
The subject moves his hands either right or left.
Table 6-2: Limb movement action map
Mapped action | Subject action
Right | Moving right hand
Left | Moving left hand
32 trials of the 1st subject:
Table 6-3: Limb movement, 1st subject results
Classifier | Path 0 | Path 1 | Path 2 | Path 3 | Path 4 | Path 5 | Min | Max | Avg.
Least | 43.8 | 28.1 | 43.8 | 46.9 | 56.3 | 46.9 | 28.1 | 56.3 | 44.3
KNN | 46.9 | 40.6 | 50.0 | 50.0 | 31.3 | 50.0 | 31.3 | 50.0 | 44.8
Like | 43.8 | 28.1 | 37.5 | 46.9 | 56.3 | 37.5 | 28.1 | 56.3 | 41.7
Path Avg. | 44.8 | 32.3 | 43.8 | 47.9 | 47.9 | 44.8 | 29.2 | 54.2 | 43.6
20 trials of the 2nd subject:
Table 6-4: Limb movement, 2nd subject results
Classifier | Path 0 | Path 1 | Path 2 | Path 3 | Path 4 | Path 5 | Min | Max | Avg.
Least | 30.0 | 25.0 | 30.0 | 40.0 | 45.0 | 40.0 | 25.0 | 45.0 | 35.0
KNN | 35.0 | 30.0 | 50.0 | 35.0 | 25.0 | 45.0 | 25.0 | 50.0 | 36.7
Like | 30.0 | 25.0 | 30.0 | 40.0 | 45.0 | 35.0 | 25.0 | 45.0 | 34.2
Path Avg. | 31.7 | 26.7 | 36.7 | 38.3 | 38.3 | 40.0 | 25.0 | 46.7 | 35.3
6.2.3 Motor Imagery
The subject thinks only about moving the right leg, without executing the movement.
Table 6-5: Motor imagery action map
Mapped action | Subject action
Right | Imagine only the right leg moving
Left | Left hand stressing on a ball
Neutral | Do nothing
6.2.4 Mathematical Operations
The subject is required here to think about some mathematical operations throughout the right-class detection time.
Table 6-8: Mathematical operations action map
Mapped action | Subject action
Right | Solving a mathematical operation
Left | Moving left hand
Neutral | Do nothing
Moving muscles on real stress balls:
Table 6-13: Continuous muscle movement, 2nd action map
Mapped action | Subject action
Right | Continuous right-hand stressing on a ball
Left | Continuous left-hand stressing on a ball
Neutral | Do nothing
20 trials of the 1st subject:
Table 6-14: Continuous muscle movement, 1st subject results, 2nd action map
Classifier | Path 0 | Path 1 | Path 2 | Path 3 | Path 4 | Path 5 | Min | Max | Avg.
Least | 73.3 | 73.3 | 73.3 | 80.0 | 60.0 | 80.0 | 60.0 | 80.0 | 73.3
KNN | 73.3 | 73.3 | 60.0 | 80.0 | 73.3 | 73.3 | 60.0 | 80.0 | 72.2
Like | 66.7 | 60.0 | 60.0 | 73.3 | 6.7 | 40.0 | 6.7 | 73.3 | 51.1
Path Avg. | 71.1 | 68.9 | 64.4 | 77.8 | 46.7 | 64.4 | 42.2 | 77.8 | 65.6
20 trials of the 2nd subject:
Table 6-15: Continuous muscle movement, 2nd subject results, 2nd action map
Classifier | Path 0 | Path 1 | Path 2 | Path 3 | Path 4 | Path 5 | Min | Max | Avg.
Least | 60.0 | 60.0 | 60.0 | 46.7 | 53.3 | 46.7 | 46.7 | 60.0 | 54.4
KNN | 60.0 | 60.0 | 46.7 | 46.7 | 46.7 | 40.0 | 40.0 | 60.0 | 50.0
Like | 53.3 | 33.3 | 53.3 | 53.3 | 26.7 | 33.3 | 26.7 | 53.3 | 42.2
Path Avg. | 57.8 | 51.1 | 53.3 | 48.9 | 42.2 | 40.0 | 37.8 | 57.8 | 48.9
6.2.6 Facial Expression
Table 6-16: Facial expression action map
Mapped action | Subject action
Right | Right mouth clenching
Left | Left mouth clenching
Neutral | Do nothing
Forward | Right eye blinking
6.2.7 Headset Tilting
We tilted the headset to bring the electrodes nearer to the C3 and C4 brain areas.
Table 6-19: Headset tilting action map
Mapped action | Subject action
Right | Right hand moving
Left | Left hand moving
10 trials of the 2nd subject:
Table 6-21: Headset tilting, 2nd subject results
Classifier | Path 0 | Path 1 | Path 2 | Path 3 | Path 4 | Path 5 | Min | Max | Avg.
Least | 60.0 | 60.0 | 60.0 | 20.0 | 20.0 | 20.0 | 20.0 | 60.0 | 40.0
KNN | 50.0 | 50.0 | 50.0 | 50.0 | 70.0 | 40.0 | 40.0 | 70.0 | 51.7
Like | 60.0 | 60.0 | 70.0 | 70.0 | 70.0 | 60.0 | 60.0 | 70.0 | 65.0
Path Avg. | 56.7 | 56.7 | 60.0 | 46.7 | 53.3 | 40.0 | 40.0 | 66.7 | 52.2
6.2.8 Varying Detection Duration
Here we used only 2 seconds as the trial duration for detection, instead of the usual 4.
Table 6-22: Varying detection duration action map
Mapped action | Subject action
Right | Right mouth half clenching
Left | Left mouth half clenching
Neutral | Do nothing
Forward | Whole mouth clenching
6.2.9 Summary
Summary of the previous methods.
Table 6-24: Summary of the different methods
Section | Max | Avg. | Min | Class # | Subject # | Train trials/Class | Detection Trials | Trial Time | Best Path
6.2.2 | 50 | 39 | 27 | 2 | 2 | 50 | 20-32 | 4 | Path 4, Least Squares
6.2.3 | 44 | 34 | 22 | 3 | 1 | 15 | 15 | 4 | Path 5: Likelihood
6.2.4 | 63 | 48 | 33 | 3 | 2 | 25 | 30 | 4 | Path 3: Likelihood
6.2.5 | 48 | 38 | 31 | 3 | 1 | 25 | 15 | 4 | Path 3: Likelihood
6.2.5 | 67 | 57 | 40 | 3 | 2 | 50 | 15 | 4 | Path 0: Likelihood
6.2.6 | 92 | 70 | 47 | 4 | 2 | 15 | 20 | 4 | Path 1: Likelihood
6.2.7 | 70 | 54 | 41 | 2 | 2 | 50 | 10 | 4 | Path 1: Least Squares
6.2.8 | 98 | 79 | 48 | 4 | 1 | 15 | 20 | 2 | Path 4: Least Squares
6.3 Conclusion
From the previous tables we can conclude that both the likelihood and least-squares classifiers prove to be more efficient than KNN. The facial expressions show the highest accuracies, even for 4 classes, which indicates that facial expressions are highly separable. Decreasing the trial duration showed an increase in accuracy with the facial expressions, which indicates that spending more time on the detection trial isn't always a good way to increase accuracy. We can also see that motor imagery gives poor results, which we believe would improve with a higher-quality headset. Finally, using a real object in the continuous muscle movement produces more accurate results than imagining the object.
References
[1] bAnalyzer repository. [Online]. Available: https://github.com/RaniaRho/bAnalyzer.
[2] BCI Competition III datasets. [Online]. Available: http://www.bbci.de/competition/iii/#datasets.
[3] J. Enderle, S. Blanchard and J. Bronzino, Introduction to Biomedical Engineering.
[4] B. Graimann, B. Z. Allison and G. Pfurtscheller, "Brain-Computer Interfaces: A Gentle Introduction," in Brain-Computer Interfaces.
[5] "Event-Related Potential: Our Brain Response to External Stimuli." [Online]. Available: http://blog.neuroelectrics.com/blog/bid/237205/Event-Related-Potential-Our-Brain-Response-To-External-Stimuli.
[6] A. V. Oppenheim, A. S. Willsky and S. Hamid, Signals and Systems, 2nd ed.
[7] J. G. Proakis and D. K. Manolakis, Digital Signal Processing: Principles, Algorithms, and Applications.
[8] P. A. Naylor, "Digital Signal Processing" (lecture slides).
[9] M. Kołodziej, A. Majkowski and R. J. Rak, "Linear discriminant analysis as EEG features reduction."
[10] E. T. Esfahani and V. Sundararajan, "Classification of primitive shapes using brain-computer interfaces."
[11] L.-C. Shi, Y. Li, R.-H. Sun and B.-L. Lu, "A Sparse Common Spatial Pattern Algorithm."
[12] P. Li, P. Xu, R. Zhang, L. Guo and D. Yao, "L1-norm based common spatial patterns."
[13] A.-B. R. Suleiman and T. A.-H. Fatehi, "Features Extraction Techniques of EEG Signal."
[14] R. Palaniappan, Biological Signal Analysis.
[15] W. Ting, Y. Guo-zheng, Y. Bang-hua and S. Hong, "EEG feature extraction based on wavelet packet decomposition for brain computer interface."
[16] W.-Y. Hsu and Y.-N. Sun, "EEG-based motor imagery analysis using weighted wavelet."