1 Introduction

Classification of EEG signals is a crucial task in every brain-computer interface (BCI), allowing for accurate and low-latency interaction between a disabled person and a computer application [7, 16, 48]. Electroencephalography is a non-invasive method for monitoring brain activity [17], and applying dedicated signal processing methods facilitates reasoning about mental condition, emotional state, and motion intents. Many reported experiments were aimed at recognition of so-called imaginary motion, usually unilateral, i.e. of the left or right hand. Such detection can be employed by paralysed or locked-in persons for steering a motorized wheelchair [8, 11] or computer applications [4, 10, 2326, 33, 34, 47].

The classification of motion intent can be performed following two disjoint paradigms: synchronous and asynchronous. The former involves flashing an icon on the screen at strictly timed intervals and verifying by means of the P300 potential whether the person is focusing on this icon [5, 6, 18, 35]. Varying the flashing pattern for each icon is useful in applications with multiple choices. The latter approach allows for self-paced interaction, but it requires distinguishing two states, non-control and control, and in the control state classifying the type of control [9, 39, 50]. The asynchronous approach is evaluated in this work: the method and results of classification of left and right, and up and down motion intents are assessed, and the possibility of controlling multimedia applications is discussed.

There are many important factors hampering signal acquisition and classification in BCI applications. Published practical research reports that the subject’s mental fatigue often leads to low classification accuracy, since the person is not able to concentrate on the task [41]. Other aspects are electrode positioning, skin conductance, and hair thickness, which can be dealt with by hardware solutions such as the electrode type, a tightly mounted cap, electrolytic gels, etc. On the other hand, the electric activity of muscles during eye movements and blinks, as well as heartbeat, is present in the signal as artefacts, which can be eliminated only by employing signal processing methods [1, 12, 14, 52].

In this research a rough set-based method for the analysis of EEG data of 106 persons, taken from an open database, is presented. It is shown that the classification accuracy varies from person to person, depending on the issues mentioned above, but it may exceed 80% (corresponding to more than 10,176 correctly classified cases in a testing set of 12,720 cases) for a particular imaginary motion task.

2 EEG signal processing approaches

The common approach to EEG classification is based on analysis in frequency bands, which are proven to be related to various types of mental and physical conditions [40, 49]. For example, delta waves (2–4 Hz) are related to consciousness and attention, while theta (4–7 Hz) and alpha (8–15 Hz) reflect thinking, focus, and attention. Moreover, with correctly positioned electrodes each functional part of the brain can be monitored separately, e.g. the motor cortex or the visual cortex. This provides spatial partitioning of recorded signals and in turn simplifies the creation of dedicated processing and classification for a given brain region and task (e.g. seizure detection, motor imagery, emotion classification, mental tasks, sleep monitoring) [2].

The main principle for detection and classification of imaginary motor activity in brain-computer interfaces is based on the observation that real and imaginary motions involve similar neural activity of the brain [26]. It was determined that the main phenomenon is a decrease of alpha wave power in the motor cortex of the hemisphere contralateral to the movement side [25, 26, 33], usually registered by the C3 and C4 channels [38, 41, 51]. It is related to the phenomenon of event-related desynchronization (ERD) [21, 30, 53]. Motion intent can also be classified by linear discriminant analysis (LDA) [21, 25, 26, 31]. A recent work presented an application of k-means clustering and Principal Component Analysis (PCA) for steering a simple robot [28] with a mental binary trigger, tested on 6 users. BCI was also applied in a computer game scenario with biofeedback and classification based on Regularized Fisher’s Discriminant (RFD) [22].

Every EEG recording is contaminated with artefacts originating from the muscular activity of eye movements (e.g. involuntary saccades), eye blinks, and heartbeat. Numerous methods are used to detect and filter such signals, as they degrade the quality of brain signals and overlap the useful signals in the frequency domain. Frequency filtration therefore cannot be applied, and other approaches were developed to solve the problem. A common procedure is Signal-Space Projection (SSP) [21, 43, 46], involving spatial decomposition of the EEG signals for determining samples contaminated by the artefact. This method is based on the fact that the artefact signal repeatedly originates from the same location, e.g. from eye muscles, and all electrodes record it in the same manner every time, with distinct amplitudes and phase shifts. Thus, a particular pattern can be determined and removed from each recording. Similar results are achieved by the Independent Component Analysis (ICA) method [20, 21, 27, 45].

Meanwhile, the research approach presented in this paper assumes a simple parametrization of original signals, and a classification based on the rough set paradigm, without the necessity of applying any complex pre-processing routines.

3 Dataset

For the experiment an EEG Motor Movement/Imagery Dataset was used [15]. EEG recordings contained in this dataset were created by the authors of the BCI2000 instrumentation system [3, 37] and then they were made available on PhysioNet, providing a forum dedicated to dissemination and exchange of biomedical signal recordings and open-source physiological signals analysis software [15].

The dataset contains recordings obtained from 106 volunteers. Subjects were instructed to perform trigger-dependent real or imaginary movement tasks, while their brain activity was recorded with a 64-channel BCI2000 system [37]. Electrodes were located in compliance with the international 10–20 system (Fig. 1).

Fig. 1 Names, channel numbers and placement of electrodes. The marked region denotes the electrodes used for motion classification, capturing motor cortex activity [15]

The 14 tasks performed by the subjects were as follows:

  1. one-minute baseline (“rest” state, denoted as event T0) with eyes open,

  2. one-minute baseline (“rest” state, denoted as event T0) with eyes closed,

  3. two-minute run of resting and then opening and closing the left or right fist, according to the location of a target object presented on the computer screen, denoted as event T1 and event T2, respectively (later referred to as classification scenario “A”),

  4. as above, but imaginary motion was performed instead (classification scenario “B”),

  5. two-minute run of resting and then opening and closing fists or moving both feet, according to the location of a target object presented at the top or at the bottom of the computer screen, denoted as events T1 and T2, respectively (classification scenario “C”),

  6. as above, but imaginary motion was performed instead (classification scenario “D”),

  7–14. the remaining steps repeat the 3–6 cycle twice more, so that each of these tasks is performed three times in total, namely: step 7 is the same as 3, step 8 the same as 4, etc.

The 64 recorded EEG signals were sampled at 160 Sa/s and stored in the EDF+ format with an annotation channel containing timestamps of the beginning and end of the T0, T1, and T2 events.

In the presented research, for the purpose of monitoring of the motor cortex activity, only respective 21 channels were considered for the further processing: FCi, Ci, CPi (Fig. 1).

4 Data processing

The dataset was imported into the Brainstorm software and pre-processed there: segmentation and filtration of signals were performed [42]. Finally, numerous features were extracted.

4.1 Segmentation

The EDF+ format stores EEG data from all electrodes as well as the annotation channel with the time spans of triggering events. This allows for segmentation of each 120 s recording of a performed task into sections (so-called epochs): 4.2 s (672 samples) of “rest” (T0) followed by a 4.1 s (657 samples) event T1 or T2, selected randomly. This resulted in 15 segments for T0, 8 segments for T1, and 7 segments for T2. All three repetitions of a given task were treated as one task, thus tripling the number of cases in each class. In summary, for each person there are 45 T0 epochs, 24 T1 epochs, and 21 T2 epochs registered by 21 electrodes located over the motor cortex, resulting in 1890 recorded signals, which are in later steps divided into training and testing sets for the rough set classifier.
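As an illustration only, the bookkeeping of epochs described above can be checked with a few lines of Python. The constants are taken from the text; the variable names are hypothetical and are not part of the original processing chain.

```python
# Counts taken from the text; all names here are illustrative only.
SEGMENTS_PER_RUN = {"T0": 15, "T1": 8, "T2": 7}  # segments in one 120 s recording
REPETITIONS = 3        # each task is recorded three times
ELECTRODES = 21        # FC, C and CP electrodes over the motor cortex
SAMPLING_RATE = 160    # samples per second

# Epochs per class after merging the three repetitions of a task.
epochs_per_class = {event: n * REPETITIONS for event, n in SEGMENTS_PER_RUN.items()}
epochs_per_person = sum(epochs_per_class.values())

# Every epoch is registered by each of the 21 electrodes.
signals_per_person = epochs_per_person * ELECTRODES

# Length of a 4.2 s rest segment in samples.
rest_samples = round(4.2 * SAMPLING_RATE)

print(epochs_per_class)
print(signals_per_person)
print(rest_samples)
```

The totals agree with the values reported above: 45, 24, and 21 epochs per class, 1890 signals per person, and 672 samples per rest segment.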

4.2 Processing in the time-frequency domain

Every signal is decomposed into the time-frequency (TF) domain: it is split into frequency bands, following the standard EEG ranges: delta (2–4 Hz), theta (4–7 Hz), alpha (8–15 Hz), beta (15–29 Hz), and gamma (30–59 Hz). The next step is extraction of the envelopes of the filtered signals using the Hilbert transform [29]; the envelope indicates the overall activity in the particular frequency band.
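The envelope extraction can be sketched as follows. This is a minimal pure-Python illustration and not the Brainstorm implementation used in the study: it assumes the signal has already been band-filtered, and it builds the analytic signal with a naive discrete Fourier transform. The function names `dft` and `hilbert_envelope` are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short epochs)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def hilbert_envelope(signal):
    """Envelope as the magnitude of the analytic signal."""
    n = len(signal)
    spectrum = dft(signal)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero the negative frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        for k in range(1, n // 2):
            weights[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    analytic = dft([w * c for w, c in zip(weights, spectrum)], inverse=True)
    return [abs(z) for z in analytic]

# Example: one second of a 10 Hz (alpha band) cosine of amplitude 3, sampled at 160 Sa/s.
fs = 160
alpha_wave = [3.0 * math.cos(2 * math.pi * 10 * t / fs) for t in range(fs)]
envelope = hilbert_envelope(alpha_wave)
print(min(envelope), max(envelope))
```

For a pure tone the envelope is flat and equal to the tone's amplitude. In practice a library routine such as `scipy.signal.hilbert` would typically be used to obtain the analytic signal instead of the naive transform shown here.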

4.3 Features extraction

The author proposes a parametrization of the envelopes of band-filtered signals. The 5 frequency subbands for each of the 21 sensors are parametrized as follows:

  1. For a particular subband j ∈ {delta, …, gamma} from a sensor k ∈ {FC1, FC2, …, CP6}, 5 activity features are extracted, reflecting the activity in the particular brain region: the sum of squared samples of the signal envelope (1), and the mean (2), variance (3), minimum (4), and maximum (5) of the signal envelope values.

  2. For all 9 pairs of symmetrically positioned electrodes kL and kR (e.g. kL = C1 and kR = C2) the differences of signal envelopes are calculated and summed up (6), supposedly reflecting asymmetry in the activity of the hemispheres while performing unilateral motion:

    $$ {\mathrm{SqSum}}_{j, k}={\sum}_{i=1}^N{\left({e}_{j, k}\left[ i\right]\right)}^2 $$
    (1)
    $$ {\mathrm{Mean}}_{j, k}=\frac{1}{N}{\sum}_{i=1}^N{e}_{j, k}\left[ i\right] $$
    (2)
    $$ {\mathrm{Var}}_{j, k}=\frac{1}{N}{\sum}_{i=1}^N{\left({e}_{j, k}\left[ i\right]-{\mathrm{Mean}}_{j, k}\right)}^2 $$
    (3)
    $$ {\mathrm{Min}}_{j, k}=\min_{1\le i\le N}{e}_{j, k}\left[ i\right] $$
    (4)
    $$ {\mathrm{Max}}_{j, k}=\max_{1\le i\le N}{e}_{j, k}\left[ i\right] $$
    (5)
    $$ {\mathrm{SumDiff}}_{j, kL, kR}={\sum}_{i=1}^N\left({e}_{j, kL}\left[ i\right]-{e}_{j, kR}\left[ i\right]\right) $$
    (6)

where \( e_{j,k}[i] \) is the i-th sample of the envelope of the signal from subband j and electrode k, and N is the number of samples in the epoch.
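Features (1) to (6) can be written down directly. The sketch below is an illustration in Python, not the code used in the study; the function names and the short example envelopes are hypothetical.

```python
def activity_features(envelope):
    """Features (1)-(5) for one subband j and one electrode k."""
    n = len(envelope)
    mean = sum(envelope) / n
    return {
        "SqSum": sum(e * e for e in envelope),              # (1)
        "Mean": mean,                                        # (2)
        "Var": sum((e - mean) ** 2 for e in envelope) / n,   # (3), divides by N
        "Min": min(envelope),                                # (4)
        "Max": max(envelope),                                # (5)
    }

def sum_diff(envelope_left, envelope_right):
    """Feature (6): summed sample-wise difference for a symmetric electrode pair."""
    return sum(l - r for l, r in zip(envelope_left, envelope_right))

# Made-up envelopes of one subband for the pair kL = C1, kR = C2.
env_c1 = [1.0, 2.0, 3.0, 2.0]
env_c2 = [1.0, 1.0, 2.0, 2.0]

features = activity_features(env_c1)
asymmetry = sum_diff(env_c1, env_c2)
print(features)
print(asymmetry)
```

Note that (6) sums signed differences, so a positive value indicates higher overall envelope activity on the left electrode of the pair and a negative value on the right one.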

As a result there are 615 features extracted for every epoch. The resulting decision table also includes the task number, the person number, and the decision (T0, T1 or T2).

This multidimensional problem of classifying EEG signals is not straightforward, because personal biological and neurological characteristics significantly influence the values of registered signals and of the features extracted from them. For a pilot study of personal differences two features were selected: \( \mathrm{Mean}_{\mathrm{theta},\mathrm{FCZ}} \) and \( \mathrm{Min}_{\mathrm{gamma},\mathrm{FCZ}} \), with their values plotted for three persons in Fig. 2. The results of the first person exhibit a good separation of left and right motion classes, in contrast to the two others, whose values lie in different ranges, have different variations, and form overlapping classes. This reveals \( \mathrm{Mean}_{\mathrm{theta},\mathrm{FCZ}} \) to be applicable for person S053 but not for the others.

Fig. 2 Pilot study of attributes \( \mathrm{Mean}_{\mathrm{theta},\mathrm{FCZ}} \) (horizontal axis) and \( \mathrm{Min}_{\mathrm{gamma},\mathrm{FCZ}} \) (vertical axis) for L/R classes separation (dots and crosses, respectively): a person S053, good separation, wide range of values; b person S022, poor separation, wide range of values; c person S021, poor separation, narrow range of values

In the following data classification (Section 5) every person is treated separately, thus for every task a new classifier is created with a different subset of useful and informative features.

5 Data classification procedure

Data classification was performed in the R programming environment [13] with the RoughSets package [36]. R is a mathematical calculation environment similar to MATLAB, offering data importing, scripted processing, and visualization, extensible by numerous additional libraries and packages.

Rough set theory was created by Polish mathematician Zdzisław Pawlak [32]. It is used to approximate a set by its upper and lower approximations: the first including objects that may belong to the set, and the latter including objects that surely belong to the set. Both approximations are expressed as unions of so called atomic sets containing indiscernible objects with the same values of attributes (Fig. 3).

Fig. 3 Partition of the universe based on attributes \( a_1 \) and \( a_2 \) into atomic sets, and approximation of the decision set \( X_d \)

Two objects x and y characterized by attributes P ⊆ A (P is a subset of all possible attributes in A) are in the indiscernibility relation if: (x, y)∈IND(P), where IND(P) is the equivalence relation defined as (7):

$$ \mathrm{IND}\left(\mathbf{P}\right)=\left\{\left(\mathrm{x},\mathrm{y}\right)\in {\mathbf{U}}^2|\forall a\in \mathbf{P}, a(x)= a(y)\right\}, $$
(7)

where a(x) is a value of attribute a of object x. In this work P is a set of 615 features introduced in Section 4.3, and objects x are particular epochs (EEG signals) from a given person performing a given task.

All objects in relation with x form an equivalence class \( [x]_{\mathbf{P}} \). If P contains attributes sufficient for distinguishing between objects with different decisions, then the class \( [x]_{\mathbf{P}} \) contains only objects with the same decision as x, so the lack of distinction between objects inside an equivalence class does not harm classification accuracy. Thus, the considered set of attributes P generates a partitioning of the universe of discourse U into atomic sets, the building blocks for representing rough sets, e.g. decision classes.
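The partitioning induced by IND(P) can be illustrated with a short Python sketch. It is not the RoughSets implementation; the function name, the toy epochs, and their attribute values are hypothetical and assume already discretized attributes.

```python
from collections import defaultdict

def equivalence_classes(objects, attributes):
    """Group objects that have identical values on all given attributes."""
    classes = defaultdict(set)
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(name)
    return list(classes.values())

# Four toy epochs described by two discretized attributes.
epochs = {
    "e1": {"a1": 0, "a2": 1},
    "e2": {"a1": 0, "a2": 1},
    "e3": {"a1": 1, "a2": 1},
    "e4": {"a1": 1, "a2": 0},
}

partition = equivalence_classes(epochs, ["a1", "a2"])
coarser = equivalence_classes(epochs, ["a2"])
print(partition)
print(coarser)
```

With both attributes, e1 and e2 are indiscernible and form one atomic set, while e3 and e4 stay separate. Dropping a1 merges e3 into the class of e1 and e2, which shows how a smaller attribute set yields a coarser partition.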

The set of all objects with decision d ∈ {T0, T1, T2} is denoted as \( X_d \). Then \( X_d \) can be approximated by the lower approximation \( \underline{P}X_d \) (8):

$$ \underline{P}{\mathbf{X}}_d=\left\{ x\ |\ {\left[ x\right]}_{\mathbf{P}}\subseteq {\mathbf{X}}_d\right\}, $$
(8)

namely, the set of all objects x whose equivalence classes \( [x]_{\mathbf{P}} \) are included within the decision class of interest \( X_d \). It can be interpreted as the set of objects whose attribute values allow for precise classification.

On the other hand, the set of objects \( \overline{P}X_d \) is called the upper approximation and is defined as:

$$ \overline{P}{\mathbf{X}}_d=\left\{ x|\ \left({\left[ x\right]}_{\mathbf{P}}\cap {\mathbf{X}}_d\right)\ne \varnothing \right\}, $$
(9)

It includes all objects whose equivalence class has a non-empty intersection with the decision class \( X_d \). It can be interpreted as the set of objects whose attribute values point to the decision of \( X_d \), although some indiscernible object(s) can have another decision as well (Fig. 3).

A given subset of attributes P can be sufficient to generate such a partitioning of the universe U that decision classes are correctly approximated. The accuracy of the rough set approximation is expressed as:

$$ {\alpha}_P\left({X}_d\right)=\frac{\left|\underline{P}{X}_d\right|}{\left|\overline{P}{X}_d\right|} $$
(10)

where \( \alpha_P(X_d) \in [0,1] \), and \( \alpha_P(X_d) = 1 \) holds for a precisely defined crisp set.
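Definitions (8) to (10) translate into a few set operations. The following Python sketch is illustrative only; the function names and the toy atomic sets are hypothetical, and the equivalence classes are assumed to be given.

```python
def approximations(classes, decision_set):
    """Lower (8) and upper (9) approximation of a decision set."""
    lower, upper = set(), set()
    for cls in classes:
        if cls <= decision_set:   # class entirely inside X_d
            lower |= cls
        if cls & decision_set:    # class overlaps X_d
            upper |= cls
    return lower, upper

def approximation_accuracy(classes, decision_set):
    """Accuracy (10): size of the lower over size of the upper approximation."""
    lower, upper = approximations(classes, decision_set)
    if not upper:
        raise ValueError("decision set must be non-empty")
    return len(lower) / len(upper)

# Toy universe of five epochs split into three atomic sets.
atomic_sets = [{"e1", "e2"}, {"e3", "e4"}, {"e5"}]
x_t1 = {"e1", "e2", "e3"}   # epochs whose decision is T1

lower, upper = approximations(atomic_sets, x_t1)
alpha = approximation_accuracy(atomic_sets, x_t1)
print(sorted(lower), sorted(upper), alpha)
```

Here e3 shares its atomic set with e4, which has another decision, so both fall only into the upper approximation and the accuracy drops to 2/4 = 0.5.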

Application of rough set theory in decision systems often requires a minimal (the shortest) subset of attributes RED ⊆ P, called a reduct, resulting in the same quality of approximation as P. Numerous algorithms for calculating reducts are available, and for this work a greedy heuristic algorithm is applied [19].
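To give an intuition of how a greedy heuristic can search for a reduct, a generic forward-selection sketch in Python is shown below. It is an assumption made for illustration and is not necessarily the algorithm of [19]: attributes are added one by one, each time choosing the one that maximizes the number of objects whose equivalence class carries a single decision. All names and the toy table are hypothetical.

```python
from collections import defaultdict

def positive_region_size(rows, attributes):
    """Number of objects whose equivalence class has a single decision."""
    groups = defaultdict(list)
    for values, decision in rows:
        groups[tuple(values[a] for a in attributes)].append(decision)
    return sum(len(d) for d in groups.values() if len(set(d)) == 1)

def greedy_reduct(rows, attributes):
    """Add attributes greedily until the quality of the full set is reached."""
    target = positive_region_size(rows, attributes)
    chosen = []
    while positive_region_size(rows, chosen) < target:
        best = max(
            (a for a in attributes if a not in chosen),
            key=lambda a: positive_region_size(rows, chosen + [a]),
        )
        chosen.append(best)
    return chosen

# Toy decision table: the decision depends on attribute a2 only.
table = [
    ({"a1": 0, "a2": 0, "a3": 1}, "T1"),
    ({"a1": 0, "a2": 1, "a3": 1}, "T2"),
    ({"a1": 1, "a2": 0, "a3": 0}, "T1"),
    ({"a1": 1, "a2": 1, "a3": 0}, "T2"),
]
reduct = greedy_reduct(table, ["a1", "a2", "a3"])
print(reduct)
```

A greedy search of this kind preserves the quality of approximation of the full attribute set but does not guarantee that the returned subset is the shortest possible one.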

The dataset is divided randomly into training and testing sets, with a ratio of 65/35 chosen arbitrarily after some pilot experiments. These sets contain 1228 and 662 signals, respectively, for a single person performing a particular task.

Usually, for attributes with continuous values, discretization is performed prior to reduct calculation. The maximum discernibility (MD) algorithm is applied, which analyses the attribute domain, sorts the values present in the training set, takes all midpoints between consecutive values, and finally returns the midpoint maximizing the number of correctly separated objects of different classes. It is repeated for every attribute. Discretization limits the number of possible values: for attributes in this study there are 1 to 2 cuts, splitting the values into 2 to 3 discrete ranges, respectively.
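The selection of a single cut can be sketched as below. This is a simplified Python illustration of the idea described above, not the RoughSets implementation; the function names and the toy values are hypothetical.

```python
def best_cut(values, labels):
    """Midpoint that separates the most pairs of objects from different classes."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    def discerned_pairs(cut):
        left = [l for v, l in zip(values, labels) if v < cut]
        right = [l for v, l in zip(values, labels) if v > cut]
        return sum(1 for a in left for b in right if a != b)

    return max(midpoints, key=discerned_pairs)

def discretize(values, cut):
    """Replace continuous values by the index of their range (0 or 1)."""
    return [0 if v < cut else 1 for v in values]

# Toy attribute: low values for T1 epochs, high values for T2 epochs.
attribute = [1.0, 2.0, 3.0, 8.0, 9.0]
decisions = ["T1", "T1", "T1", "T2", "T2"]

cut = best_cut(attribute, decisions)
codes = discretize(attribute, cut)
print(cut, codes)
```

The candidate cuts are 1.5, 2.5, 5.5 and 8.5; the cut at 5.5 separates all 6 pairs of objects with different decisions and is therefore selected.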

Once the reduct is obtained and the attributes useful for the particular classification task are known, all cases in the training set are analysed, and decision rules are generated. The values of attributes \( a_n \in \mathrm{RED} \) of each object \( x_i \) are treated as an implication’s antecedent, and the decision as its consequent. Rules in the form of logic sentences are obtained (11):

$$ \mathrm{IF}\ {a}_1\left({x}_{\mathrm{i}}\right)={v}_1\ \mathrm{AND}\ \dots\ \mathrm{AND}\ {a}_{\mathrm{n}}\left({x}_{\mathrm{i}}\right)={v}_{\mathrm{n}}\ \mathrm{THEN}\ d\left({x}_{\mathrm{i}}\right)={d}_{\mathrm{i}}. $$
(11)

At the classification phase these rules are applied for every object in the testing set, and then the decision is predicted, to be compared with the actual one.
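Rule generation and application can be sketched in Python as follows. This is an illustration under simplifying assumptions and not the RoughSets implementation: conflicting rules are resolved by keeping the first decision seen, and objects matching no rule receive a default decision. Both choices, as well as all names and data, are hypothetical.

```python
def generate_rules(training, reduct):
    """Map each antecedent (attribute values on the reduct) to a decision."""
    rules = {}
    for values, decision in training:
        antecedent = tuple((a, values[a]) for a in reduct)
        rules.setdefault(antecedent, decision)  # assumption: first decision wins
    return rules

def classify(values, rules, reduct, default=None):
    """Apply the matching rule, or return the default if none matches."""
    antecedent = tuple((a, values[a]) for a in reduct)
    return rules.get(antecedent, default)

reduct = ["a1"]
training = [
    ({"a1": 0, "a2": 1}, "T1"),
    ({"a1": 1, "a2": 1}, "T2"),
]
testing = [
    ({"a1": 1, "a2": 0}, "T2"),
    ({"a1": 0, "a2": 0}, "T1"),
    ({"a1": 2, "a2": 0}, "T1"),  # value of a1 never seen in training
]

rules = generate_rules(training, reduct)
predictions = [classify(values, rules, reduct) for values, _ in testing]
correct = sum(p == d for p, (_, d) in zip(predictions, testing))
accuracy = correct / len(testing)
print(predictions, accuracy)
```

The first two test objects match the rules "IF a1 = 1 THEN T2" and "IF a1 = 0 THEN T1"; the third matches no rule and is counted as misclassified, giving an accuracy of 2/3.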

As explained in Section 4, there is a need to treat each person and each task separately, because personal characteristics differ significantly from one another. This was confirmed by training a single classifier for L/R movement for all 106 persons, which resulted in a classification accuracy of 0.5, equal to a random assignment to the class. Therefore, a new classifier is trained for each person and each task.

With regard to the rough set methodology presented above, the classifier is created and applied in the following steps:

  1. Importing data by selecting recordings of a particular person performing a given task (classification scenarios A, B, C, and D introduced in Section 3).

  2. Selecting subsets of objects \( x_i \) for 12 classification scenarios based on the type of the decision:

    a. Aall, Ball, Call, Dall – for classification of all 3 event classes: \( X_{\mathrm{T0}} \), \( X_{\mathrm{T1}} \), \( X_{\mathrm{T2}} \);

    b. Amotion, Bmotion, Cmotion, Dmotion – for discerning between “rest” (objects \( x_i \) with decision T0, constituting a decision set \( X_{rest} = \{x_i: d(x_i) = \mathrm{T0}\} \)) and “motion” (T1 and T2 combined into one decision set \( X_{motion} = \{x_i: d(x_i) = \mathrm{T1} \vee d(x_i) = \mathrm{T2}\} \));

    c. ALR, BLR, CUD, DUD – for discerning between left/right and up/down motion only (only T1 and T2 classes, in sets \( X_L \), \( X_R \), \( X_U \), \( X_D \)).

  3. Discretization of attributes \( a_n(x_i) \) by the MD algorithm described above.

  4. Splitting data randomly in proportions 65/35 into training and testing sets, as explained earlier.

  5. Deriving a reduct RED ⊆ P based on attributes \( a_n \) of objects \( x_i \) from the training set.

  6. Calculating rules by using the reduct RED (11) and actual decisions from the training set.

  7. Classifying the testing set by applying rules from the previous step.

  8. Employing cross-validation by repeating steps 4–7 20 times.

This process is performed for all 106 persons for 12 classification scenarios, and 20 cross-validation runs, resulting in 25,440 iterations.
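The size of this experiment follows directly from the numbers above, as the short Python check below shows. The scenario labels are those listed in step 2; the integer truncation used for the 65/35 split is an assumption that happens to reproduce the reported set sizes.

```python
PERSONS = 106
RUNS = 20  # cross-validation repetitions of steps 4-7

# The 12 classification scenarios from step 2.
SCENARIOS = (
    [f"{task}{suffix}" for suffix in ("all", "motion") for task in "ABCD"]
    + ["ALR", "BLR", "CUD", "DUD"]
)

iterations = PERSONS * len(SCENARIOS) * RUNS

# 65/35 split of the 1890 signals of one person and task (truncation assumed).
signals = 1890
train_size = signals * 65 // 100
test_size = signals - train_size

print(len(SCENARIOS), iterations)
print(train_size, test_size)
```

The result is 12 scenarios and 25,440 iterations in total, with 1228 training and 662 testing signals per split.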

In the described research three variants of parameters sets P were examined:

  1. \( P_{615} \) with all 615 features.

  2. \( P_{50} \) with the parameters most frequently used in classification rules from the first variant. Reducts from all iterations of the given classification scenarios were analyzed for frequency of parameters, and the top 50 were used instead of 615 to repeat the experiment (see: Appendix). Thus it is verified whether a limited number of parameters is sufficient for an accurate description of the differences between classes.

  3. \( P_{C3C4} \) with 120 parameters obtained only from signals from electrodes C3 and C4, as these were reported in other research to be the most significant for motion classification [25, 33, 41], verifying whether limiting the region of interest to two regions of the motor cortex decreases accuracy.

6 Classification results

The aforementioned three variants of P were used in rough set classifiers: reducts RED were calculated, and rules were derived and applied to the testing sets. Accuracies from the 20 cross-validation runs for all 106 persons were collected and are shown in Fig. 4.

Fig. 4 Accuracy for 12 classification tasks and 3 parameter sets: white, \( P_{615} \) with 615 parameters; light gray, \( P_{50} \) with the top 50 parameters; dark gray, \( P_{C3C4} \) with parameters from electrodes C3 and C4

The obtained classification accuracies were grouped into quartiles and plotted as Tukey box-and-whisker plots [44], where the bottom and top of each box denote the first and third quartile (Q1 and Q3), the thick line inside the box represents the median (second quartile), and the whiskers cover values within 1.5 times the inter-quartile range (IQR = Q3 − Q1) from the box, i.e. the lower whisker limit is Q1 − 1.5·IQR and the upper one is Q3 + 1.5·IQR. Values outside the range of the whiskers are treated as outliers and marked with circles.
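The quantities drawn in such a plot can be computed as in the Python sketch below. It is illustrative only: the accuracies are made-up numbers, the function name is hypothetical, and the quartile interpolation method is an assumption (plotting tools differ slightly in how quartiles are estimated). The whisker ends are taken as the most extreme data values that still lie within the limits.

```python
import statistics

def tukey_summary(values):
    """Quartiles, whisker ends and outliers of a Tukey box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    inside = [v for v in values if low_limit <= v <= high_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # lowest value within the limits
        "whisker_high": max(inside),   # highest value within the limits
        "outliers": sorted(v for v in values if v < low_limit or v > high_limit),
    }

# Made-up accuracies of one scenario for seven persons.
accuracies = [0.5, 0.8, 0.85, 0.9, 0.95, 1.0, 0.9]
summary = tukey_summary(accuracies)
print(summary)
```

In this example the accuracy of 0.5 lies below Q1 − 1.5·IQR and is reported as an outlier, while the whiskers end at 0.8 and 1.0.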

It can be observed that applying \( P_{615} \) to classification (Fig. 4, white boxes) generally brings the best results. Limiting the parameter set to \( P_{50} \) or \( P_{C3C4} \) (Fig. 4, gray boxes) results in a decrease of accuracy of ca. 0.1, without a significant difference between these two. In numerous cases \( P_{C3C4} \) is able to yield accuracy slightly higher than \( P_{50} \).

Accuracy equal to 1.0 is rarely achieved in classification of three classes (scenarios Aall, Ball, Call, and Dall). Classification scenarios Amotion, Bmotion, Cmotion, and Dmotion, aimed at distinguishing between rest and any type of motion, result in significantly higher accuracy than the 3-class cases. Finally, results for scenarios ALR, BLR, CUD, and DUD are the highest, and accuracy is often equal to 1.0. Thus in real applications it is recommended to implement two classification stages: first recognizing the presence of movement, and then determining whether it is left or right, or up or down.

Real motion (scenarios A and C) is slightly easier to classify than imagined motion (scenarios B and D) with any of the proposed parameter sets. This can be explained by the inability to perform this strictly mental task in a reproducible manner, and by participants’ fatigue.

There is no significant difference in accuracy for left/right (scenarios A and B) compared to up/down motions (scenarios C and D), except for scenario CUD (real up/down motion), which has the highest accuracy for \( P_{615} \).

The decision rules generated by rough set classifiers involve a reduct RED containing a varying number of parameters from the set P. The classification of three classes requires the longest rules (Fig. 5a), with 6 or 7 parameters. In motion detection (Fig. 5b) rules are shorter, and the distinction between left and right motion requires the shortest rules. For the \( P_{50} \) subset the number of longer rules is larger than for \( P_{615} \), as the classification requires a more detailed description of every object.

Fig. 5 Rule lengths for classification: a 3 classes by \( P_{615} \), b rest and motion by \( P_{615} \), c left and right and up and down by \( P_{615} \), d 3 classes by \( P_{50} \), e rest and motion by \( P_{50} \), f left and right and up and down by \( P_{50} \)

It is not possible to define a universal set of parameters for all participants, as every parameter appears in rules infrequently. The top parameters can be found in 3–5% of rules (Fig. 6) often matching a single person.

Fig. 6 Percent of rules including top parameters for: a \( P_{615} \) set, b \( P_{50} \) set

7 Conclusions

The methodology of signal pre-processing, parametrization, feature selection and creating a rough set-based classifier for recognition of real and imagined motion from EEG signals was presented. For each person the training and classification process must be repeated, because each case differs in electrode placement, signal registration conditions, hair and skin characteristics, level of stress and fatigue, manner of performing the imaginary motion, etc.

It can be observed that the task of accurate classification of motion is easier for real motion than for imagined one. It can be justified by the fact that actual motion performance is manifested in more consistent manner in the brain activity, compared to a strictly mental activity of imagining the motion without any helpful feedback (visual or sensory).

The total accuracy for classification scenario ALR (real motion) is 0.87 and for BLR (imaginary motion) is 0.88, which is an improvement over other published research, e.g.: accuracy of 0.7 obtained in left/right navigation by comparing signals from electrodes C3 and C4 [41]; accuracy up to 0.77 in a virtual walking task, achieved by extracting band powers of signals from C3, CZ, and C4 and classifying with LDA [25]; classification of motion imagery with accuracy up to 0.85 on the same set of electrodes, additionally employing the cited authors’ algorithm of iterative Relief based on distance from center for feature selection, and classification with SVM [38]; and accuracy up to 0.86 in a task of left/right hand movement, employing feature selection with ANN and genetic algorithms [51]. Moreover, the results were achieved without employing any dedicated statistical methods described in the literature, such as ICA or SSP: neither elimination of blink and heartbeat artefacts nor signal improvement methods were required.

The presented method can be employed in a simple yet practical system for motion classification by EEG signal analysis. It opens a way to creating multimedia applications controlled by 5 states: rest, and left, right, up, and down motion intent. The navigation would then be limited to, for example, changing the category (left, right), selecting a subcategory or confirming an option (down), and going back in a hierarchy or cancelling the option (up). A correct classification of the resting state provides the capability to work in the asynchronous mode, where the user can switch between inaction and actions at any moment.

Future work will focus on verifying this approach on other EEG sensor setups that employ a low number of sparsely positioned electrodes.