IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, VOL. 52, NO. 4, AUGUST 2022

Modified Spatio-Temporal Matched Filtering for Brain Responses Classification

Marian P. Kotas, Michal Piela, and Sonia H. Contreras-Ortiz

Abstract—In this article, we apply the method of spatio-temporal filtering (STF) to electroencephalographic (EEG) data processing for brain responses classification. The method operates similarly to linear discriminant analysis (LDA) but, contrary to most applied classifiers, it uses the whole recorded EEG signal as a source of information instead of only the precisely selected brain responses. This way, it avoids the limitations of LDA and improves the classification accuracy. We emphasize the significance of the STF learning phase. To preclude the negative influence of super-Gaussian artifacts on accomplishment of this phase, we apply a discrete cosine transform (DCT) based method for their rejection. Later, we estimate the noise covariance matrix using all data available, and we improve the STF template construction. The further modifications are related to the constructed filters' operation and consist in changes of the STF interpretation rules. Consequently, a new tool for evoked potentials (EPs) classification has been developed. Applied to the analysis of signals stored in a publicly available database, prepared for the assessment of modern algorithms aimed at EP detection (in the frames of the 2019 IFMBE Scientific Challenge), it allowed us to achieve the second best result, very close to the best one, and significantly better than the results achieved by the other contestants of the challenge.

Index Terms—Brain–computer interfaces (BCI), discrete cosine transform (DCT), generalized matched filtering (GMF), spatio–temporal filtering (STF), visual evoked potentials (EPs).

Manuscript received July 9, 2021; revised January 29, 2022 and March 18, 2022; accepted March 31, 2022. Date of publication May 5, 2022; date of current version July 14, 2022. This work was supported in part by statutory funds of the Department of Cybernetics, Nanotechnology and Data Processing, Silesian University of Technology, BK-2022, and co-financed by the European Union through the European Social Fund under Grant POWR.03.02.00-00-I029. This work was performed using the infrastructure supported by Grant POIG.02.03.01-24-099/13: GeCONiI, Upper Silesian Center for Computational Science and Engineering. This article was recommended by Associate Editor T. H. Falk. (Corresponding author: Marian P. Kotas.)

Marian P. Kotas and Michal Piela are with the Department of Cybernetics, Nanotechnology and Data Processing, Silesian University of Technology, 44-101 Gliwice, Poland (e-mail: mkotas@polsl.pl; michal.piela@polsl.pl).

Sonia H. Contreras-Ortiz is with the Biomedical Engineering Program, Universidad Tecnológica de Bolívar, 150003 Cartagena, Colombia (e-mail: scontreras@utb.edu.co).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/THMS.2022.3168421.

Digital Object Identifier 10.1109/THMS.2022.3168421

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION

CAPABILITY for quick and reliable classification of brain responses plays a key role in designing modern and efficient brain-computer interfaces (BCI). Responses of the brain in BCI are triggered by external (visual, auditory) or internal (mental simulation of a physical action) stimuli and are therefore called evoked potentials (EPs). Among the group of brain responses to a visual stimulus, the P300 is characterized by the greatest amplitude and is therefore considered a suitable measure of brain activity during visual tasks [1]–[8]. The idea to employ this potential as a control signal in BCI was first introduced in 1988 by Farwell and Donchin [1]. Since then, the major challenge has remained the same: the very low energy of the desired EP in reference to the spontaneous activity of the brain. Consequently, its proper classification requires multiple presentations of the same stimulus and averaging of the consecutive responses of the brain. This, in turn, prolongs the time of an experiment and thereby the period of high concentration a user needs to maintain. It is inconvenient, particularly for people whose medical condition affects their ability to stay focused long on a mental or visual task.

Therefore, an effort is made to shorten this time by reducing the number of averaging steps necessary to recognize an EP. Development of new, more effective classifiers is one of the ways to achieve this objective. The aim of the classifiers is to discriminate between two types of brain responses: target, i.e., triggered by the stimuli the user is focused on, and nontarget, i.e., caused by the other stimuli, which the user is trying to neglect. Popular algorithms that assure good performance are based on support vector machines [9], [10], ensembles of weighted support vector machines [11], [12], linear discriminant analysis (LDA) [13], stepwise LDA [14], shrinkage LDA [15], Fisher's and Bayesian LDA (FLDA, BLDA) [16], [17], sparse Bayesian LDA [18], and regularized group sparse LDA [19].

While the above listed methods perform an outright classification of the brain responses stored in signal matrices, there are also methods which precede the classification by prior signal enhancement and feature extraction using spatial (the xDAWN algorithm [20]) or spatial and temporal projections [21]. In [22], a two-step projection using spatial LDA and temporal canonical correlation analysis was proposed to form feature vectors for later classification; a further extension of this approach was described in [23].

Although the linear methods appeared to be advantageous in application to EP detection [24], there are attempts to utilize also nonlinear ones, e.g., convolutional neural networks [25]–[27], self-organizing fuzzy neural networks [28], or extreme learning machines [29], [30]. Particularly fast development can be observed in applications of convolutional neural networks. In [25], the first convolutional layer performs spatial filtering and the second, the temporal one; later, a fully connected multilayer perceptron is employed for final classification. The authors achieved good results using an ensemble of such classifiers (called a multiclassifier system). In [31], the order of spatial and temporal filtering was reversed. This architecture was also adopted in [32]. Similarly in [33] and [34], where the convolutional neural network was preceded by additional preprocessing steps to improve the classification results.

Spatio-temporal filtering (STF) is an intrinsically linear method, applied, e.g., to detection of low amplitude signals of repeatable morphology embedded in high energy noise [35], [36], to motor imagery (MI) tasks classification [37]–[40], or to related operations like selection of the proper EEG channels [41] or extraction of movement related potentials [42] for later MI classification. The alternating spatial-temporal optimization and projection was applied to event related potentials classification in [43].

Fig. 1. Illustration of BCI experiment; a moment when visual stimulus (target) is captured.

We have applied STF to tackle the fundamental problem
of LDA based approaches to EP classification, i.e., the singularity of the within-class covariance matrix, which has to be inverted [16]. This singularity usually results from a large dimension of the feature vector (equal to the product of the number of time samples covering a brain response and the number of channels recorded) and a limited number of the brain responses used for the classifiers' learning. Since the within-class dispersion results mostly from the presence of noise in the feature vectors, this problem can be seen as an issue of noise properties estimation. Accomplishing this operation on the basis of target and nontarget responses only seems to be unjustified. We propose to reject this restriction and perform the noise covariance matrix estimation using different time segments of the EEG signals, not necessarily those corresponding to the precisely selected brain responses. Moreover, taking into account that STF is most effective in colored Gaussian noise (CGN) [44], we propose to reject super-Gaussian artifacts for better estimation of the CGN properties. This improves the EP classifiers' learning without the necessity of inversion regularization [16] or feature vector dimension reduction [20], [21].

A further improvement can be achieved if we face the fact that brain responses can have different time delays with respect to the stimuli (flashes), and it is not always justified to apply the classifiers to the same time segments after the stimuli.

The rest of this article is organized as follows. Section II explains the operation of a BCI system. Section III provides the detailed description of the original STF method; its modifications are presented in Section IV. Results of experiments and their discussion are presented in Section V. Finally, Section VI concludes this article.

II. EVOKED POTENTIALS (EPS) BASED BCI

During EPs based BCI experiments, a subject faces a screen filled with different objects (letters, signs or pictures, etc.) and is focused on the selected one. Fig. 1 shows an example where the user interface contains six household appliances, and the user focuses on the lamp. The six objects are flashed one at a time in a random order. The flash of the item the user is focused on (the lamp) is considered the target event; flashes of the other items are the nontarget ones. We expect the target to elicit higher responses of the brain than the nontargets. Once all the objects have been flashed (one target and five nontargets), we can compare the elicited brain responses. It is accomplished by comparing the outputs of the classifier applied to these six events. The one with the highest score indicates the picture classified as a target. However, because of the very low amplitude of the brain responses to the visual stimuli, it is very rare to achieve proper discrimination at this step of the experiment.

Therefore, this part of the experiment, the so-called run (of six flashes), when each visual stimulus occurs only once, is insufficient. Multiple repetition of such runs of flashes is required. In each consecutive run, the order of pictures flashing is random, independent from the other runs. This way of presentation involves an element of surprise, necessary to elicit an EP. For every individual picture, the outputs obtained in the consecutive runs are averaged. Typically, taking into account more subsequent runs, the averaged response to a target stimulus is increased when compared to the averaged responses to the nontarget ones. Consequently, increasing the number of runs analyzed should increase the classification accuracy.

The described series of runs, related to one particular target object, is called a block. During the experiments, this process (block) is usually repeated for every object (on the screen) regarded as a target. All blocks executed form a session. In the considered example, a session contains six blocks.

In the experimental section, we will analyze the dataset BCIAUT-P300, recorded during similar experiments. It was created for the purpose of the 2019 IFMBE Scientific Challenge [32]. The dataset contains recordings of 15 subjects diagnosed with autism spectrum disorder. The structure of the data is as follows: there are seven learning and seven testing sessions recorded by every subject. Each learning session contains 20 blocks. During each of the blocks, the user is focused on one picture (the one selected as a target, out of eight available on the user interface). Every block is built of 10 runs, each of which gathers single brain responses to all possible events (one target and seven nontargets). The testing subset is organized in a similar manner; however, the number of runs per block varies between 4 and 10. The EEG signals recorded are sampled with the frequency of 250 Hz, notch-filtered at 50 Hz, and bandpass-filtered between 2 and 30 Hz. Signals were acquired using an eight-channel configuration (electrodes were positioned at C3, Cz, C4, CPz, P3, Pz, P4, POz).

III. SPATIO-TEMPORAL FILTERING

Let us denote the m channel signal vector as x(k) = [x1(k), x2(k), ..., xm(k)]^T. To exploit the spatial and temporal relationships between the signal components, we have defined, similarly as in [35], the following vector representation of the multichannel signals:

x(k) = [x1(k − J·τ), x1(k − (J−1)·τ), ..., x1(k + (J−1)·τ), x1(k + J·τ), ..., xm(k − J·τ), xm(k − (J−1)·τ), ..., xm(k + (J−1)·τ), xm(k + J·τ)]^T (1)

containing 2J + 1 time samples from each of the m channels available (the x(k) vector length is p = (2J + 1)m).

Parameter τ is introduced to perform a kind of signal resampling (from the original frequency fs to fs/τ, applied in the spatio-temporal vectors). We assume that this new frequency should not be smaller than 40 Hz; thus, parameter τ can be calculated according to τ = ⌊fs[Hz]/40⌋, where ⌊·⌋ denotes the largest integer not greater than the argument.

The filtering operation can be expressed as

y(k) = h^T x(k) (2)

and if we want to perform detection of some signal pattern of a finite length Tp, the time interval 2·J·τ should not be shorter. Thus, parameter J can be calculated according to J = ⌈fs·Tp/(2·τ)⌉, where ⌈·⌉ denotes the smallest integer not smaller than the argument.

A. STF Construction

Let us denote the moments when the target object is flashed as ki, i = 1, 2, ..., It (It denotes the total number of these flashes). We assume that the brain response to the ith flash begins at ki and ends at ki + 2·J·τ. This time segment corresponds to the spatio-temporal vector x(ki + J·τ). For this vector, the filter should produce the maximal output value. Since the slope of the output signal is limited, more of its neighboring time samples will be magnified (..., y(ki + J·τ − 1), y(ki + J·τ), y(ki + J·τ + 1), ...). Thus, to construct the filter, we first specify the following set indicating the spatio-temporal vectors that should be maximized:

Ψt(δ) = {k : |k − (ki + J·τ)| ≤ δ·fs, i = 1, 2, ..., It} (3)

where δ denotes the assumed time range.

We want the filter to be very sensitive to the corresponding set of vectors (x(k), k ∈ Ψt(δ)). On the other hand, we want it to produce as low an output as possible at other moments of time. Therefore, we create an auxiliary set Ψt(Δ) with argument Δ > δ. On this basis, we create the set corresponding to the spatio-temporal vectors that should be suppressed:

Ψsupp(Δ) = Ψ − Ψt(Δ) (4)

where Ψ denotes the set of all time indices for which vectors x(k) have been constructed. Thus, Ψsupp(Δ) indicates all vectors x(k) whose time location is rather distant from the vectors maximized.

Finally, we create the following objective function:

Q(h) = [(1/|Ψt(δ)|) Σ_{k∈Ψt(δ)} (h^T x(k))²] / [(1/|Ψsupp(Δ)|) Σ_{k∈Ψsupp(Δ)} (h^T x(k))²] (5)

where |·| denotes the set cardinality.

We gather the spatio-temporal vectors that are maximized in matrix A of size p×na, A = [a1, a2, ..., a_na], and those which we want to suppress in matrix B of size p×nb, B = [b1, b2, ..., b_nb], where na = |Ψt(δ)| and nb = |Ψsupp(Δ)|. This way, we obtain the following function:

Q(h) = [(1/na)·‖h^T A‖²] / [(1/nb)·‖h^T B‖²] = (h^T Ca h) / (h^T Cb h) (6)

where Ca = (1/na)·A·A^T and Cb = (1/nb)·B·B^T.

Function Q(h) has the form of a Rayleigh quotient, whose maximization can be achieved using the generalized eigendecomposition

Ca h = λ Cb h. (7)

The solution we are interested in is the generalized eigenvector e1 that corresponds to the greatest eigenvalue. Here, we face the problem of the eigenvector sign ambiguity [45]. We solve it by using either h = e1 or h = −e1, so that the average response of the filter to the learning vectors stored in matrix A is positive. This original version of the STF method [based on eigendecomposition (7)] will further simply be denoted as OSTF.

B. STF Interpretation

Let us denote the onset of the ith flash of the oth object as k_i^o. The STF filter response to this flash is expected at k_i^o + J·τ (see the explanation in Section III-A). It is therefore sufficient to calculate and store for further analysis the values

q_{i,o} = y(k_i^o + J·τ); o = 1, 2, ..., Io; i = 1, 2, ..., Ir (8)

where Io denotes the number of objects on the screen, and Ir is the number of analyzed runs of flashes.

Thus, the oth column of the formed Ir×Io matrix Q contains the scores obtained for the successive flashes of the oth object. If we want to select the target object on the basis of one run of flashes only (Ir = 1), we simply choose the o for which q_{1,o} has the greatest value. However, we usually take into account more runs (Ir > 1), and we select the column for which the mean of the corresponding Ir values is highest.

The described interpretation rules share an important feature with most classifiers applied to EP discrimination. They select the same located segments of the EEG signal following the stimuli (contained in vectors x(ki + J·τ), i = 1, 2, ...) to perform classification. Therefore, they will further be referred to as the classical decision rules (CDR).

C. Averaging of the Maximized Spatio-Temporal Vectors

In [36], we proposed to calculate the average of the vectors stored in A:

ā = (1/na) Σ_{i=1}^{na} ai. (9)

Objective function (6) can now be replaced by

Q(h) = (h^T (ā·ā^T) h) / (h^T Cb h) (10)

and consequently, the eigendecomposition (7) based formula is substituted by the product

h = Cb⁻¹ ā (11)

which is computationally identical to the one used for construction of the highly acknowledged generalized matched filters [44]. Therefore, this solution will further be referred to as the generalized spatio-temporal matched filtering (GSTMF). GSTMF and OSTF will be regarded as reference methods for the assessment of the modified spatio-temporal matched filter (MSTMF), which we propose in this article.

IV. MODIFIED SPATIO-TEMPORAL MATCHED FILTERING

Although the brain responses to the nontarget flashes are very tiny, in [16] it was shown that taking these responses into account during the classifiers' learning (and not only the target responses) improves discrimination between them. To make use of this finding, we construct set Ψnt(δ) using (3) with ki replaced by k′i, i = 1, 2, ..., Int (corresponding to the moments when the nontarget objects are flashed). On this basis, we form matrix A′ that contains the vectors beginning close to these nontarget flashes (x(k), k ∈ Ψnt(δ)). By averaging these vectors, we construct the average nontarget response ā′.

Now, objective function Q(h) takes the following form:

Q(h) = (h^T (ā − ā′)(ā − ā′)^T h) / (h^T Cb h) (12)

and its solution is given by

h = Cb⁻¹ (ā − ā′). (13)

The EEG signals can contain outlying super-Gaussian artifacts, which can have a detrimental influence on the filter template construction. Because of this, we precede the learning phase of the proposed method with a procedure for outlying artifacts rejection.

A. Rejection of Outlying Artifacts (ROA)

The procedure consists of the operation of outlying artifacts detection and the stage of the detected artifacts' suppression.

1) Detection of Outlying Artifacts: The operation is performed in each channel separately. Its first step involves calculation of the 5th and 95th percentiles of all values in the channel. Then, multiplied by d, these percentiles are regarded as the thresholds that limit the negative and positive values accepted, respectively. In the experimental section, we use d = 3 (during initial experiments, for d ∈ [2, 5], we have observed similar performance of EP classification). In each place where the thresholds are crossed, we search for the signal maximum (or minimum), and starting from this position, we search for the artifact onset and offset. The criterion is based on the signal derivative: in order to find the first and the last sample of the artifact, we search for the positions (to the left and to the right from the extremum, respectively) where the derivative polarity changes. All samples that belong to the found artifacts are marked as outlying; for signal x(k) recorded during a single block, we form matrix Z containing the markers, which classify the samples of x(k) as outlying (0) or proper (1):

z_{i,j} = 0 if xj(i) is outlying, and z_{i,j} = 1 if xj(i) is proper. (14)

Simultaneously, the xj(i) values marked as outlying are considered as missing and are replaced with zeros.

2) Missing Values Interpolation: To this end, we apply the iterative interpolation procedure developed in [47] and facilitated as the MATLAB function inpaintn by Garcia. The procedure is based on the discrete cosine transform (DCT) based smoothing proposed in [48]. The values of signal x(k) are stored in matrix X (subsequent channels in subsequent columns of X). Then, Z defined by (14) and X are submitted to the inpaintn procedure. On the basis of the output matrix X̂ (ŷ{k+1} defined in [47, eq. (20)], where {k+1} denotes the number of the iteration), we obtain the signal with outlying artifacts suppressed, x̂(k).

B. Modification of the Decision Rules

The CDR, described in Section III-B, are less effective if the brain responses to the flashes are of changing delays. To overcome this inconvenience, we propose to calculate the average MSTMF responses to the Ir flashes of each object:

ȳo(k) = Σ_{i=1}^{Ir} h^T x(k_i^o + J·τ + k) = Σ_{i=1}^{Ir} y(k_i^o + J·τ + k). (15)

The maximum of this average response to the oth object is expected at k = 0. However, we search for its true position around this point:

ko = argmax_{|k| ≤ rs} ȳo(k) (16)

where variable rs, limiting the range of the search, can be established using rs = Ts·fs (following the preliminary experiments, Ts = 0.1 s has been applied).

The value ȳo(ko) is taken as the classifier score for the oth object. The object that achieves the highest value is regarded as the target one. This approach to STF interpretation will be referred to as the modified decision rules (MDR).

Fig. 2. Results of outlying artifacts suppression: original 64-channel EEG recording; in bold black: the channels with artifacts detected, in gray: the channels accepted.

Fig. 3. Results of outlying artifacts suppression: the modified 64-channel EEG recording; in bold black: the modified channels, in gray: the unchanged channels. The amplitude scale has been changed with respect to Fig. 2.

C. MSTMF Responses as Feature Vectors

To exploit not only the height of the STF responses to the respective objects but also their shape, we take the values ȳo(ko + k), k = −rs, −rs + 1, ..., rs, as the features of the oth object [rs is the parameter used in (16)]. Those feature vectors can undergo classification for better discrimination between the target and nontarget brain responses. Since the amplitudes of the signals at the MSTMF output can vary substantially, we normalize them by dividing by their standard deviation (SD). This operation is accomplished during each individual classification task (for which the normalizing SD value is calculated on the basis of the signals recorded during all Ir flashes of the respective objects). In the experimental section, the SVM with a linear kernel function will be applied for classification. The obtained combination of methods will simply be denoted as MSTMF+SVM (as opposed to MSTMF+CDR and MSTMF+MDR).

Fig. 4. LDA classifier (subplots A, B) versus MSTMF filter (C, D). Vectors are constructed according to (1); vertical dotted lines mark segments of 2J + 1 = 43 values, corresponding to individual EEG channels (whose names are given in subplot A). From the top: A) the difference between the centers of the target and nontarget classes, B) coefficients of the LDA classifier, C) the difference ā − ā′ (calculated to determine MSTMF), and D) the obtained MSTMF template; a.u. stands for arbitrary units.

V. RESULTS AND DISCUSSION

A. Signals Preprocessing

Both in the learning phase and during the STFs' operation, the processed EEG signals undergo prior band-pass filtering [32]. ROA, which is rather time consuming, is applied during the learning phase only. Figs. 2 and 3 show exemplary results of this operation. Fig. 2 contains an original 64-channel EEG recording with noise artifacts; channels detected as noisy are marked in bold black. Exactly the same time frame of the signal is included in Fig. 3; here, the artifacts are significantly reduced (in bold black are the plots of the channels modified).

In dataset BCIAUT-P300, used in this study, the signals are not stored in a continuous way (where all the channels and time samples are accessible). Instead, for each stimulus (a flash of a single object on the screen), a signal segment of 1.4 s length is written down, beginning 200 ms before the flash onset. Such length is insufficient for outlying artifacts rejection. To make this operation possible, all time segments corresponding to the flashes of 8 objects during 10 runs of flashes are first conflated. After this operation, we perform outlying artifacts rejection.

B. Construction of the Modified Spatio-Temporal Matched Filters

We have set parameters τ and J to 6 and 21, respectively, according to the formulas given below (1) and (2), so that the filter length (2·J·τ/fs) covered approximately a 1 s period. Analysis of such a segment is oriented toward detection of the whole group of EPs, not only the P300.

For each subject, we apply all learning data to calculate the mean target (ā) and nontarget (ā′) brain responses, and the covariance matrix Cb. The exemplary difference ā − ā′ is presented in Fig. 4C, and the final MSTMF filter in subplot D. For reference, we have plotted the results of using LDA (in A and B).

Subplots A and C are very similar, with the plot in C being slightly more smooth (as it is explained in Appendix A, for
4, AUGUST 2022 TABLE I dataset used, Δ has hardly any influence on the classification ACCURACY (± STANDARD ERROR OF MEAN (SEM)) OF TARGET STIMULI D U M P , B LDA, results. Thus, a fixed value of Δ = 0.1 s has been applied. ForETECTION SING THE ETHODS ROPOSED THE AYESIAN AND THE METHODS PARTICIPATING IN 2019 IFMBE SCIENTIFIC CHALLENGE (FOR THEIR parameter δ, we performed a kind of cross validation. The filters PRECISE DESCRIPTION SEE [32]) were constructed on the basis of six learning sessions and applied to the seventh one. After rotation of this validation session and averaging of the results, an estimate of the accuracy was obtained for each value of δ. Following, we could select the one for which the best validation accuracy has been achieved. This value has been used in the final tests. BLDA has been applied using different decimation factors [16], and the best results achieved are presented. In the lower part of the table, for reference, the results of 2019 IFMBE Scientific Challenge are provided. Watching the table, we can notice that using the generalized eigendecomposition (7) based OSTF with the CDR gives rather poor results. The large standard error of mean: SEM=8.01, shows that the method could have been fairly effective for some subjects and must have failed for some other ones. When formula (11) is applied in the learning phase, the filter (GSTMF) performs much better, competitively to BLDA and to most a kind of low-pass filtering is introduced). However, the sophisticated methods competing in the challenge reported.δ > 0 final templates constructed are quite different; although we can The high accuracy of such a relatively simple method must be see some similarity of the plots in B and D, the latter is of much perceived as an outcome of the proper estimation of the noise better quality. 
It seems to be well matched to the local properties covariance matrix Cb: on the basis of the whole information of the brain responses, whereas the former is embedded in wide available, contained in the recorded EEG signals. This has band contaminations, and therefore can be expected to be more resulted in a fairly effective filter but the further modifications susceptible to noise. This substantial difference is primarily of a mean nontarget brain response (ā ′) subtraction, and DCT caused by more effective estimation of the noise covariance based outlying artifacts rejection, applied by MSTMF, have still matrix (if compared to the within class scatter matrix of raised its performance.Cb the LDA classifier). However, it is application of the MDR that has resulted in This can be explained in the following way: is estimated most significant improvement of the classification results. TheCb on the basis of spatio–temporal vectors (k) indicated by set reasons of such an outcome are illustrated in Fig. 5. Presented arex Ψsupp(Δ). As it has been expressed below its definition (4), the the MSTMF responses to a block of nine runs of flashes (each run set indicates vectors whose time location is rather distant from consisting of one target and seven nontarget flashes). Individual the vectors maximized (corresponding to target flashes). Thus, filter responses to the successive nine target flashes are presented almost all vectors corresponding to nontarget flashes can be used in subplot A. Below in subplot B are the filter responses to nine for estimation. As it has already been mentioned, the signals flashes of a single nontarget object; we have selected the oneCb we use for experiments consist of 1.4 s long segments, corre- (among seven) that has produced the highest average nontarget sponding to all individual flashes of the respective objects. With MSTMF response. 
Both in A and B, the plots seem to be the sampling frequency fs = 250 Hz and the assumed values of dominated by high energy responses to the spontaneous EEG. and , applying (1) we construct 98 spatio-temporal vectors A correct classification on the basis of such single responsesJ τ for each of such signal segments. All such vectors corresponding does not seem possible. It is their averaging that is necessary to nontarget flashes are used to estimate and also a part of to achieve the goal, and the average of plots from A is drawnCb vectors corresponding to the target ones (their accepted number with blue color in subplot C. We can observe a high amplitude depends on the assumed value of parameter ). By contrast, peak, whose dominant component is the MSTMF response toΔ only 1 vector corresponding to each individual flash (both target the desired EP. What is surprising, however, is that the peak and nontarget) is used to estimate the within class scatter matrix maximum is significantly shifted to the right from k = 0. Thus, in LDA. application of CDR would cause the classification error: the height at k = 0 of the blue target response (marked by the red cross) is lower than the height of the corresponding nontarget C. Brain Responses Classification one in D. Only the search for the maximum of the filter response We have applied all the methods described: OSTF, GSTMF, leads to the correct classification because the height of the green and MSTMF, and as a particular reference: the BLDA [16], i.e., point in C is greater than the corresponding one in D. Thus, in the method that uses Bayesian regularization to overcome the spite of the rather great height of the average target response, it is LDA limitations (the software has been provided by the authors). only application of MDR that allowed to avoid the classification The results are presented in the upper part of Table I. error. 
This observation is confirmed by the increased classification accuracy of MSTMF+MDR, which has risen to above 90%. Using the MSTMF responses as feature vectors for the SVM classifier has allowed us to improve the results only slightly; nevertheless, the increase of accuracy to nearly 91% is worth noticing.

Fig. 5. Operation of the decision rules (MDR versus CDR): A) individual MSTMF responses to target flashes during a block of the successive Ir = 9 runs of flashes [variable k on the horizontal axis is introduced by (15)], B) the MSTMF responses to Ir flashes of a single nontarget object, C) the average target MSTMF responses (the one in blue is the mean of the plots in A; the black ones correspond to different blocks of flashes), D) the average nontarget MSTMF response (the mean of the plots in B). The double-sided arrow shows the region of the search for the maxima of the MSTMF responses. The red crosses mark the scores obtained for k = 0 (according to the CDR), and the green points, the ones obtained using the MDR (the maxima in the specified range of the search). Additional comments are in the text.

For each of our methods, we had to select the values of parameters δ and Δ. It has, however, appeared that for the …

To verify the statistical significance of the differences between the methods tested, we have compared them pairwise, denoting by f12 the number of cases classified correctly by the first and incorrectly by the second of the compared methods, and by f21 the number of opposite cases. On this basis, we have calculated the squared McNemar statistic [49]

z = (f12 − f21)² / (f12 + f21)    (17)

which obeys the chi-square distribution with one degree of freedom [50].

TABLE II. STATISTICAL SIGNIFICANCE (S – SIGNIFICANT, N – NONSIGNIFICANT) OF THE DIFFERENCES BETWEEN THE METHODS TESTED; LETTER S MEANS THAT THE P-VALUE < 0.001.

The results obtained are presented in Table II. For all the methods tested, we have confirmed the statistical significance of the improvement with respect to the original OSTF method. This confirms the inadequacy of using the eigendecomposition-based approach for such noisy signals.

Comparing MSTMF (applied with the MDR or with the SVM classifier) with the reference method (BLDA), we can notice a statistically significant improvement of the classification accuracy. This seems surprising, because BLDA uses advanced Bayesian regularization for the estimation of the classifier template. However, the need to apply regularization results from the abovementioned limitations of the LDA-based approaches, and although Bayesian regularization can effectively solve the problem of singularity of the scatter matrix to be inverted, it does not solve the problem of the changing delays of the brain responses.

Thus, the most significant improvement of MSTMF accuracy is related to the modification of the decision rules. Averaging of the MSTMF responses to the individual flashed objects, and the search for the maxima of such average responses applied in the MDR, have allowed us to raise the classification accuracy to above 90% and have made the MSTMF method not only significantly better than the reference BLDA, but even comparable to the winner of the 2019 challenge, i.e., to the convolutional neural network based on the EEGNet software [31]. Moreover, compared to the second best competitor of the challenge, MSTMF appeared significantly more effective. This justifies using the proposed modifications, which are aimed at extracting all the information from the whole recorded EEG signal.

TABLE III. COMPUTATIONAL TIMES (± STD).

The introduced modifications inevitably increase the computational costs of the methods. To illustrate this, in Table III we have provided the times necessary to accomplish the learning stage and the operation stage of the selected versions of the STF method. The learning and testing times shown in Table III are the subject-averaged execution times. It should be noted that for each subject the total length of the learning signals is approximately 37.3 min, and the length of the test signals is around 56 min. As far as the learning stage is concerned, we can notice rather great differences between the respective methods. However, the operation stage is much faster and should not preclude any of the studied versions of the STF method from practical applications. For every method mentioned in Table III, the analysis of a 1.4 s long segment of the signal (related to one event) is executed in around 3 ms.

One of the fundamental questions related to the development of real working BCI systems is how many EEG channels are required to achieve satisfactory results. To give some insight into the matter, we have performed the following experiment. For MSTMF, we decreased the number of leads available (using the same cross-validation as for the selection of parameter δ) and calculated the achievable classification accuracy. Fig. 6.A presents the results obtained. It reveals that it is advantageous to measure at least six channels, since for five of them the accuracy discernibly drops. Another question is how densely the brain responses should be resampled, in the constructed spatio-temporal vectors, to achieve good classification results. This is related to the value of parameter τ. In Fig. 6.B, we can notice that for τ growing above the value of 10, the classification accuracy decreases discernibly. The applied value of τ = 6, calculated according to the formula given below (1), seems to be a good compromise between the computational costs of the method and its achievable classification accuracy.

Fig. 6. Illustration of the factors that influence the classification accuracy of the MSTMF-based BCI system: A) accuracy as a function of the number of channels, B) accuracy as a function of τ.

The proper estimation of the noise covariance matrix is highly important. It not only allows us to construct a filter with visually good properties, like the one presented in Fig. 4, but also assures effective suppression of stationary CGN of even extremely high energy. A problem can arise if the noise in the test signals changes its properties with respect to the noise in the learning data. Such a situation must be taken into account, because we cannot assume noise stationarity across different sessions, or even over long time intervals. A kind of remedy could be found if we had a rather large learning database. This would allow us to construct many filters matched to different local noise properties. Then, a weighted sum of the obtained ensemble of filters could be effective for the noise present in the test signals. The development of such an ensemble of filters seems to be a promising direction for future study.

However, we expect that it can be even more promising to apply the MDR in already developed EP classifiers. Taking into account possible changes of the brain response time delays (with respect to the stimuli), and looking for their correct localization, corresponding to the locally highest scores of the classifiers, should help to raise the accuracy of many existing algorithms.

VI. CONCLUSION

In this study, we have proposed the application of spatio-temporal filtering to the classification of visual EPs elicited by target and nontarget stimuli. We have shown that using the whole recorded EEG signal as a source of information, instead of only the identically located brain responses to those stimuli, significantly improves the classification accuracy.

There are two major factors that contribute to this outcome. The first consists in the appropriate construction of the STF, and the second, in the proper interpretation of the filtering results. To meet the first requirement, we have rejected the limitation of using only the target and nontarget brain responses to estimate the noise covariance matrix. Instead, we have used all the data satisfying the necessary conditions, solving in this simple way the problem of singularity of the matrix to be inverted. Moreover, we have proposed outlying artifacts rejection before the filter construction, which has further improved its performance. Applied with the CDR, the developed MSTMF filter has allowed us to obtain a relatively high accuracy of EP classification.

To meet the second requirement, we have addressed the problem of the changing delays of the brain responses with respect to the visual stimuli. A delayed brain response at the MSTMF input inevitably results in a delayed filter response at its output. Therefore, it appears beneficial to look for the maxima of these responses. This search is preceded by averaging of the MSTMF responses to the flashes of the respective objects. Since averaging of the target responses causes their enhancement, it helps to find the correct maxima. These operations, referred to as the MDR, have significantly increased the EP classification accuracy.

With the very limited costs of MSTMF operation, we believe that our method can contribute to the construction of more reliable and faster brain-computer interfaces.

APPENDIX A
ON THE NOISE INFLUENCE ON STFs CONSTRUCTION

In the following, we consider the filters defined by (11), related directly to the acknowledged generalized matched filtering (GMF). The GMF allows for the detection of a known signal embedded in a CGN [44].

In the problem studied, the noise component is a complicated mixture produced mostly by the ongoing brain activity, but also by noncortical biologic sources (e.g., eye blinks and movements, or muscle and heart contractions) and environmental sources (e.g., powerline, radio, and electrical interference) [51]. Some of these components, like the sinusoidal powerline interference, are sub-Gaussian (with a distribution flatter than Gaussian [52]), and some are super-Gaussian (sharper, with so-called "heavy" tails, associated with a relatively high probability of large-amplitude peaks). This non-Gaussianity allows their separation using independent component analysis [52].

However, the recorded and processed signals are mixtures of such components, and owing to the central limit theorem, we can make a simplifying assumption of their Gaussianity, unless one of these components has a predominant amplitude. Because of the strong correlation between the EEG channels and between the successive signal samples, we can assume the noise to be colored. The constructed average pattern ā can be expressed as

ā = (1/na) Σ_{i=1}^{na} (s + wi) = s + w̄    (18)

where s is the fixed desired signal and wi the additive noise. To achieve noise suppression, the noise vectors wi should be [53]:
1) of zero mean (E{wki} = 0, where wki is the kth entry of vector wi and E denotes expectation);
2) uncorrelated with the desired signal (E{sn · wki} = 0);
3) uncorrelated among themselves (E{wki · wnj} = 0 for i ≠ j).

It should be noted that for Gaussian signals uncorrelatedness is equivalent to independence, the latter being in general a stronger condition.

Since initially all signals are band-pass filtered, the zero-mean assumption is naturally satisfied. Moreover, since the noise component is not related to the moments at which the EPs are triggered, we can assume the noise vectors to be independent of the desired signal. When parameter δ of set Ψt [defined by (3)] equals 0, matrix A contains a single spatio-temporal vector for each target flash. Thus, for relatively large distances between the flashes, we can also assume the noise vectors wi to be independent of each other (and consequently uncorrelated).

It follows that for δ = 0 we can expect effective suppression of the noise as a result of averaging. However, for δ > 0, matrix A contains many successive signal vectors related to the same target flash: {…, x(k + Jτ − 1), x(k + Jτ), x(k + Jτ + 1), …}; thus, we cannot expect the corresponding wi vectors to be uncorrelated. Nevertheless, their averaging will result in a kind of low-pass filtering, which can additionally decrease the level of noise in the constructed pattern ā.

However, to avoid a strong violation of the assumption concerning the fixed shape of the desired component, which is expressed by (18), the value of δ should not be large. Actually, it should be emphasized that the concept of using δ > 0 can only be applied because of the very low frequency content of the visual EPs (which are additionally smoothed by the applied band-pass filtering).

REFERENCES

[1] L. A. Farwell and E. Donchin, "Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials," Electroencephalogr. Clin. Neurophysiol., vol. 70, no. 6, pp. 510–523, 1988.
[2] S. Sutton, M. Braren, J. Zubin, and E. R. John, "Evoked potential correlates of stimulus uncertainty," Science, vol. 150, pp. 1187–1188, 1965.
[3] Q. T. Obeidat, T. A. Campbell, and J. Kong, "Introducing the edges paradigm: A P300 brain–computer interface for spelling written words," IEEE Trans. Human-Mach. Syst., vol. 45, no. 6, pp. 727–738, Dec. 2015.
[4] S. Noorzadeh, B. Rivet, and C. Jutten, "3-D interface for the P300 speller BCI," IEEE Trans. Human-Mach. Syst., vol. 50, no. 6, pp. 604–612, Dec. 2020.
[5] J. Qu et al., "A novel three-dimensional P300 speller based on stereo visual stimuli," IEEE Trans. Human-Mach. Syst., vol. 48, no. 4, pp. 392–399, Aug. 2018.
[6] P. Stegman, C. S. Crawford, M. Andujar, A. Nijholt, and J. E. Gilbert, "Brain–computer interface software: A review and discussion," IEEE Trans. Human-Mach. Syst., vol. 50, no. 2, pp. 101–115, Apr. 2020.
[7] A. Cruz, G. Pires, A. Lopes, C. Carona, and U. J. Nunes, "A self-paced BCI with a collaborative controller for highly reliable wheelchair driving: Experimental tests with physically disabled individuals," IEEE Trans. Human-Mach. Syst., vol. 51, no. 2, pp. 109–119, Apr. 2021.
[8] A. Nourmohammadi, M. Jafari, and T. O. Zander, "A survey on unmanned aerial vehicle remote control using brain–computer interface," IEEE Trans. Human-Mach. Syst., vol. 48, no. 4, pp. 337–348, Aug. 2018.
[9] A. Rakotomamonjy and V. Guigue, "BCI competition III: Dataset II – ensemble of SVMs for BCI P300 speller," IEEE Trans. Biomed. Eng., vol. 55, no. 3, pp. 1147–1154, Mar. 2008.
[10] M. Thulasidas, C. Guan, and J. Wu, "Robust classification of EEG signal for brain-computer interface," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no. 1, pp. 24–29, Mar. 2006.
[11] S. Kundu and S. Ari, "P300 detection with brain–computer interface application using PCA and ensemble of weighted SVMs," IETE J. Res., vol. 64, no. 3, pp. 406–416, 2018.
[12] Y.-R. Lee and H.-N. Kim, "A data partitioning method for increasing ensemble diversity of an eSVM-based P300 speller," Biomed. Signal Process. Control, vol. 39, pp. 53–63, 2018.
[13] C. Guger et al., "How many people are able to control a P300-based brain-computer interface (BCI)?," Neurosci. Lett., vol. 462, pp. 94–98, 2009.
[14] D. J. Krusienski et al., "Toward enhanced P300 speller performance," J. Neurosci. Methods, vol. 167, no. 1, pp. 15–21, Jan. 2008.
[15] B. Blankertz et al., "Single-trial analysis and classification of ERP components – a tutorial," NeuroImage, vol. 56, no. 2, pp. 814–825, May 2011.
[16] U. Hoffmann, J. Vesin, T. Ebrahimi, and K. Diserens, "An efficient P300-based brain computer interface for disabled subjects," J. Neurosci. Methods, vol. 167, no. 1, pp. 115–125, 2008.
[17] Y. Miao et al., "An ERP-based BCI with peripheral stimuli: Validation with ALS patients," Cogn. Neurodynamics, vol. 14, pp. 21–33, 2020.
[18] Y. Zhang, G. Zhou, J. Jin, Q. Zhao, X. Wang, and A. Cichocki, "Sparse Bayesian classification of EEG for brain–computer interface," IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 11, pp. 2256–2267, Nov. 2016.
[19] Q. Wu, Y. Zhang, J. Liu, J. Sun, A. Cichocki, and F. Gao, "Regularized group sparse discriminant analysis for P300-based brain–computer interface," Int. J. Neural Syst., vol. 29, no. 6, 2019, Art. no. 1950002.
[20] B. Rivet, A. Souloumiac, V. Attina, and G. Gibert, "xDAWN algorithm to enhance evoked potentials: Application to brain–computer interface," IEEE Trans. Biomed. Eng., vol. 56, no. 8, pp. 2035–2043, Aug. 2009.
[21] F. S. Rizi, V. Abootalebi, and M. T. Sadeghi, "Spatial and spatio-temporal filtering based on common spatial patterns and Max-SNR for detection of P300 component," Biocybernetics Biomed. Eng., vol. 37, no. 3, pp. 365–372, 2017.
[22] X. Xiao, M. Xu, J. Jin, Y. Wang, T.-P. Jung, and D. Ming, "Discriminative canonical pattern matching for single-trial classification of ERP components," IEEE Trans. Biomed. Eng., vol. 67, no. 8, pp. 2266–2275, Aug. 2020.
[23] X. Xiao et al., "Enhancement for P300-speller classification using multi-window discriminative canonical pattern matching," J. Neural Eng., vol. 18, no. 4, 2021, Art. no. 046079.
[24] D. J. Krusienski et al., "A comparison of classification techniques for the P300 speller," J. Neural Eng., vol. 3, no. 4, pp. 299–305, 2006.
[25] H. Cecotti and A. Graser, "Convolutional neural networks for P300 detection with application to brain-computer interfaces," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 3, pp. 433–445, Mar. 2011.
[26] L. Vařeka, "Evaluation of convolutional neural networks using a large multi-subject P300 dataset," Biomed. Signal Process. Control, vol. 58, 2020, Art. no. 101837.
[27] O. Y. Kwon, M. H. Lee, C. Guan, and S. W. Lee, "Subject-independent brain–computer interfaces based on deep convolutional neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 10, pp. 3839–3852, Oct. 2020.
[28] D. Coyle, G. Prasad, and T. M. McGinnity, "Extracting features for a brain-computer interface by self-organising fuzzy neural network-based time series prediction," in Proc. 26th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., San Francisco, USA, 2004, pp. 4371–4374.
[29] W. Kong et al., "Weighted extreme learning machine for P300 detection with application to brain computer interface," J. Ambient Intell. Humanized Comput., pp. 1–11, 2018.
[30] Z. Jin et al., "EEG classification using sparse Bayesian extreme learning machine for brain–computer interface," Neural Comput. Appl., vol. 32, pp. 6601–6609, 2020.
[31] V. J. Lawhern, A. J. Solon, N. R. Waytowich, S. M. Gordon, C. P. Hung, and B. J. Lance, "EEGNet: A compact convolutional neural network for EEG-based brain–computer interfaces," J. Neural Eng., vol. 15, no. 5, 2018, Art. no. 056013.
[32] M. Simões et al., "BCIAUT-P300: A multi-session and multi-subject benchmark dataset on autism for P300-based brain-computer-interfaces," Front. Neurosci., vol. 14, 2020, Art. no. 568104.
[33] M. Liu, W. Wu, Z. Gu, Z. Yu, F. Qi, and Y. Li, "Deep learning based on batch normalization for P300 signal detection," Neurocomputing, vol. 275, pp. 288–297, 2018.
[34] F. Li, X. Li, F. Wang, D. Zhang, Y. Xia, and F. He, "A novel P300 classification algorithm based on a principal component analysis-convolutional neural network," Appl. Sci., vol. 10, no. 4, 2020, Art. no. 1546.
[35] M. Kotas, J. Jeżewski, K. Horoba, and A. Matonia, "Application of spatio-temporal filtering to fetal electrocardiogram enhancement," Comput. Methods Progr. Biomed., vol. 104, pp. 1–9, 2010.
[36] J. Giraldo-Guzmán, M. Kotas, F. Castells, S. H. Contreras-Ortiz, and M. Urina-Triana, "Estimation of PQ distance dispersion for atrial fibrillation detection," Comput. Methods Progr. Biomed., vol. 208, 2021, Art. no. 106167.
[37] S. Lemm, B. Blankertz, G. Curio, and K.-R. Müller, "Spatio-spectral filters for improving the classification of single trial EEG," IEEE Trans. Biomed. Eng., vol. 52, no. 9, pp. 1541–1548, Sep. 2005.
[38] G. Dornhege, B. Blankertz, M. Krauledat, F. Losch, G. Curio, and K. Müller, "Combined optimization of spatial and temporal filters for improving brain-computer interfacing," IEEE Trans. Biomed. Eng., vol. 53, no. 11, pp. 2274–2281, Nov. 2006.
[39] F. Qi, Y. Li, and W. Wu, "RSTFC: A novel algorithm for spatio-temporal filtering and classification of single-trial EEG," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 12, pp. 3070–3082, Dec. 2015.
[40] A. Jiang, J. Shang, X. Liu, Y. Tang, H. K. Kwan, and Y. Zhu, "Efficient CSP algorithm with spatio-temporal filtering for motor imagery classification," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 28, no. 4, pp. 1006–1016, Apr. 2020.
[41] F. Qi et al., "Spatiotemporal-filtering-based channel selection for single-trial EEG classification," IEEE Trans. Cybern., vol. 51, no. 2, pp. 558–567, Feb. 2021.
[42] J. Lu, K. Xie, and D. J. McFarland, "Adaptive spatio-temporal filtering for movement related potentials in EEG-based brain–computer interfaces," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 22, no. 4, pp. 847–857, Jul. 2014.
[43] Y. Zhang, G. Zhou, Q. Zhao, J. Jin, X. Wang, and A. Cichocki, "Spatial-temporal discriminant analysis for ERP-based brain-computer interface," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 21, no. 2, pp. 233–243, Mar. 2013.
[44] S. M. Kay, Fundamentals of Statistical Signal Processing: Detection Theory (Signal Processing Series), vol. 2. Hoboken, NJ, USA: Prentice-Hall, 1998.
[45] R. Bro, E. Acar, and T. G. Kolda, "Resolving the sign ambiguity in the singular value decomposition," J. Chemometrics, vol. 22, no. 2, pp. 135–140, 2008.
[46] G. D. Dawson, "A summation technique for detecting small signals in a large irregular background," J. Physiol., vol. 115, no. 1, 1951.
[47] D. Garcia, "Robust smoothing of gridded data in one and higher dimensions with missing values," Comput. Statist. Data Anal., vol. 54, no. 4, pp. 1167–1178, 2010.
[48] M. J. Buckley, "Fast computation of a discretized thin-plate smoothing spline for image data," Biometrika, vol. 81, no. 2, pp. 247–258, 1994.
[49] G. M. Foody, "Thematic map comparison: Evaluating the statistical significance of differences in classification accuracy," Photogrammetric Eng. Remote Sens., vol. 70, no. 5, pp. 627–633, 2004.
[50] A. Agresti, An Introduction to Categorical Data Analysis. Hoboken, NJ, USA: Wiley, 2018.
[51] M. N. Anastasiadou, M. Christodoulakis, E. S. Papathanasiou, S. S. Papacostas, and G. D. Mitsis, "Unsupervised detection and removal of muscle artifacts from scalp EEG recordings using canonical correlation analysis, wavelets and random forests," Clin. Neurophysiol., vol. 128, no. 9, pp. 1755–1769, 2017.
[52] R. Vigario, J. Sarela, V. Jousmiki, M. Hamalainen, and E. Oja, "Independent component approach to the analysis of EEG and MEG recordings," IEEE Trans. Biomed. Eng., vol. 47, no. 5, pp. 589–593, May 2000.
[53] E. J. Berbari, S. M. Collins, and R. Arzbaecher, "Evaluation of esophageal electrodes for recording His-Purkinje activity based upon signal variance," IEEE Trans. Biomed. Eng., vol. BME-33, no. 10, pp. 922–928, Oct. 1986.

Marian P. Kotas received the M.Sc., Ph.D., and D.Sc. degrees in biomedical electronics from the Silesian University of Technology, Gliwice, Poland, in 1991, 1996, and 2012, respectively. He is currently an Associate Professor with the Department of Cybernetics, Nanotechnology and Data Processing, Silesian University of Technology. His current research interests include linear and nonlinear filtering of biomedical signals, multivariate data processing, and pattern recognition.

Michal Piela received the B.S. and M.Sc. degrees in biomedical engineering from the University of Science and Technology, Kraków, Poland, in 2010 and 2011, respectively. He is currently working toward the Ph.D. degree with the Department of Cybernetics, Nanotechnology and Data Processing, Silesian University of Technology, Gliwice, Poland. His current research interests include biomedical signal processing, brain-computer interfaces, and evoked potentials detection.

Sonia H. Contreras-Ortiz received the B.S. and M.Sc. degrees in electronic engineering from Universidad Industrial de Santander, Bucaramanga, Colombia, in 2001 and 2004, respectively, and the Ph.D. degree in biomedical engineering from the University of Connecticut, Storrs, CT, USA, in 2011. She is currently the Director of the Biomedical Engineering program with Universidad Tecnológica de Bolívar, Cartagena de Indias, Colombia. Her research interests include biosignal and medical image processing and analysis.