Spatio-temporal modeling of EEG features for understanding working memory
Jinbo Bi
Joint work with Tingyang Xu, Chi-Ming Chen, Jason Johannesen
University of Connecticut
Yale University
1/19
Outline
The main technical idea
EEG data analysis problem
The proposed approach: GEE + regularization
Our algorithm
Preliminary experimental results
Summary
2/19
Main idea

Variables are observed/measured at different
locations and different time points
[Figure: the feature matrix, with rows indexed by locations L1, ..., Ln and columns by time points t1, ..., td along the temporal line]
3/19
Main idea

If we build a linear model using all features, the coefficients in
the model form another matrix, W, and we want W to have sparsity
patterns
[Figure: the feature matrix X and the coefficient matrix W, each with rows L1, ..., Ln and columns t1, ..., td along the temporal line]
4/19
Main idea

The idea is to decompose the W matrix into a sum of
two matrices, U and V, of the same dimensions, and then impose
a different sparsity-inducing regularizer on each.
[Figure: W = U + V, where each matrix has rows L1, ..., Ln and columns t1, ..., td]
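The decomposition W = U + V can be illustrated with a small hypothetical example. The sizes (3 locations by 4 time points), the numeric values, and the choice of a row-sparse U and a column-sparse V are all assumptions made for illustration; the slides only state that the two matrices receive different sparsity-inducing regularizers.

```python
# Hypothetical coefficient matrices: 3 locations (rows) x 4 time points (columns).
# Assumption: U is row-sparse (one location matters at every time point),
# V is column-sparse (one time point matters at every location).
U = [[0.5, -0.2, 0.1, 0.3],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]]
V = [[0.0, 0.0, 0.7, 0.0],
     [0.0, 0.0, -0.4, 0.0],
     [0.0, 0.0, 0.2, 0.0]]


def add(A, B):
    """Elementwise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def nonzero_rows(M):
    """Indices of rows that contain at least one nonzero entry."""
    return [i for i, row in enumerate(M) if any(x != 0 for x in row)]


def nonzero_cols(M):
    """Indices of columns that contain at least one nonzero entry."""
    return [j for j, col in enumerate(zip(*M)) if any(x != 0 for x in col)]


W = add(U, V)
print("rows selected by U:", nonzero_rows(U))
print("columns selected by V:", nonzero_cols(V))
print("W = U + V:", W)
```

In this toy case W itself is neither row-sparse nor column-sparse, yet each of its two components has a simple structure, which is the motivation for regularizing U and V separately.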
5/19
Main idea

For instance, the widely used L1,2 matrix norm computes the
sum of the L2 norms of the individual row vectors of a
matrix, and so encourages row sparsity in the matrix
[Figure: sparsity patterns of U and V, each with rows L1, ..., Ln and columns t1, ..., td]
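The L1,2 norm described above, and the reason it produces row sparsity, can be sketched in a few lines. The function names and the example matrix are hypothetical. The second function is the standard proximal operator of the L1,2 norm (row-wise group soft-thresholding), shown here as a generic building block rather than as the authors' code.

```python
import math


def l12_norm(M):
    """Sum over rows of the L2 norm of each row."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in M)


def prox_l12(M, tau):
    """Proximal operator of tau * L1,2 norm: shrink each row as a group.

    A row whose L2 norm is at most tau is set to zero entirely,
    which is what makes whole rows (whole features) drop out.
    """
    out = []
    for row in M:
        norm = math.sqrt(sum(x * x for x in row))
        if norm <= tau:
            out.append([0.0] * len(row))
        else:
            scale = 1.0 - tau / norm
            out.append([scale * x for x in row])
    return out


# Hypothetical 3 x 2 matrix: one strong row, one weak row, one empty row.
M = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
P = prox_l12(M, 1.0)
print("L1,2 norm of M:", l12_norm(M))
print("after shrinkage:", P)
```

The weak second row is removed as a whole while the strong first row is only scaled down. Applying the same operator to the transpose of a matrix would encourage column sparsity instead.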
6/19
EEG data analysis problem

EEG recording provides a powerful method to
study neural dynamics of human cognition (e.g.,
working memory)
[Figure: EEG recording montage]
[Figure: an illustration of a BCI program]
7/19
EEG data analysis problem

Sternberg tests
[Figure: a sample trial with the stages Baseline, Encoding, Retention, Retrieval]
A sample trial of the Sternberg experiment depicting the stages of
information processing. Time courses are extracted for EEG
analysis based on a memory span of 4 letters.
The outcome is whether the person responded correctly (-1 if incorrect).
8/19
EEG data analysis problem

Our data
Stages: Baseline, Encoding, Retention, Retrieval
Amplitudes of EEG in 5 frequency bands:
delta, theta, alpha, beta, and gamma
Electrode sites: Fz, Cz, Oz
37 schizophrenia patients, 6 healthy controls
Each individual has 90 trials of the Sternberg test
in each of the 3 sessions
9/19
The proposed approach


Our method combines generalized estimating
equations (GEE) and the proposed regularizer
Generalized estimating equations are a set of
methods that extend generalized linear models
by estimating both the expectation and the covariance of
the outcome
The parameters are W and α
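For reference, the standard GEE estimating equation from the general literature can be written as below. It does not appear on the slide, and the notation is an assumption: y_i is the vector of outcomes of subject i, \mu_i(W) its modeled mean, A_i the diagonal matrix of variance functions, R_i(\alpha) the working correlation matrix, and \phi a scale parameter.

```latex
\sum_{i=1}^{N} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(W) \bigr) = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \operatorname{vec}(W)},
\qquad
V_i = \phi \, A_i^{1/2} R_i(\alpha) A_i^{1/2}
```

In this form W enters through the mean and α only through the working correlation, which is how the two sets of parameters model the expectation and the covariance respectively.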
10/19
The proposed approach



The parameters W and α are estimated by
minimizing the so-called deviance function,
Deviance(W,α): up to a constant factor, the difference between the
log-likelihood of a model that fits the actual y exactly and the
log-likelihood under the modeled mean
The deviance function is not explicit for an arbitrary
distribution, but its gradient can be computed for
exponential families
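As a concrete instance of a deviance and its gradient, the sketch below uses the binomial family with a logistic link. It assumes a binary outcome coded 0/1 and independent observations, so the correlation parameter α plays no role here. This is the textbook binomial deviance, not the formulation proposed in the talk, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic link inverse: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def binomial_deviance(y, eta):
    """Deviance for 0/1 outcomes y given linear predictors eta.

    For binary data the saturated model has log-likelihood 0,
    so the deviance is -2 times the log-likelihood under the mean.
    """
    loglik = 0.0
    for yi, ei in zip(y, eta):
        mu = sigmoid(ei)
        loglik += yi * math.log(mu) + (1 - yi) * math.log(1.0 - mu)
    return -2.0 * loglik


def deviance_gradient(y, eta):
    """Gradient of the deviance with respect to each linear predictor."""
    return [-2.0 * (yi - sigmoid(ei)) for yi, ei in zip(y, eta)]


# Hypothetical outcomes for three trials, all with linear predictor 0.
y = [0, 1, 1]
eta = [0.0, 0.0, 0.0]
print("deviance:", binomial_deviance(y, eta))
print("gradient:", deviance_gradient(y, eta))
```

The gradient depends only on the residual y - mu, which is the kind of closed form that makes gradient-based optimization possible for exponential families.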
We propose
11/19
Our FISTA-based algorithm


We solve our problem using FISTA, the fast iterative
shrinkage-thresholding algorithm
We alternate between solving for (U,V) and for α
We use a FISTA algorithm to solve for (U,V); it is
an alternating proximal gradient method that updates
U and V alternately using proximal operators
We use the original GEE updating formula to update
α, because when U and V are fixed the proposed
formulation is exactly the same as the GEE formulation with
W fixed
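The (U,V) step can be sketched as a generic FISTA loop. This is a simplified illustration under several assumptions, not the authors' algorithm: a squared-error loss stands in for the deviance, α is not modeled, U is penalized for row sparsity and V for column sparsity, and both blocks are updated in the same proximal step rather than alternately. All names and the synthetic data are hypothetical.

```python
import math
import random


def combine(A, B, a=1.0, b=1.0):
    """Return a*A + b*B elementwise for two equally sized matrices."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def shrink(vec, tau):
    """Proximal operator of tau * ||vec||_2 (group soft-thresholding)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= tau:
        return [0.0] * len(vec)
    return [(1.0 - tau / norm) * x for x in vec]


def prox_rows(M, tau):
    return [shrink(row, tau) for row in M]


def prox_cols(M, tau):
    cols = [shrink(list(col), tau) for col in zip(*M)]
    return [list(row) for row in zip(*cols)]


def inner(X, W):
    """Matrix inner product <X, W>, the linear predictor for one trial."""
    return sum(x * w for xr, wr in zip(X, W) for x, w in zip(xr, wr))


def gradient(Xs, ys, W):
    """Gradient in W of the loss 1/(2m) * sum_i (<X_i, W> - y_i)^2."""
    m = len(ys)
    G = [[0.0] * len(W[0]) for _ in W]
    for X, y in zip(Xs, ys):
        G = combine(G, X, 1.0, (inner(X, W) - y) / m)
    return G


def objective(Xs, ys, U, V, lam_u, lam_v):
    W = combine(U, V)
    loss = sum((inner(X, W) - y) ** 2 for X, y in zip(Xs, ys)) / (2 * len(ys))
    rows = sum(math.sqrt(sum(x * x for x in r)) for r in U)
    cols = sum(math.sqrt(sum(x * x for x in c)) for c in zip(*V))
    return loss + lam_u * rows + lam_v * cols


def fista(Xs, ys, lam_u, lam_v, iters=300):
    n, d = len(Xs[0]), len(Xs[0][0])
    zeros = [[0.0] * d for _ in range(n)]
    U, V, ZU, ZV, t = zeros, zeros, zeros, zeros, 1.0
    # Step size 1/L, where L upper-bounds the Lipschitz constant of the
    # gradient of the smooth loss with respect to (U, V) jointly.
    L = 2.0 * sum(inner(X, X) for X in Xs) / len(ys)
    for _ in range(iters):
        G = gradient(Xs, ys, combine(ZU, ZV))
        U_new = prox_rows(combine(ZU, G, 1.0, -1.0 / L), lam_u / L)
        V_new = prox_cols(combine(ZV, G, 1.0, -1.0 / L), lam_v / L)
        # FISTA momentum: extrapolate from the last two iterates.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_new
        ZU = combine(U_new, U, 1.0 + beta, -beta)
        ZV = combine(V_new, V, 1.0 + beta, -beta)
        U, V, t = U_new, V_new, t_new
    return U, V


# Synthetic data: 3 locations x 4 time points, 40 trials, one active location.
random.seed(0)
n, d, m = 3, 4, 40
W_true = [[1.0] * d, [0.0] * d, [0.0] * d]
Xs = [[[random.gauss(0, 1) for _ in range(d)] for _ in range(n)] for _ in range(m)]
ys = [inner(X, W_true) for X in Xs]

U, V = fista(Xs, ys, lam_u=0.05, lam_v=0.05)
zeros = [[0.0] * d for _ in range(n)]
start = objective(Xs, ys, zeros, zeros, 0.05, 0.05)
final = objective(Xs, ys, U, V, 0.05, 0.05)
print("objective at zero:", start, "after FISTA:", final)
```

Because the penalty is separable in U and V, the proximal step splits into one row-wise and one column-wise shrinkage. An alternating variant would recompute the gradient after updating U and before updating V.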
12/19
Our FISTA-based algorithm


The algorithm converges globally to an optimal
solution of the problem with a convergence rate of
quadratic order, i.e., O(1/k^2) in the number of iterations k
Under some regularity conditions, optimizing the
proposed formulation yields a consistent and
asymptotically normally distributed estimator b̂
where
13/19
Preliminary experimental results





We preliminarily tested this algorithm on EEG
feature analysis: predicting whether a person answers the
Sternberg test correctly (0) or incorrectly (1) based
on the EEG features
37 schizophrenia patients and 6 healthy controls, with separate
classifiers for the schizophrenia patients and the healthy controls
After data cleaning, each patient has on average 83
trials, and the incorrect answer rate is 27.2%
Each healthy control has on average 87 trials, and
the incorrect answer rate is 14.7%
Multiple rounds of three-fold cross-validation are used to tune
the λ parameters
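Tuning λ by repeated three-fold cross-validation can be sketched as follows. The helper names, the number of repeats, and the `fit_and_score` callback are hypothetical. The slides do not say whether folds are formed over trials or over subjects, so the sketch splits a generic list of item indices.

```python
import random


def three_fold_splits(n_items, seed):
    """Yield (train, test) index lists for one shuffled three-fold split."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::3] for k in range(3)]
    for k in range(3):
        test = folds[k]
        train = [i for j in range(3) if j != k for i in folds[j]]
        yield train, test


def tune_lambda(n_items, lambdas, fit_and_score, repeats=5):
    """Pick the lambda with the best mean score over repeated 3-fold CV.

    fit_and_score(lam, train, test) is a user-supplied (hypothetical)
    callback that fits on the train indices and returns a score such as
    the AUC on the test indices; higher is assumed to be better.
    """
    mean_score = {}
    for lam in lambdas:
        scores = [fit_and_score(lam, train, test)
                  for seed in range(repeats)
                  for train, test in three_fold_splits(n_items, seed)]
        mean_score[lam] = sum(scores) / len(scores)
    best = max(mean_score, key=mean_score.get)
    return best, mean_score


# Toy scorer standing in for a real model: it simply prefers lambda = 0.1.
def toy_score(lam, train, test):
    return -abs(lam - 0.1)


best_lam, table = tune_lambda(83, [0.01, 0.1, 1.0], toy_score)
print("selected lambda:", best_lam)
```

With two regularization parameters, the same loop would run over pairs of candidate values instead of a single list.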
14/19
Preliminary experimental results



We first compared our method with the classic GEE method
We report the area under the ROC curve (AUC)
Our method consistently outperformed GEE under four
different covariance assumptions
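The AUC reported above has a simple rank interpretation: it is the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. A minimal sketch, with a hypothetical function name and toy data, assuming labels coded 1 for the positive class and 0 for the negative class:

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 2 negatives and 2 positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("AUC:", auc(labels, scores))  # 3 of 4 pairs are ordered correctly: 0.75
```

Since the incorrect-answer rates in this data are well below 50%, a threshold-free measure such as the AUC is less sensitive to class imbalance than plain accuracy.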
15/19
Preliminary experimental results

We demonstrate the selected features and stages in
the classifiers
Schizophrenia: Rows are features; columns are stages of information
processing
16/19
Preliminary experimental results

We demonstrate the selected features and stages in
the classifiers
Healthy controls: Rows are features; columns are stages of
information processing
17/19
Summary





We used a new learning formulation to select EEG
features along the temporal and spatial
dimensions
This new method also simultaneously models the
sample correlation via the GEE
A new accelerated gradient descent algorithm can
efficiently solve the related optimization problem
Preliminary results show that the EEG features
selected for schizophrenia patients (SZ) and healthy controls (HC) are rather different
Future work …
18/19
References
Chen et al., GABA level, gamma oscillation, and working
memory performance in schizophrenia, NeuroImage: Clinical,
4:531-539, 2014.
Beck et al., A fast iterative shrinkage-thresholding algorithm
for linear inverse problems, SIAM Journal on Imaging Sciences,
2(1):183-202, 2009.
Xu et al., Longitudinal LASSO: jointly learning features and
temporal contingency for outcome prediction, to appear in
ACM International Conference on Knowledge Discovery and
Data Mining (SIGKDD), 2015.

http://www.labhealthinfo.uconn.edu/
Thank you!!
19/19