Ph.D. Public Defense
Spatio-Temporal Deep Learning for Mental State Recognition in Brain-Computer Interfaces
Shadi Sartipi
Supervised by Mujdat Cetin
Friday, August 16, 2024
2 p.m.–3 p.m.
601 Computer Studies Building
Brain-computer interfaces (BCIs) link brain activity with external devices, aiding mental (including cognitive and emotional) state detection, and helping patients with limited muscle movement. Physiological signals from wearable devices include EEG, ECG, EMG, blood pressure, and galvanic skin response. Multichannel EEG records cortical neural activity, preserving spectral and rhythmic characteristics, and offers higher temporal resolution compared to other non-invasive methods. This makes EEG practical for cognitive and affective tasks, despite challenges like low signal-to-noise ratio (SNR) and poor spatial resolution. Current EEG-based BCIs face obstacles, including reliance on manually derived features, susceptibility to artifacts, and extensive data requirements for each new subject, making calibration time-consuming and user-unfriendly.
In this thesis, we tackle several mental state recognition problems involving emotions, motor imagery, and sleep, and we develop spatio-temporal deep learning architectures to solve them. First, we design a hybrid architecture consisting of spatio-temporal encoding and recurrent attention network blocks for emotion recognition. As a preprocessing step, graph signal processing tools are applied to the raw data to perform graph smoothing in the spatial domain. We demonstrate that our proposed architecture not only exceeds state-of-the-art results but also has transferable model parameters for emotion classification. Next, focusing on performance degradation in subject-independent schemes, we propose two domain adaptation approaches with a Transformer-based feature generator, namely, adversarial discriminative domain adaptation (ADDA) and multi-source domain adaptation (MSDA). This feature generator retains convolutional layers to capture shallow spatial, temporal, and spectral representations of the EEG data, while self-attention mechanisms extract global dependencies within these features. We demonstrate that ADDA and MSDA improve the model’s performance by minimizing the discrepancy between the target domain and all source domains. To address model generalization and robustness under perturbations, we introduce a two-sided perturbation scheme that reinforces the BCI model’s resilience against adversarial attacks.
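As an illustration of the spatial graph smoothing step mentioned above, the sketch below applies a Laplacian-regularized (Tikhonov) low-pass filter across EEG channels. This is only one common graph signal processing choice and is not necessarily the filter used in the thesis. The function names, the adjacency matrix, and the parameter `alpha` are hypothetical and chosen for illustration.

```python
import numpy as np

def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected channel graph."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def smooth_channels(eeg, adjacency, alpha=1.0):
    """Smooth EEG across channels (assumed filter, not the thesis method).

    eeg:       array of shape (channels, samples)
    adjacency: symmetric (channels, channels) matrix of edge weights
    alpha:     smoothing strength; larger values smooth more

    Solves (I + alpha * L) Y = X, which minimizes
    ||Y - X||^2 + alpha * trace(Y^T L Y) at every time sample.
    """
    lap = graph_laplacian(adjacency)
    n_channels = lap.shape[0]
    return np.linalg.solve(np.eye(n_channels) + alpha * lap, eeg)

# Toy example: three electrodes connected in a line (0 - 1 - 2).
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
eeg = np.array([[1.0, 2.0],
                [0.0, 0.0],
                [1.0, 2.0]])
smoothed = smooth_channels(eeg, adjacency, alpha=0.5)
```

With this filter, a signal that is identical on all channels passes through unchanged, while differences between neighboring channels are attenuated.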
We then tackle labeled data insufficiency alongside the heterogeneity across subjects by proposing a semi-supervised deep architecture consisting of two parts: an unsupervised and a supervised element. First, the unsupervised part of the model, known as the columnar spatiotemporal auto-encoder (CST-AE), extracts latent features. Second, the supervised part learns a classifier using the latent features acquired in the unsupervised part. Additionally, we employ center loss in the supervised part to minimize the distance, in the embedding space, between each point in a class and the center of that class. Our results demonstrate that, with limited labeled samples, the proposed architecture can reach almost the same performance as when all labels of the training set are available.
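The center loss term described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the centers are taken as per-class means of the current embeddings, and the loss is averaged over samples. The thesis may update centers differently (for example, with a running update during training) and may weight the term against the classification loss; the function names here are hypothetical.

```python
import numpy as np

def class_centers(embeddings, labels, num_classes):
    """Per-class mean of the embeddings (assumed way of obtaining centers)."""
    centers = np.zeros((num_classes, embeddings.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            centers[c] = embeddings[mask].mean(axis=0)
    return centers

def center_loss(embeddings, labels, centers):
    """Half the mean squared distance from each embedding to its class center."""
    diffs = embeddings - centers[labels]
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

# Toy example: four 2-D latent features from two classes.
embeddings = np.array([[0.0, 0.0],
                       [2.0, 0.0],
                       [1.0, 1.0],
                       [1.0, 3.0]])
labels = np.array([0, 0, 1, 1])
centers = class_centers(embeddings, labels, num_classes=2)
loss = center_loss(embeddings, labels, centers)
```

Minimizing this quantity pulls samples of the same class toward a common point in the latent space, which complements a classification loss that only separates classes.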
Finally, we turn to the sleep state analysis problem and propose novel deep learning approaches, namely, LG-Sleep and multi-signal deep domain adaptation, to demonstrate the adaptability of our methods to other time-series classification tasks. We illustrate how integrating multiple modalities with deep architectures is effective in mouse sleep scoring.