My work for my Master’s thesis: a new architecture for sleep stage classification that uses pre-trained Vision Transformers together with an inter-epoch attention layer to capture sequential information.
TLDR: We propose a new architecture, “SwinTsle”, consisting of a pretrained Swin Transformer (Swin-T) backbone with an additional transformer layer applied across epochs. To handle class imbalance and improve recall on the REM class, SwinTsle is trained with data augmentation techniques as well as the Dice Loss. With these improvements, SwinTsle achieves a prediction accuracy of 96.03% on the Spindle dataset, beating Spindle (the previous SOTA model developed in the lab, 94.67%) by nearly 1.5 percentage points. On a human dataset, SwinTsle outperforms Spindle’s prediction accuracy by nearly 10%, reaching an accuracy comparable to state-of-the-art models built specifically for human sleep stage classification. SwinTsle demonstrates the potential of Vision Transformers and attention for the sleep stage classification problem.
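A minimal sketch of how the two ideas in the TLDR could fit together, assuming a PyTorch setup with the `timm` library supplying the pretrained Swin-T backbone. The class name `SwinTsle`, the input shape, the hyperparameters, and the `dice_loss` helper below are illustrative assumptions, not the thesis code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import timm  # assumed source of the pretrained Swin-T backbone


class SwinTsle(nn.Module):
    """Sketch: pretrained Swin-T per epoch + one transformer layer across epochs."""

    def __init__(self, num_stages: int = 3, d_model: int = 256, n_heads: int = 8):
        super().__init__()
        # Per-epoch feature extractor: pretrained Swin-T with its classifier
        # head removed (num_classes=0 makes timm return pooled features).
        self.backbone = timm.create_model(
            "swin_tiny_patch4_window7_224", pretrained=True, num_classes=0
        )
        self.proj = nn.Linear(self.backbone.num_features, d_model)
        # Inter-epoch attention: self-attention over the sequence of epoch
        # embeddings captures sequential context between neighbouring epochs.
        self.inter_epoch = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.head = nn.Linear(d_model, num_stages)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, 3, 224, 224) -- one image (e.g. a spectrogram) per epoch.
        b, s = x.shape[:2]
        feats = self.backbone(x.flatten(0, 1))   # (b*s, num_features)
        feats = self.proj(feats).view(b, s, -1)  # (b, s, d_model)
        feats = self.inter_epoch(feats)          # attention across epochs
        return self.head(feats)                  # (b, s, num_stages) logits


def dice_loss(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft multi-class Dice loss, one common way to counter class imbalance
    (e.g. a rare REM class). logits: (N, C), targets: (N,) integer labels."""
    probs = torch.softmax(logits, dim=-1)
    one_hot = F.one_hot(targets, num_classes=probs.shape[-1]).float()
    inter = (probs * one_hot).sum(dim=0)
    denom = probs.sum(dim=0) + one_hot.sum(dim=0)
    return 1.0 - ((2 * inter + eps) / (denom + eps)).mean()
```

In this sketch each epoch is embedded independently by the frozen-or-finetuned backbone, so the inter-epoch transformer layer is the only place sequential information enters the model; the Dice loss would be applied to the flattened `(b*s, num_stages)` logits.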
Long version: