This repository contains the notebooks used to generate the pre-trained classifier of the YASA sleep staging module. The main GitHub repository of YASA can be found here.
For more details on the algorithm and its validation, please refer to the preprint article.
The datasets can be found on sleepdata.org. You need to request data access to download the datasets. Specifically, training of the sleep staging classifier is done using the following datasets: CCSHS, CFS, CHAT, HomePAP, MESA, MrOS, SHHS.
If you have questions, please contact Dr. Raphael Vallat (raphaelvallat@berkeley.edu).
To reproduce all the results in the preprint article, you need to run the following scripts/notebooks in order:
00_randomize_train_test.ipynb: randomly assign NSRR nights to the training or testing set. This assumes that you have already downloaded the NSRR data to your computer (up to 4 TB).
feature_extraction/01_features_*.ipynb: calculate the features for all the training nights.
02_create_classifiers.ipynb: train and export the sleep staging classifier.
predict/03_predict_*.ipynb: apply the algorithm to the testing nights.
04_validation_*.ipynb: evaluate performance on the testing sets (testing set 1 = NSRR, testing set 2 = DREEM).
05_SHAP_importance.py: calculate the SHAP feature importance on the NSRR training set.
06_nsrr_demographics.ipynb: compare the demographics and health data of the NSRR training and testing sets.
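As a rough illustration of the first step, the sketch below shows one way to assign nights to a training and a testing set reproducibly. It is not the code from 00_randomize_train_test.ipynb: the function name `split_nights`, the night identifiers, the 20% testing fraction and the seed are all assumptions made for this example.

```python
import random


def split_nights(night_ids, test_fraction=0.2, seed=42):
    """Randomly assign night identifiers to a training and a testing set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(night_ids)
    # A dedicated, seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return {"training": shuffled[n_test:], "testing": shuffled[:n_test]}


# Hypothetical identifiers; real NSRR file names differ per dataset.
nights = [f"night-{i:03d}" for i in range(10)]
split = split_nights(nights, test_fraction=0.2, seed=42)
print(len(split["training"]), "training nights,", len(split["testing"]), "testing nights")
```

Fixing the seed means that rerunning the split yields the same assignment, which matters when later steps (feature extraction, prediction, validation) rely on the same partition.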
In addition, the scripts in the gridsearch folder perform parameter searches with cross-validation to find the best hyper-parameters, class weights and temporal smoothing windows.
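For readers unfamiliar with the technique, here is a minimal, standard-library sketch of a grid search with k-fold cross-validation. It does not reproduce the scripts in the gridsearch folder: the parameter names (`learning_rate`, `smoothing_window_min`), the grid values and the toy scoring function are hypothetical placeholders for "fit on the training folds, score on the held-out fold".

```python
import itertools
import statistics


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def grid_search_cv(param_grid, n_samples, k, score_fn):
    """Return (best_params, best_mean_score) over every combination in param_grid."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        # Average the score over the k held-out folds.
        scores = [score_fn(params, tr, te) for tr, te in k_fold_indices(n_samples, k)]
        mean_score = statistics.mean(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Hypothetical grid, for illustration only.
param_grid = {"learning_rate": [0.05, 0.1, 0.3], "smoothing_window_min": [2.5, 7.5, 15]}


def toy_score(params, train_idx, test_idx):
    # Stand-in for a real fit/evaluate step; highest at (0.1, 7.5).
    return -abs(params["learning_rate"] - 0.1) - abs(params["smoothing_window_min"] - 7.5) / 100


best_params, best_score = grid_search_cv(param_grid, n_samples=20, k=5, score_fn=toy_score)
print(best_params, best_score)
```

With sleep recordings, folds would normally be formed so that all data from one participant stay in the same fold; the contiguous folds used here are a simplification.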