Welcome to the repository of Delta, an efficient and effective data enrichment framework designed for on-device continual learning (CL).
The repository is currently being updated as we re-organize the code for improved clarity.
- Repository Structure & Description
- Requirements
- Datasets & Preparation
- Run Commands
- Acknowledgments and Note
## Repository Structure & Description

```
root/Experiments/MobiCom24_Delta    # Root path of the repository
├── Agents                          # Files for different continual learning algorithms
│   ├── base.py                     # Abstract class for algorithms
│   ├── fskd.py                     # Few-shot CL with knowledge distillation
│   ├── fspr.py                     # Few-shot CL with parameter freezing
│   ├── fsro.py                     # Few-shot CL with robust optimization
│   ├── fed_cl.py                   # Federated CL algorithm
│   ├── test.py                     # CL with data enrichment algorithms (vanilla, random, and delta)
│   └── delta_class.py              # Implementation of delta operations (device-side soft-matching and cloud-side sampling)
├── Buffer                          # Files related to buffer management
│   ├── buffer.py
│   ├── name_match.py               # Name-to-function mappings
│   ├── random_retrieve.py          # Random retrieval methods
│   └── reservoir_update.py         # Random update methods
├── Data                            # Files for creating the data stream objects of different datasets
│   ├── RawData                     # Raw data files and corresponding preprocessing scripts for each task
│   │   ├── cifar-10-C              # CIFAR-10-C dataset
│   │   ├── har                     # HHAR, UCI, Motion, Shoaib datasets
│   │   │   └── 1.preprocess.py     # Preprocessing script
│   │   └── textclassification      # XGLUE dataset
│   │       └── 1.preprocess.py     # Preprocessing script
│   ├── CloudData                   # Cloud-side data for each task, including public raw data and the processed directory dataset
│   │   ├── cifar-10-C
│   │   ├── har
│   │   └── textclassification
│   ├── cloud.py                    # Cloud-side operations to generate the directory dataset
│   ├── continumm.py                # Data stream object creation
│   ├── name_match.py               # Name-to-function mappings
│   └── utils.py
├── Experiment                      # Files for running specified agents (algorithms) multiple times
│   └── run.py
├── Figures                         # Directory for saving figures of experimental results
├── Log                             # Directory for saving final and intermediate results during experiments
├── Models                          # Backbone models and the corresponding cloud-side pre-training process
│   ├── Pretrain                    # Directory for saving pre-trained model weights
│   ├── HAR_model.py                # DCNN model for the HAR task (requires pretraining)
│   ├── resnet.py                   # ResNet model for image tasks (uses pretrained weights from PyTorch)
│   ├── speechmodel.py              # VGG model for audio tasks (uses pretrained weights from PyTorch)
│   └── pretrain.py                 # Methods for loading models with pretrained weights
├── 2-cloud_pretrain.py             # Cloud-side model pretraining
├── 3-cloud_preprocess.py           # Cloud-side data processing
├── 4-main.py                       # On-device continual learning with specified configurations
└── 5-plot.py                       # Script for plotting the experimental results
```
## Requirements

Ensure you have the following dependencies installed:
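The pinned dependency list is not reproduced here. As a minimal sketch, assuming a Python 3 environment built around PyTorch (the image and audio models load pretrained weights through PyTorch, and the text task uses Transformer models), installation might look like:

```bash
# Hypothetical environment setup -- the package set and versions are assumptions,
# not the repository's pinned requirements.
pip install torch torchvision transformers numpy matplotlib
```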
## Datasets & Preparation

All raw data files can be downloaded from Google Drive. Alternatively, you can download each dataset from the following sources:
- Image Classification: CIFAR-10-C
- Human Activity Recognition: HHAR, UCI, Motion, Shoaib
- Text Classification: Microsoft XGLUE
- Audio Recognition: Google Speech Commands
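After downloading, each raw dataset goes under its task directory in `Data/RawData/` (directory names taken from the repository tree above). A minimal sketch, assuming you work from the repository root:

```bash
# Sketch only: create the per-task raw-data directories from the repository tree,
# then move the downloaded files into them (download paths are not shown here).
mkdir -p Data/RawData/cifar-10-C Data/RawData/har Data/RawData/textclassification
```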
## Run Commands

- Preprocess Raw Data: Run `1.preprocess.py` for each dataset directory in `Data/RawData/`.
- Cloud-side Pretraining: Execute `2-cloud_pretrain.py` to pretrain models on the cloud server. (Note that ResNet and Transformer models can directly load pre-trained weights provided by PyTorch.)
- Cloud-side Data Processing: Run `3-cloud_preprocess.py` to process public data on the cloud server and generate directory weights/cluster centers.
- On-Device Continual Learning: Run `4-main.py` for each task with the specific commands and configurations provided in `Scripts/run_main.sh`; results are saved in `Log/`.
- Plot Experimental Results: Run `5-plot.py` to output and visualize the experimental results; figures are saved in `Figures/`.
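As a minimal end-to-end sketch of these steps, assuming each script is invoked from the repository root and that `Scripts/run_main.sh` wraps `4-main.py` (the authoritative per-task flags live in that script, not here):

```bash
# Pipeline sketch; invocation details are assumptions -- see Scripts/run_main.sh
# for the exact per-task commands and configurations.

# 1. Preprocess each raw dataset (script locations follow the repository tree).
python Data/RawData/har/1.preprocess.py
python Data/RawData/textclassification/1.preprocess.py

# 2. Pretrain backbone models on the cloud server (ResNet/Transformer backbones
#    can instead load pretrained weights directly from PyTorch).
python 2-cloud_pretrain.py

# 3. Process public cloud-side data into directory weights / cluster centers.
python 3-cloud_preprocess.py

# 4. On-device continual learning; intermediate and final results go to Log/.
bash Scripts/run_main.sh

# 5. Visualize results; figures are saved to Figures/.
python 5-plot.py
```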
## Acknowledgments and Note

Our code is built upon the repositories of online-continual-learning and Miro. We extend our sincere gratitude to the authors for their foundational work.
If you have any problems, please feel free to contact us.