Workplan

Plan

Stage 1 (MNIST)

  • Create a simple 3-layer CNN and train it on the MNIST dataset
  • Extend it to the full teacher network and train it
  • Create student network and implement KD loss
  • Train student using KD on MNIST
  • Experiment with temperature effect on model predictions
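
The KD loss and the temperature experiment in Stage 1 can be illustrated without committing to a framework. The sketch below assumes the common formulation in which the loss mixes a temperature-softened KL term (scaled by T^2) with the hard-label cross-entropy, weighted by `alpha`. The function names, the default values, and the choice of which term `alpha` multiplies are assumptions for illustration, not decisions made in this plan.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log of the temperature-scaled softmax, computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature flattens the output."""
    return [math.exp(lp) for lp in log_softmax(logits, temperature)]


def kd_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.5):
    """Assumed KD loss: alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE."""
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # Soft-target term: KL divergence between the softened distributions.
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_p_teacher, log_p_student))
    # Hard-target term: cross-entropy against the true label at temperature 1.
    hard = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]  # illustrative values only
    for t in (1.0, 4.0, 10.0):
        print(t, [round(p, 3) for p in softmax(logits, t)])
```

Running the script prints the class probabilities for the same logits at three temperatures, which is a quick way to see how larger temperatures spread probability mass across the non-maximal classes.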

Stage 2 (CIFAR-10, Baseline KD -- BLKD)

  • Implement a ResNet-18 teacher network
  • Train it on the CIFAR-10 dataset
  • Create a smaller student network
  • Train the student using KD from the teacher, in the same way as for MNIST
  • Gather results on how temperature and alpha each affect test accuracy
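
Gathering results over temperature and alpha amounts to a grid sweep. The sketch below is one possible way to organise it. `train_and_eval` is a hypothetical callable (not defined in this plan) that is assumed to train a fresh student with the given settings and return its test accuracy. The helper names are likewise illustrative.

```python
import itertools


def sweep(train_and_eval, temperatures, alphas):
    """Run train_and_eval once per (temperature, alpha) pair and collect accuracies.

    train_and_eval is a hypothetical callable supplied by the caller; it is
    assumed to train a fresh student and return its test accuracy as a float.
    """
    results = {}
    for temperature, alpha in itertools.product(temperatures, alphas):
        results[(temperature, alpha)] = train_and_eval(temperature, alpha)
    return results


def best_setting(results):
    """Return the (temperature, alpha) pair with the highest recorded accuracy."""
    return max(results, key=results.get)
```

Keeping the results in a dictionary keyed by `(temperature, alpha)` makes it straightforward to tabulate or plot accuracy against either hyperparameter afterwards.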

Stage 3 (Teacher-Assistant KD -- TAKD)

  • Implement Assistant network (1-step assistant)
  • Implement training loop for Teacher->Assistant->Student distillation
  • Test accuracy of TAKD (teacher-assistant KD) against BLKD (baseline KD)
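
The Teacher->Assistant->Student loop can be viewed as repeated pairwise distillation, where each trained model becomes the teacher for the next, smaller one. The sketch below assumes a hypothetical `distill(teacher, student)` callable that trains `student` with KD from `teacher` and returns the trained student. Neither that callable nor `distill_chain` is defined in this plan.

```python
def distill_chain(teacher, smaller_models, distill):
    """Distill down a chain: teacher -> assistant(s) -> student.

    distill(teacher, student) is a hypothetical callable that trains
    `student` with KD from `teacher` and returns the trained student.
    """
    current_teacher = teacher
    trained = []
    for model in smaller_models:
        # The model trained in this step teaches the next, smaller model.
        current_teacher = distill(current_teacher, model)
        trained.append(current_teacher)
    return trained
```

Under this framing, BLKD is the special case `distill_chain(teacher, [student], distill)` and the 1-step TAKD setup is `distill_chain(teacher, [assistant, student], distill)`, so both can share one training loop when their accuracies are compared.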