This repository accompanies the paper "FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning", INTERSPEECH 2022.
- Download the model checkpoint to perform knowledge distillation (e.g. HuBERT Base):
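A minimal download sketch, assuming the HuBERT Base teacher checkpoint from the public fairseq release (the URL below is an assumption; verify it against the fairseq HuBERT page):

```bash
# Fetch the HuBERT Base teacher checkpoint (URL assumed from the fairseq release)
wget -P ./checkpoints/ https://dl.fbaipublicfiles.com/hubert/hubert_base_ls960.pt
```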
- Download the LibriSpeech dataset.
- Modify the configuration file in `/data/conf/`. The configuration file `fithubert.yaml` contains all the settings for reproducing FitHuBERT. Set the path to the teacher model checkpoint at `teacher_model`, and the root path to the LibriSpeech dataset at `libri_root`.
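A minimal excerpt showing the two entries to edit in `fithubert.yaml` (the paths below are illustrative placeholders, not shipped defaults):

```yaml
# data/conf/fithubert.yaml (excerpt) -- paths are placeholders
teacher_model: ./checkpoints/hubert_base_ls960.pt  # teacher checkpoint used for distillation
libri_root: /path/to/LibriSpeech                   # root directory of the LibriSpeech corpus
```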
Then, run the following command:
```bash
python train.py --config ./data/conf/fithubert.yaml
```
After training, the model checkpoints and the corresponding configuration file will be created at `/results/pretrain/`.
- Download and install the S3PRL toolkit.
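A typical editable install, assuming S3PRL is cloned next to this repository (standard S3PRL setup; adjust to your environment):

```bash
# Clone S3PRL and install it in editable mode
git clone https://github.com/s3prl/s3prl.git
cd s3prl
pip install -e .
```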
- Copy the `fithubert` folder into `s3prl/upstream/`, for example as sketched below.
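A copy command sketch, assuming the `fithubert` folder sits at this repository's root and the S3PRL checkout shares the same parent directory (adjust the destination to wherever your S3PRL installation keeps its `upstream/` directory):

```bash
# Register FitHuBERT as an S3PRL upstream by copying the folder into upstream/
cp -r ./fithubert ../s3prl/upstream/
```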
- Run the following command to use the FitHuBERT model for automatic speech recognition (ASR):

```bash
python run_downstream.py -m train -n FitHuBERT-ASR -u fithubert -d asr -s last_hidden_state -k <path to .ckpt file> -g <path to .yaml file>
```
Refer to the SUPERB docs for more information on usage details and data preparation.
For our checkpoints, check the links below!
- FitHuBERT-100h
- FitHuBERT-960h
- FitW2V2-960h
To cite our paper:
```bibtex
@article{lee2022fithubert,
  title={FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning},
  author={Lee, Yeonghyeon and Jang, Kangwook and Goo, Jahyun and Jung, Youngmoon and Kim, Hoirin},
  journal={arXiv preprint arXiv:2207.00555},
  year={2022}
}
```