Code for the paper "Significance of neural phonotactic models for large-scale spoken language identification" (IJCNN 2017)
This package lets you perform language identification over spoken utterances. The following steps explain how to set up the package, train the model, and then test it on unseen utterances. Please read the paper linked above for more details.
The setup requires installation of RNNLM and SRILM. Please download and install them before proceeding to install this package.
- RNNLM: untar src/rnnlm-0.4b.tgz and run make inside the extracted folder to build the RNNLM package. Update the RNNLM path in src/train_phonotactics.sh and src/test.sh.
- SRILM: Follow the instructions at the given link to install.
Finally, you must have ngram-count, ngram and rnnlm in your PATH.
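A quick way to confirm that the three tools are reachable is to look them up on PATH before training. The snippet below is a minimal sketch and not part of this package: the helper name missing_tools is hypothetical, and it assumes Python 3.3 or newer for shutil.which.

```python
import shutil

# Executables that the setup step says must be on PATH.
REQUIRED_TOOLS = ("ngram-count", "ngram", "rnnlm")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; it is not shipped with this package.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If any tool is reported as missing, add the directory containing its binary to PATH and run the check again.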
Next, set up the Python environment with the following commands.
cd lid-convex-comb
virtualenv venv
. venv/bin/activate
pip install -r src/requirements.txt
cd src
python train.py train.txt
train.py requires train.txt, which has the following format:
audio_file_path_1 <tab> language_tag1
audio_file_path_2 <tab> language_tag1
audio_file_path_3 <tab> language_tag1
audio_file_path_4 <tab> language_tag2
audio_file_path_5 <tab> language_tag2
audio_file_path_6 <tab> language_tag2
.
.
.
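Because a malformed line in the list file may only surface as an error partway through training, it can help to validate the file first. The following is a minimal sketch under stated assumptions: it treats <tab> as a single tab character separating exactly two fields, skips blank lines, and uses the hypothetical helper names read_manifest and utterances_per_language, which are not part of this package.

```python
from collections import Counter


def read_manifest(path):
    """Read a list file into (audio_path, language_tag) pairs.

    Assumes one utterance per line, with the two fields separated by a
    single tab. Blank lines are skipped. Hypothetical helper.
    """
    pairs = []
    with open(path, encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # ignore blank lines
            fields = line.split("\t")
            if len(fields) != 2:
                raise ValueError(
                    "%s:%d: expected 2 tab-separated fields, got %d"
                    % (path, line_no, len(fields))
                )
            pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


def utterances_per_language(pairs):
    """Count how many utterances are listed for each language tag."""
    return Counter(language for _, language in pairs)
```

For example, utterances_per_language(read_manifest("train.txt")) gives a per-language utterance count, which makes a mistyped language tag easy to spot.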
Create another file named test.txt with a similar format. After training finishes, make sure the data/langs directory contains a separate directory for each language you trained on. Then run the following commands to test on unseen data.
cd src
python test.py test.txt
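The data/langs check mentioned before the test commands can also be scripted. This is a minimal sketch, not the package's own logic: it assumes each language directory is named exactly after its language tag, which this README does not state, and the helper name languages_without_dir is hypothetical. The location of data/langs relative to the working directory is also an assumption, so it is passed in as a parameter.

```python
import os


def languages_without_dir(languages, langs_dir="data/langs"):
    """Return the language tags that have no directory under `langs_dir`.

    Assumes directories are named after the language tags (an assumption,
    not confirmed by the package). Hypothetical helper.
    """
    return sorted(
        language
        for language in set(languages)
        if not os.path.isdir(os.path.join(langs_dir, language))
    )
```

An empty result means every language tag has a matching directory; any tag that is returned points to a language whose training output is missing or named differently.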
Please contact me for any issues at brijmohanlal.s [at] research [dot] iiit [dot] ac [dot] in.