This project shows how to train a language-recognizer from scratch that is able to distinguish between German and English, robustly.

fraunhofer-iais/language-recognition

Spoken Language Recognition Using Convolutional Neural Networks

Written by Joscha S. Rieber (Fraunhofer IAIS) in 2020

This project shows how to train a language recognizer from scratch that can distinguish between German and English. Together with data from Mozilla Common Voice, the notebooks provide a playground for building strong models. With the data, pre-processing, and model used here, the accuracy is 93.8 %.

Have a Look at my Article on Towards AI

This repository is also described in more detail in my article published by Towards AI.

Getting Started

  • On Linux:
    • Download this repository from GitHub.
    • Run "bash run.sh".
      • This script first checks whether the environment is ready. If it is not, it downloads Miniconda and creates the conda environment. Note that "wget" must be installed for this step to succeed.
    • Then work through the notebooks in the intended order and follow the instructions given in each.
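
Before calling "bash run.sh", it can help to confirm that the prerequisites named above are in place. The following is a minimal sketch, not part of the repository: the function name `missing_prerequisites` is hypothetical, and it assumes only what the steps above state, namely that "run.sh" sits in the repository root and that "bash" and "wget" must be available on the PATH.

```python
import shutil
from pathlib import Path


def missing_prerequisites(repo_dir=".", tools=("bash", "wget")):
    """Return a list of problems; an empty list means run.sh can be started.

    Hypothetical helper: it only checks that the tools are on the PATH and
    that run.sh exists. It does not inspect the conda environment.
    """
    problems = []
    for tool in tools:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    if not (Path(repo_dir) / "run.sh").is_file():
        problems.append(f"run.sh not found in {Path(repo_dir).resolve()}")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    if issues:
        print("\n".join(issues))
    else:
        print("Ready to run: bash run.sh")
```

Run it from the repository root. Any missing tool should be installed with the system package manager before starting the script.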

System Requirements

A fast CPU is recommended for data augmentation and pre-processing. For model training, a capable GPU is necessary. I have tested the scripts with an Nvidia P5000 and an Nvidia Tesla G80. The Mozilla Common Voice dataset is very large, so processing all of the data may take a long time.
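
To see what hardware is available before starting the notebooks, a short check like the one below may be useful. This is a sketch under assumptions rather than something the repository provides: the helper name `describe_hardware` is hypothetical, and it detects GPUs only through the Nvidia driver utility `nvidia-smi`. It does not verify that the deep learning framework used in the notebooks can actually access the GPU.

```python
import os
import shutil
import subprocess


def describe_hardware():
    """Report the CPU core count and the names of visible Nvidia GPUs.

    Hypothetical helper: an empty "gpus" list means nvidia-smi was not found
    or could not be queried, not necessarily that no GPU is installed.
    """
    info = {"cpu_count": os.cpu_count(), "gpus": []}
    nvidia_smi = shutil.which("nvidia-smi")
    if nvidia_smi is None:
        return info
    try:
        result = subprocess.run(
            [nvidia_smi, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # The driver tool exists but failed; report no GPUs instead of crashing.
        return info
    info["gpus"] = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return info


if __name__ == "__main__":
    hardware = describe_hardware()
    print(f"CPU cores: {hardware['cpu_count']}")
    print("GPUs: " + (", ".join(hardware["gpus"]) or "none detected via nvidia-smi"))
```

If no GPU is reported, pre-processing can still run on the CPU, but training is likely to be impractically slow.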
