AsoSoft/AsoSoft-TTS-Speech-Corpus-for-Central-Kurdish

AsoSoft Speech Corpus for Central-Kurdish Text-To-Speech

Speech data and associated resources such as tags are among the most essential language resources for Kurdish NLP research and for applications such as speech synthesis and automatic speech recognition. This project created and collected Central Kurdish speech data for use in both of these tasks. The full corpus contains 21 hours of recorded and transcribed speech: 10,979 sentences read by one female speaker, over 30 years of age and holding a bachelor's degree. The sentences were recorded in a studio over a period of four months.

A subset of the AsoSoft speech corpus for TTS is available for download for research and non-commercial use. It can be used for Central Kurdish speech synthesis and other spoken language processing applications. The currently available dataset consists of 522 "text, audio" pairs totaling about 01:02 (one hour and two minutes).

Files:
• .wav: wave file recorded at 22.05 kHz, 16-bit, mono
• .txt: transcription in Kurdish
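As a rough illustration of how the "text, audio" pairs might be read, here is a minimal Python sketch that uses only the standard library. It rests on assumptions this README does not confirm: that each .wav file has a .txt file with the same base name in the same folder, and that the transcriptions are UTF-8 encoded. The function name `load_pairs` and its `corpus_dir` argument are hypothetical.

```python
import wave
from pathlib import Path


def load_pairs(corpus_dir):
    """Return a list of (wav_path, transcription, duration_seconds) tuples.

    Assumption: every .wav has a .txt with the same base name next to it.
    """
    pairs = []
    for wav_path in sorted(Path(corpus_dir).glob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if not txt_path.exists():
            continue  # skip audio that has no transcription file

        with wave.open(str(wav_path), "rb") as wav:
            # Format stated in this README: 22.05 kHz, 16-bit, mono
            fmt = (wav.getframerate(), wav.getsampwidth(), wav.getnchannels())
            if fmt != (22050, 2, 1):
                raise ValueError(f"unexpected audio format in {wav_path.name}: {fmt}")
            duration = wav.getnframes() / wav.getframerate()

        # Assumption: UTF-8 text; "utf-8-sig" also tolerates a leading BOM
        text = txt_path.read_text(encoding="utf-8-sig").strip()
        pairs.append((wav_path, text, duration))
    return pairs
```

Summing the third element of each tuple gives the total duration in seconds, which can be compared with the roughly one hour and two minutes stated above as a quick sanity check on a download.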
