Speech data in the Kurdish language, together with associated resources such as tags, is among the most essential language resources for NLP research and for applications such as speech synthesis and automatic speech recognition. As part of this project, speech data for Central Kurdish was collected for use in speech synthesis and automatic speech recognition. The full corpus comprises 21 hours of recorded and transcribed speech: 10,979 sentences read by one female speaker, over 30 years of age and holding a bachelor's degree. The sentences were recorded in a studio over four months.

A subset of the AsoSoft speech corpus for TTS is available for download for research and non-commercial use. This subset can be used for Central Kurdish speech synthesis and other spoken language processing applications. The currently available dataset consists of 522 "text, audio" pairs, about 01:02 (one hour and two minutes) of speech.

Files:
• .wav: wave file recorded at 22.05 kHz, 16-bit, mono
• .txt: transcription in Kurdish
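The file description above suggests a simple way to load the subset. The following is a minimal sketch, not part of the corpus tooling. It assumes that each `.wav` file sits next to a `.txt` file with the same base name in one flat directory and that transcriptions are UTF-8 encoded; neither is stated in this README. The names `load_pairs` and `total_duration` are hypothetical helpers introduced here for illustration.

```python
import wave
from pathlib import Path

# Audio format stated in the file description.
EXPECTED_RATE = 22050     # 22.05 kHz
EXPECTED_WIDTH = 2        # 16-bit samples = 2 bytes per sample
EXPECTED_CHANNELS = 1     # mono


def load_pairs(corpus_dir):
    """Return a list of (transcript, wav_path, duration_seconds) tuples.

    Assumption: every X.wav has its transcription in X.txt in the same
    directory, encoded as UTF-8.
    """
    pairs = []
    for wav_path in sorted(Path(corpus_dir).glob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if not txt_path.exists():
            continue  # skip audio that has no transcription file
        with wave.open(str(wav_path), "rb") as wf:
            fmt = (wf.getframerate(), wf.getsampwidth(), wf.getnchannels())
            if fmt != (EXPECTED_RATE, EXPECTED_WIDTH, EXPECTED_CHANNELS):
                raise ValueError(f"{wav_path.name}: unexpected format {fmt}")
            duration = wf.getnframes() / wf.getframerate()
        text = txt_path.read_text(encoding="utf-8").strip()
        pairs.append((text, wav_path, duration))
    return pairs


def total_duration(pairs):
    """Sum the durations (in seconds) of all loaded pairs."""
    return sum(duration for _, _, duration in pairs)
```

If the directory layout matches these assumptions, `load_pairs` should return 522 entries for the published subset, and `total_duration` should come to roughly 62 minutes.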
AsoSoft Speech Corpus for Central-Kurdish Text-To-Speech
AsoSoft/AsoSoft-TTS-Speech-Corpus-for-Central-Kurdish