diff --git a/.gitignore b/.gitignore
index cdfc9e2..6fb49d2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -114,7 +114,7 @@ montreal-forced-aligner/
 raw_data/
 output/
 *.npy
-*.wav
+preprocessed_data**/*.wav
 TextGrid/
 hifigan/*.pth.tar
 *.out
diff --git a/README.md b/README.md
index 94d07f7..2c0ed7f 100644
--- a/README.md
+++ b/README.md
@@ -20,11 +20,11 @@ pip3 install -r requirements.txt
 ## Inference
-You have to download the [pretrained models]() and put them in ``output/ckpt/LJSpeech/``.
+You have to download the [pretrained models](https://drive.google.com/drive/folders/1Kzh3AxVl5cpVixs18-eDDPnKOsdp8Ep9?usp=sharing) and put them in ``output/ckpt/LJSpeech/``.
 For English single-speaker TTS, run
 ```
-python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 900000 --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/model.yaml -t config/LJSpeech/train.yaml
+python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step RESTORE_STEP --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/model.yaml -t config/LJSpeech/train.yaml
 ```
 The generated utterances will be put in ``output/result/``.
@@ -33,7 +33,7 @@ The generated utterances will be put in ``output/result/``.
 Batch inference is also supported, try
 ```
-python3 synthesize.py --source preprocessed_data/LJSpeech/val.txt --restore_step 900000 --mode batch -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/model.yaml -t config/LJSpeech/train.yaml
+python3 synthesize.py --source preprocessed_data/LJSpeech/val.txt --restore_step RESTORE_STEP --mode batch -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/model.yaml -t config/LJSpeech/train.yaml
 ```
 to synthesize all utterances in ``preprocessed_data/LJSpeech/val.txt``
@@ -42,7 +42,7 @@ The speaking rate of the synthesized utterances can be controlled by specifying
 For example, one can increase the speaking rate by 20 % by
 ```
-python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 900000 --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/model.yaml -t config/LJSpeech/train.yaml --duration_control 0.8
+python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step RESTORE_STEP --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/model.yaml -t config/LJSpeech/train.yaml --duration_control 0.8
 ```
 # Training
@@ -100,11 +100,11 @@ tensorboard --logdir output/log/LJSpeech
 ```
 to serve TensorBoard on your localhost.
-
+![](./img/tensorboard_audio.png)
 # Implementation Issues
diff --git a/demo/LJSpeech/250000/But there were few cases so remarkable as the great ones already recorded..png b/demo/LJSpeech/250000/But there were few cases so remarkable as the great ones already recorded..png
new file mode 100644
index 0000000..eddb880
Binary files /dev/null and b/demo/LJSpeech/250000/But there were few cases so remarkable as the great ones already recorded..png differ
diff --git a/demo/LJSpeech/250000/But there were few cases so remarkable as the great ones already recorded..wav b/demo/LJSpeech/250000/But there were few cases so remarkable as the great ones already recorded..wav
new file mode 100644
index 0000000..9d1fd3a
Binary files /dev/null and b/demo/LJSpeech/250000/But there were few cases so remarkable as the great ones already recorded..wav differ
diff --git a/demo/LJSpeech/250000/Here are the match lineups for the Colombia Haiti match..png b/demo/LJSpeech/250000/Here are the match lineups for the Colombia Haiti match..png
new file mode 100644
index 0000000..d1f38f6
Binary files /dev/null and b/demo/LJSpeech/250000/Here are the match lineups for the Colombia Haiti match..png differ
diff --git a/demo/LJSpeech/250000/Here are the match lineups for the Colombia Haiti match..wav b/demo/LJSpeech/250000/Here are the match lineups for the Colombia Haiti match..wav
new file mode 100644
index 0000000..4db047f
Binary files /dev/null and b/demo/LJSpeech/250000/Here are the match lineups for the Colombia Haiti match..wav differ
diff --git a/demo/LJSpeech/250000/In some yards.png b/demo/LJSpeech/250000/In some yards.png
new file mode 100644
index 0000000..0d006a4
Binary files /dev/null and b/demo/LJSpeech/250000/In some yards.png differ
diff --git a/demo/LJSpeech/250000/In some yards.wav b/demo/LJSpeech/250000/In some yards.wav
new file mode 100644
index 0000000..890603c
Binary files /dev/null and b/demo/LJSpeech/250000/In some yards.wav differ
diff --git a/demo/LJSpeech/250000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..png b/demo/LJSpeech/250000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..png
new file mode 100644
index 0000000..5b1b912
Binary files /dev/null and b/demo/LJSpeech/250000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..png differ
diff --git a/demo/LJSpeech/250000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..wav b/demo/LJSpeech/250000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..wav
new file mode 100644
index 0000000..7ea13da
Binary files /dev/null and b/demo/LJSpeech/250000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..wav differ
diff --git a/demo/LJSpeech/250000/The central criminal court, when the trial came on,.png b/demo/LJSpeech/250000/The central criminal court, when the trial came on,.png
new file mode 100644
index 0000000..16ba24f
Binary files /dev/null and b/demo/LJSpeech/250000/The central criminal court, when the trial came on,.png differ
diff --git a/demo/LJSpeech/250000/The central criminal court, when the trial came on,.wav b/demo/LJSpeech/250000/The central criminal court, when the trial came on,.wav
new file mode 100644
index 0000000..4450f2b
Binary files /dev/null and b/demo/LJSpeech/250000/The central criminal court, when the trial came on,.wav differ
diff --git a/demo/LJSpeech/250000/Weekends at twenty three fifty..png b/demo/LJSpeech/250000/Weekends at twenty three fifty..png
new file mode 100644
index 0000000..3e68440
Binary files /dev/null and b/demo/LJSpeech/250000/Weekends at twenty three fifty..png differ
diff --git a/demo/LJSpeech/250000/Weekends at twenty three fifty..wav b/demo/LJSpeech/250000/Weekends at twenty three fifty..wav
new file mode 100644
index 0000000..7ae3cb8
Binary files /dev/null and b/demo/LJSpeech/250000/Weekends at twenty three fifty..wav differ
diff --git a/demo/LJSpeech/250000/one of the persons who heard the sirens was johnny calvin brewer, manager of hardy's shoestore, a fe.png b/demo/LJSpeech/250000/one of the persons who heard the sirens was johnny calvin brewer, manager of hardy's shoestore, a fe.png
new file mode 100644
index 0000000..315a805
Binary files /dev/null and b/demo/LJSpeech/250000/one of the persons who heard the sirens was johnny calvin brewer, manager of hardy's shoestore, a fe.png differ
diff --git a/demo/LJSpeech/250000/one of the persons who heard the sirens was johnny calvin brewer, manager of hardy's shoestore, a fe.wav b/demo/LJSpeech/250000/one of the persons who heard the sirens was johnny calvin brewer, manager of hardy's shoestore, a fe.wav
new file mode 100644
index 0000000..fb3142b
Binary files /dev/null and b/demo/LJSpeech/250000/one of the persons who heard the sirens was johnny calvin brewer, manager of hardy's shoestore, a fe.wav differ
diff --git a/demo/LJSpeech/250000/testing testing testing!.png b/demo/LJSpeech/250000/testing testing testing!.png
new file mode 100644
index 0000000..09f618a
Binary files /dev/null and b/demo/LJSpeech/250000/testing testing testing!.png differ
diff --git a/demo/LJSpeech/250000/testing testing testing!.wav b/demo/LJSpeech/250000/testing testing testing!.wav
new file mode 100644
index 0000000..c7aac58
Binary files /dev/null and b/demo/LJSpeech/250000/testing testing testing!.wav differ
diff --git a/demo/LJSpeech/250000/testing testing testing.png b/demo/LJSpeech/250000/testing testing testing.png
new file mode 100644
index 0000000..f8e250a
Binary files /dev/null and b/demo/LJSpeech/250000/testing testing testing.png differ
diff --git a/demo/LJSpeech/250000/testing testing testing.wav b/demo/LJSpeech/250000/testing testing testing.wav
new file mode 100644
index 0000000..4dcf38e
Binary files /dev/null and b/demo/LJSpeech/250000/testing testing testing.wav differ
diff --git a/demo/LJSpeech/500000/But there were few cases so remarkable as the great ones already recorded..png b/demo/LJSpeech/500000/But there were few cases so remarkable as the great ones already recorded..png
new file mode 100644
index 0000000..0a247e5
Binary files /dev/null and b/demo/LJSpeech/500000/But there were few cases so remarkable as the great ones already recorded..png differ
diff --git a/demo/LJSpeech/500000/But there were few cases so remarkable as the great ones already recorded..wav b/demo/LJSpeech/500000/But there were few cases so remarkable as the great ones already recorded..wav
new file mode 100644
index 0000000..ffec3fc
Binary files /dev/null and b/demo/LJSpeech/500000/But there were few cases so remarkable as the great ones already recorded..wav differ
diff --git a/demo/LJSpeech/500000/Here are the match lineups for the Colombia Haiti match..png b/demo/LJSpeech/500000/Here are the match lineups for the Colombia Haiti match..png
new file mode 100644
index 0000000..09e0f58
Binary files /dev/null and b/demo/LJSpeech/500000/Here are the match lineups for the Colombia Haiti match..png differ
diff --git a/demo/LJSpeech/500000/Here are the match lineups for the Colombia Haiti match..wav b/demo/LJSpeech/500000/Here are the match lineups for the Colombia Haiti match..wav
new file mode 100644
index 0000000..e3c1b62
Binary files /dev/null and b/demo/LJSpeech/500000/Here are the match lineups for the Colombia Haiti match..wav differ
diff --git a/demo/LJSpeech/500000/In some yards.png b/demo/LJSpeech/500000/In some yards.png
new file mode 100644
index 0000000..f667b9d
Binary files /dev/null and b/demo/LJSpeech/500000/In some yards.png differ
diff --git a/demo/LJSpeech/500000/In some yards.wav b/demo/LJSpeech/500000/In some yards.wav
new file mode 100644
index 0000000..f6a7322
Binary files /dev/null and b/demo/LJSpeech/500000/In some yards.wav differ
diff --git a/demo/LJSpeech/500000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..png b/demo/LJSpeech/500000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..png
new file mode 100644
index 0000000..cf941a6
Binary files /dev/null and b/demo/LJSpeech/500000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..png differ
diff --git a/demo/LJSpeech/500000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..wav b/demo/LJSpeech/500000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..wav
new file mode 100644
index 0000000..b9f4251
Binary files /dev/null and b/demo/LJSpeech/500000/On Friday night in Bridgeport expect a temperature of minus four degrees Fahrenheit..wav differ
diff --git a/demo/LJSpeech/500000/The central criminal court, when the trial came on,.png b/demo/LJSpeech/500000/The central criminal court, when the trial came on,.png
new file mode 100644
index 0000000..ddb458d
Binary files /dev/null and b/demo/LJSpeech/500000/The central criminal court, when the trial came on,.png differ
diff --git a/demo/LJSpeech/500000/The central criminal court, when the trial came on,.wav b/demo/LJSpeech/500000/The central criminal court, when the trial came on,.wav
new file mode 100644
index 0000000..01ea570
Binary files /dev/null and b/demo/LJSpeech/500000/The central criminal court, when the trial came on,.wav differ
diff --git a/demo/LJSpeech/500000/Weekends at twenty three fifty..png b/demo/LJSpeech/500000/Weekends at twenty three fifty..png
new file mode 100644
index 0000000..25d3bfe
Binary files /dev/null and b/demo/LJSpeech/500000/Weekends at twenty three fifty..png differ
diff --git a/demo/LJSpeech/500000/Weekends at twenty three fifty..wav b/demo/LJSpeech/500000/Weekends at twenty three fifty..wav
new file mode 100644
index 0000000..bd40cc3
Binary files /dev/null and b/demo/LJSpeech/500000/Weekends at twenty three fifty..wav differ
diff --git a/img/tensorboard_audio.png b/img/tensorboard_audio.png
new file mode 100644
index 0000000..ad03eab
Binary files /dev/null and b/img/tensorboard_audio.png differ
diff --git a/img/tensorboard_loss.png b/img/tensorboard_loss.png
new file mode 100644
index 0000000..c3cab08
Binary files /dev/null and b/img/tensorboard_loss.png differ
diff --git a/img/tensorboard_spec.png b/img/tensorboard_spec.png
new file mode 100644
index 0000000..c80168f
Binary files /dev/null and b/img/tensorboard_spec.png differ