A tool that converts a slide deck into a video, complete with your voice narration. Supports multiple languages.
Tested on Ubuntu 20.04.
- Install ffmpeg:
sudo apt-get install ffmpeg
- Install Python (>=3.9 and <=3.11) and pip if you haven't already.
- Clone and install this tool:
git clone git@github.com:Changochen/slide-to-video.git
cd slide-to-video
pip install .
- Verify Installation:
slide-to-video
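Before installing, the prerequisites listed above can be checked with a short script. This is a minimal sketch, not part of the tool: the helper names `is_supported_python` and `missing_prerequisites` are made up for illustration, and the version range is read as "any 3.9, 3.10, or 3.11 release".

```python
import shutil
import sys


def is_supported_python(version=None):
    """Return True if (major, minor) falls in the 3.9 to 3.11 range stated above."""
    major, minor = (version or sys.version_info)[:2]
    return (3, 9) <= (major, minor) <= (3, 11)


def missing_prerequisites(commands=("ffmpeg", "pip")):
    """Return the names from `commands` that are not found on PATH."""
    return [name for name in commands if shutil.which(name) is None]


if __name__ == "__main__":
    if not is_supported_python():
        print("Unsupported Python version:", sys.version.split()[0])
    for name in missing_prerequisites():
        print("Not found on PATH:", name)
```

If the script prints nothing, the interpreter version is in range and both commands were found.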
- Slide Deck: Create a slide deck in PDF format.
- Script: Prepare a script file in plain text format, with slides separated by the marker NEWSLIDE.
- Audio File or Model: Record an audio file of your voice in MP3 format for voice cloning. If you use paid services like Play.ht, you should have a voice model available.
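To make the script format concrete, here is a small sketch of how a NEWSLIDE-separated script maps to per-slide narration. It is an illustration only: `split_script` is a hypothetical helper, and whether the tool itself trims whitespace or expects the marker on its own line is an assumption.

```python
def split_script(text, marker="NEWSLIDE"):
    """Split narration text into one stripped chunk per slide."""
    return [chunk.strip() for chunk in text.split(marker)]


script = """Welcome to the talk.
NEWSLIDE
Here is the motivation.
NEWSLIDE
Thanks for listening."""

for number, narration in enumerate(split_script(script), start=1):
    print(f"Slide {number}: {narration}")
```

A quick check like this is a handy way to confirm that the number of script sections matches the number of pages in your PDF before running the tool.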
slide-to-video --model MODEL_NAME --slide slide --script script --output-dir OUTPUT_PATH --config ADDITIONAL_CONFIG.yaml
To use a local voice model:
slide-to-video --model local --slide example/slide.pdf --script example/script.txt --voice example/sample.mp3 --output-dir output
A final video will be generated in the OUTPUT_PATH directory as output.mp4.
For more options, including adjusting speech speed, run:
slide-to-video --help
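If you want to drive the tool from a Python script (for example, to render several decks in a loop), the local voice model command shown above can be assembled as an argument list. This is a sketch that uses only the flags shown in this README; `build_command` and `render` are hypothetical wrapper names, not part of the tool.

```python
import subprocess


def build_command(slide, script, voice, output_dir, model="local"):
    """Assemble the slide-to-video argument list using only flags shown above."""
    return [
        "slide-to-video",
        "--model", model,
        "--slide", slide,
        "--script", script,
        "--voice", voice,
        "--output-dir", output_dir,
    ]


def render(slide, script, voice, output_dir):
    """Run the tool and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_command(slide, script, voice, output_dir), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_command(
        "example/slide.pdf", "example/script.txt", "example/sample.mp3", "output"
    )))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.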
Currently Supported Model:
Currently Supported Languages: 'en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar', 'zh-cn', 'hu', 'ko', 'ja', 'hi'
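When scripting around the tool, it can help to validate a language code against the list above before starting a long run. The sketch below is an assumption-laden convenience, not the tool's own check: `normalize_language` is a hypothetical helper, and lowercasing the input is a choice made here. The option used to select a language is not shown in this README; see slide-to-video --help.

```python
SUPPORTED_LANGUAGES = (
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "hu", "ko", "ja", "hi",
)


def normalize_language(code):
    """Return the lowercase code if it is in the list above, else raise ValueError."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return normalized


print(normalize_language("ZH-CN"))  # prints zh-cn
```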
After generating the video, the output directory will contain a project.yaml file, which helps skip the generation of unchanged content. If inputs remain the same, the tool skips the video generation process.
If you modify the slide, script, or settings (like speech speed), the tool regenerates the affected content. To force regeneration of specific parts, set the force_reset field of the corresponding item in project.yaml in the output directory.
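The skip-or-regenerate behavior described above can be pictured with a generic fingerprint comparison. This is a conceptual sketch only: the record layout (`fingerprint` plus `force_reset`) and the use of SHA-256 are assumptions made for illustration, and the actual project.yaml schema and change detection in the tool may differ.

```python
import hashlib


def fingerprint(*parts):
    """Hash input bytes so a change in any part changes the digest."""
    digest = hashlib.sha256()
    for part in parts:
        # A length prefix keeps the boundaries between parts unambiguous.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


def needs_regeneration(item, current_fingerprint):
    """Decide whether a cached item must be rebuilt.

    `item` is a hypothetical record: {"fingerprint": str, "force_reset": bool}.
    """
    if item.get("force_reset", False):
        return True
    return item.get("fingerprint") != current_fingerprint


cached = {"fingerprint": fingerprint(b"slide 1 narration", b"speed=1.0"), "force_reset": False}
print(needs_regeneration(cached, fingerprint(b"slide 1 narration", b"speed=1.0")))  # False
print(needs_regeneration(cached, fingerprint(b"slide 1 narration", b"speed=1.2")))  # True
```

In this picture, setting force_reset on an item forces a rebuild even when nothing else has changed.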
To support a new voice model, implement a new class in src/slide_to_video/tts_engine and register it by calling register_engine (the repository includes an example).
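The implement-then-register pattern can be sketched in isolation as follows. Everything here is a stand-in: the registry, the `register_engine` signature, the `synthesize` method, and `SilentEngine` are assumptions for illustration and are not taken from src/slide_to_video/tts_engine. Check the repository's example for the real interface.

```python
# Illustrative only: this registry and its signatures are assumptions,
# not the code in src/slide_to_video/tts_engine.
_ENGINES = {}


def register_engine(name, engine_cls):
    """Stand-in for the tool's register_engine; the real signature may differ."""
    if name in _ENGINES:
        raise ValueError(f"Engine {name!r} is already registered")
    _ENGINES[name] = engine_cls


class SilentEngine:
    """Hypothetical engine that returns empty audio bytes for any text."""

    def synthesize(self, text):
        return b""


def create_engine(name):
    """Look up a registered class by name and instantiate it."""
    return _ENGINES[name]()


register_engine("silent", SilentEngine)
print(type(create_engine("silent")).__name__)  # SilentEngine
```

Keeping the lookup keyed by name is what would let a command-line value such as --model select the engine, though how the tool wires this up is not shown here.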
- On the first run, you might see the following prompt:
> You must confirm the following:
| > "I have purchased a commercial license from Coqui: licensing@coqui.ai"
| > "Otherwise, I agree to the terms of the non-commercial CPML: https://coqui.ai/cpml" - [y/n]
| | >
Simply enter y.