Google Cloud speech-to-text video transcribe subtitle generator

This script converts a video file to an audio file, transcribes the audio with the Google Cloud Platform Speech-to-Text API, and writes the result in .SRT, .JSON, and .TXT formats.

Requirements

Git, Python 3.7 and ffmpeg installed on your system.
A Google Cloud project with billing enabled.
A service account with permission to use the Speech-to-Text API.

Download the service account credentials as credentials.json. Example:

{
"type": "service_account",
"project_id": "EXAMPLE",
"private_key_id": "EXAMPLE",
"private_key": "EXAMPLE",
"client_email": "EXAMPLE",
"client_id": "EXAMPLE",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/EXAMPLE"
}

Install the requirements:
```
pip3 install -r requirements.txt
```

Configure the .env file. Example:

# Google storage bucket name
BUCKET_NAME = "bucket_name"

# Maximum characters in each single line in the SRT subtitle file
MAX_CHARS = 60

# Location where your ffmpeg binary file is put
FFMPEG_LOCATION = "C:\\Apps\\ffmpeg\\bin\\ffmpeg.exe"
FFPROBE_LOCATION = "C:\\Apps\\ffmpeg\\bin\\ffprobe.exe"
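
As a rough illustration of how settings in this format might be read, the sketch below parses `KEY = "value"` lines using only the standard library. The `parse_env` helper is hypothetical and is not taken from main.py, which may well rely on a package such as python-dotenv instead.

```python
def parse_env(text):
    """Parse KEY = "value" lines into a dict (hypothetical helper, not from main.py)."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Not a KEY=value line; ignore it.
        value = value.strip()
        # Double-quoted values: drop the quotes and collapse escaped backslashes.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1].replace("\\\\", "\\")
        settings[key.strip()] = value
    return settings


example = r'''
# Google storage bucket name
BUCKET_NAME = "bucket_name"
MAX_CHARS = 60
FFMPEG_LOCATION = "C:\\Apps\\ffmpeg\\bin\\ffmpeg.exe"
'''
settings = parse_env(example)
max_chars = int(settings["MAX_CHARS"])  # Values are strings until converted.
```

Every value comes back as a string, so numeric settings such as MAX_CHARS need an explicit conversion before use.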

Usage

python3 main.py example.mp4 en-US

Explanation of functions

upload_blob() - Uploads the media file to the Google Storage bucket.

video_info() - Returns number of channels, bit rate, and sample rate of the video, extracted by running ffmpeg. These parameters are required by Google's API.
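
One way to obtain these values is sketched below, assuming the metadata is read with ffprobe (the .env example points at an ffprobe binary) and requested as JSON. The function names `probe_command`, `parse_audio_info`, and `video_info_sketch` are made up for illustration; the real video_info() may work differently.

```python
import json
import subprocess


def probe_command(ffprobe_path, video_path):
    """Build an ffprobe call that prints stream metadata as JSON."""
    return [
        ffprobe_path,
        "-v", "error",            # Only print errors, keep stdout clean.
        "-show_streams",          # Include per-stream details.
        "-print_format", "json",  # Machine-readable output.
        video_path,
    ]


def parse_audio_info(ffprobe_json):
    """Return (channels, bit_rate, sample_rate) of the first audio stream."""
    streams = json.loads(ffprobe_json).get("streams", [])
    for stream in streams:
        if stream.get("codec_type") == "audio":
            channels = int(stream["channels"])
            # ffprobe reports rates as strings; bit_rate may be missing.
            bit_rate = int(stream["bit_rate"]) if "bit_rate" in stream else None
            sample_rate = int(stream["sample_rate"])
            return channels, bit_rate, sample_rate
    raise ValueError("no audio stream found")


def video_info_sketch(ffprobe_path, video_path):
    """Run ffprobe and parse its output (requires ffprobe to be installed)."""
    result = subprocess.run(
        probe_command(ffprobe_path, video_path),
        capture_output=True, text=True, check=True,
    )
    return parse_audio_info(result.stdout)


# Parsing can be tried without ffprobe by feeding it sample output.
sample = (
    '{"streams": [{"codec_type": "video"}, '
    '{"codec_type": "audio", "channels": 2, '
    '"bit_rate": "128000", "sample_rate": "44100"}]}'
)
info = parse_audio_info(sample)
```

Keeping the parsing separate from the subprocess call makes the logic easy to check on a machine where ffprobe is not installed.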

video_to_audio() - Converts the video into audio and uploads the audio to the Google Storage bucket.

long_running_recognize() - Transcribes the audio by calling Google Cloud API.
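
To show how the values from video_info() could feed into the transcription call, here is a minimal sketch of a request body for the v1 `speech:longrunningrecognize` REST method. The helper name `build_recognize_request` and the FLAC encoding are assumptions for illustration; main.py may use the Python client library and a different audio format.

```python
import json


def build_recognize_request(gcs_uri, language_code, channels, sample_rate):
    """Build a JSON body for the v1 speech:longrunningrecognize REST method."""
    return {
        "config": {
            "encoding": "FLAC",  # Assumption: the audio was converted to FLAC.
            "sampleRateHertz": sample_rate,
            "audioChannelCount": channels,
            "languageCode": language_code,
            "enableWordTimeOffsets": True,  # Per-word timing, useful for subtitles.
            "enableAutomaticPunctuation": True,
        },
        "audio": {"uri": gcs_uri},  # Audio is referenced by its Cloud Storage URI.
    }


body = build_recognize_request("gs://bucket_name/example.flac", "en-US", 2, 44100)
print(json.dumps(body, indent=2))
```

The sample rate and channel count in the request have to match the uploaded audio, which is why they are taken from the probe step rather than hard-coded.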

break_sentences() - Breaks sentences at punctuation marks and at the maximum line length (MAX_CHARS), so that no subtitle line in the video is too long.
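
The splitting idea can be sketched on plain text as follows. This is a simplified stand-in, not the script's own break_sentences(): the name `break_sentences_sketch` is hypothetical, and the real function presumably also has to carry word timestamps from the API, which this version ignores.

```python
import re
import textwrap


def break_sentences_sketch(text, max_chars=60):
    """Split text after punctuation, then wrap any piece longer than max_chars."""
    # Split on whitespace that follows . , ! ? ; or : and keep the punctuation.
    pieces = re.split(r"(?<=[.,!?;:])\s+", text.strip())
    lines = []
    for piece in pieces:
        if not piece:
            continue
        # textwrap breaks on word boundaries so each line fits within max_chars.
        lines.extend(textwrap.wrap(piece, width=max_chars))
    return lines


text = (
    "Hello there, this is a test. It shows how one long sentence "
    "without any punctuation gets wrapped at the limit"
)
lines = break_sentences_sketch(text, max_chars=30)
for line in lines:
    print(line)
```

Punctuation is tried first because it usually marks a natural pause; the length limit only takes over when a stretch of speech has no punctuation to break on.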

TODO

Handle results for non-English languages.
