GroqCasters is a Python application that generates podcast scripts and corresponding audio using AI technologies. It leverages PocketGroq for script generation and Bark for text-to-speech conversion, allowing for custom voice cloning.
- Installation
- Usage
- Configuration
- Custom Voice Samples
- Operational Parameters
- Dependencies
- Additional Resources
- Clone the repository:
    git clone https://github.com/yourusername/groqcasters.git
    cd groqcasters
- Create a virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install the required dependencies:
    pip install -r requirements.txt
- Set up your GROQ API key as an environment variable:
    export GROQ_API_KEY=your_api_key_here
  On Windows, use:
    set GROQ_API_KEY=your_api_key_here
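Before the first run, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch and not part of GroqCasters itself: the helper name `require_api_key` is hypothetical, and the only thing taken from this README is the `GROQ_API_KEY` variable name.

```python
import os


def require_api_key(env=None, name="GROQ_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `require_api_key` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running groqcasters.py."
        )
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("GROQ_API_KEY found.")
    except RuntimeError as err:
        print(err)
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.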
GroqCasters can be used in two modes:
- Generate a new script and audio:
    python groqcasters.py path/to/input_text.txt path/to/output_directory
- Use a pre-written script:
    python groqcasters.py path/to/script.txt path/to/output_directory --use-script
The generated audio will be saved as full_podcast.wav in the specified output directory.
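The two invocations differ only in the `--use-script` flag. As a rough illustration of that command-line shape, here is a sketch built on `argparse`. It is an assumption about the interface, not the actual parser in groqcasters.py, and the names `build_parser`, `describe`, `input_path`, and `output_dir` are hypothetical.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser that mirrors the two usage modes (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="groqcasters.py")
    parser.add_argument(
        "input_path", type=Path,
        help="source text, or a finished script when --use-script is given",
    )
    parser.add_argument(
        "output_dir", type=Path,
        help="directory that will receive full_podcast.wav",
    )
    parser.add_argument(
        "--use-script", action="store_true",
        help="treat input_path as a pre-written script",
    )
    return parser


def describe(args):
    """Summarize which mode was selected and where the audio would go."""
    mode = "use pre-written script" if args.use_script else "generate new script"
    return f"{mode}; audio -> {args.output_dir / 'full_podcast.wav'}"


if __name__ == "__main__":
    demo = build_parser().parse_args(["script.txt", "out", "--use-script"])
    print(describe(demo))
```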
Create a config.py file in the project directory with the following content:
DEFAULT_MODEL = "your_default_model_name"
MAX_TOKENS = {
    "outline": 8000,
    "full_script": 8000,
    "dialogue": 8000,
}
HOST_PROFILES = """
Host1 (Rachel): Enthusiastic, prone to personal anecdotes.
Host2 (Mike): More analytical, enjoys making pop culture references.
"""
OUTLINE_PROMPT_TEMPLATE = "Your outline prompt template here"
EXPAND_PROMPT_TEMPLATE = "Your expand prompt template here"
DIALOGUE_PROMPT_TEMPLATE = "Your dialogue prompt template here"
Adjust these values according to your needs.
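A typo in config.py tends to surface only partway through a run, so a quick sanity check can save time. The sketch below is hypothetical (`check_config` is not part of GroqCasters) and assumes only the setting names and the three `MAX_TOKENS` keys listed above. A `SimpleNamespace` stands in for `import config` so the example runs on its own.

```python
from types import SimpleNamespace

REQUIRED_NAMES = (
    "DEFAULT_MODEL", "MAX_TOKENS", "HOST_PROFILES",
    "OUTLINE_PROMPT_TEMPLATE", "EXPAND_PROMPT_TEMPLATE", "DIALOGUE_PROMPT_TEMPLATE",
)
REQUIRED_TOKEN_KEYS = ("outline", "full_script", "dialogue")


def check_config(config):
    """Return a list of problems found in a config module or namespace."""
    problems = [f"missing {name}" for name in REQUIRED_NAMES if not hasattr(config, name)]
    max_tokens = getattr(config, "MAX_TOKENS", {})
    for key in REQUIRED_TOKEN_KEYS:
        value = max_tokens.get(key)
        if not isinstance(value, int) or value <= 0:
            problems.append(f"MAX_TOKENS[{key!r}] must be a positive integer")
    return problems


# Stand-in for `import config`; the values mirror the placeholders above.
example = SimpleNamespace(
    DEFAULT_MODEL="your_default_model_name",
    MAX_TOKENS={"outline": 8000, "full_script": 8000, "dialogue": 8000},
    HOST_PROFILES="Host1 (Rachel): Enthusiastic, prone to personal anecdotes.",
    OUTLINE_PROMPT_TEMPLATE="Your outline prompt template here",
    EXPAND_PROMPT_TEMPLATE="Your expand prompt template here",
    DIALOGUE_PROMPT_TEMPLATE="Your dialogue prompt template here",
)
print(check_config(example))  # prints []
```

With a real project, the same function could be called as `check_config(config)` after `import config`.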
To use custom voice samples:
- Prepare a short (15-30 second) clear audio clip of the desired voice.
- Save the audio as a WAV file.
- Update the CUSTOM_MALE_VOICE_PATH and CUSTOM_FEMALE_VOICE_PATH variables in groqcasters.py with the paths to your custom voice files.
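To check that a clip falls inside the recommended 15-30 second range before pointing GroqCasters at it, the standard-library `wave` module is enough for uncompressed PCM WAV files. This is a minimal sketch; the helper names `wav_duration_seconds` and `is_suitable_sample` are hypothetical and not part of the project.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of an uncompressed PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_suitable_sample(path, min_seconds=15.0, max_seconds=30.0):
    """True if the clip length falls inside the recommended 15-30 second range."""
    return min_seconds <= wav_duration_seconds(path) <= max_seconds
```

A call such as `is_suitable_sample("path/to/rachel_voice.wav")` returns `False` for clips that are too short or too long. It says nothing about audio clarity, which still needs a listen.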
- SUNO_USE_SMALL_MODELS: Set to "True" to use smaller Bark models (default: True)
- SUNO_OFFLOAD_CPU: Set to "True" to offload processing to CPU if GPU is unavailable (default: False)
- MALE_VOICE_PRESET: Default male voice preset for Bark (default: "v2/en_speaker_6")
- FEMALE_VOICE_PRESET: Default female voice preset for Bark (default: "v2/en_speaker_9")
These can be adjusted in the groqcasters.py file.
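Bark reads the two SUNO_ settings from environment variables when it is imported, so they need to be in place before the import happens. The sketch below shows one way to do that. Whether groqcasters.py sets them in exactly this manner is an assumption, and `configure_bark_env` is a hypothetical helper.

```python
import os


def configure_bark_env(use_small_models=True, offload_cpu=False):
    """Set Bark's SUNO_* environment variables; call this before importing bark."""
    os.environ["SUNO_USE_SMALL_MODELS"] = str(bool(use_small_models))
    os.environ["SUNO_OFFLOAD_CPU"] = str(bool(offload_cpu))


# Defaults match the values listed above.
configure_bark_env(use_small_models=True, offload_cpu=False)
# import bark  # import only after the variables are set (requires Bark to be installed)
```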
See the requirements.txt file for a full list of dependencies. Key dependencies include:
- pocketgroq
- bark
- torch
- numpy
- scipy
GroqCasters is initially designed for two characters, but you can extend it to support more. Here's a guide on how to modify the application for multiple characters:
- Update the config.py file:
  - Extend the HOST_PROFILES to include additional characters:
      HOST_PROFILES = """
      Host1 (Rachel): Enthusiastic, prone to personal anecdotes.
      Host2 (Mike): More analytical, enjoys making pop culture references.
      Host3 (Alex): Tech-savvy, often explains complex concepts.
      Host4 (Sarah): Creative, brings in artistic perspectives.
      """
- Modify the groqcasters.py file:
  - Add voice presets for new characters:
      VOICE_PRESETS = {
          "rachel": "v2/en_speaker_9",
          "mike": "v2/en_speaker_6",
          "alex": "v2/en_speaker_2",
          "sarah": "v2/en_speaker_4",
      }
  - Use the _create_voice_prompt method in __init__ to build a prompt for each custom voice:
      def __init__(self):
          # ... existing code ...
          self.custom_voices = {
              "rachel": self._create_voice_prompt("path/to/rachel_voice.wav"),
              "mike": self._create_voice_prompt("path/to/mike_voice.wav"),
              "alex": self._create_voice_prompt("path/to/alex_voice.wav"),
              "sarah": self._create_voice_prompt("path/to/sarah_voice.wav"),
          }
  - Modify the generate_audio_from_script method to use the new voice selection:
      def generate_audio_from_script(self, script, output_dir):
          lines = script.split('\n')
          audio_segments = []
          for line in lines:
              # Skip blank lines and lines without a "Speaker: text" separator.
              if ':' in line:
                  speaker, text = line.split(':', 1)
                  speaker = speaker.strip().lower()
                  text = text.strip()
                  # Prefer a cloned voice, then a preset, then the default preset.
                  voice = self.custom_voices.get(speaker) or VOICE_PRESETS.get(speaker, VOICE_PRESETS["rachel"])
                  try:
                      # ... existing audio generation code ...
                  except Exception as e:
                      print(f"Error generating audio for line: {line}")
                      print(f"Error details: {e}")
          # ... rest of the method remains the same ...
- Update the script generation prompts:
  - Modify the OUTLINE_PROMPT_TEMPLATE, EXPAND_PROMPT_TEMPLATE, and DIALOGUE_PROMPT_TEMPLATE in config.py to include instructions for handling multiple characters.
- Adjust the script parsing:
  - If your input scripts have a specific format for multiple speakers, update the script parsing logic in generate_audio_from_script to handle this format correctly.
- Test thoroughly:
  - Create test scripts with multiple characters to ensure the system handles them correctly.
  - Generate audio for these test scripts and verify that each character has the correct voice.
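For the script-parsing step, one option is to pull the line handling into a small function that can be tested without generating any audio. The sketch below assumes one "Speaker: text" entry per line, which matches the splitting logic shown earlier. `parse_script` is a hypothetical helper, not an existing function in groqcasters.py.

```python
def parse_script(script):
    """Split a script into (speaker, text) pairs.

    Assumes one "Speaker: text" entry per line. Lines without a colon,
    or with an empty speaker or empty text, are skipped.
    """
    turns = []
    for raw in script.splitlines():
        # partition splits on the first colon only, so colons inside the
        # spoken text are preserved.
        speaker, sep, text = raw.partition(":")
        speaker, text = speaker.strip().lower(), text.strip()
        if sep and speaker and text:
            turns.append((speaker, text))
    return turns


sample = """Rachel: Welcome back to the show!
(music fades)
Mike: Today's topic: voice cloning.
"""
for speaker, text in parse_script(sample):
    print(speaker, "->", text)
```

Because the speaker name is lowercased, the result can be used directly as a key into a voice table keyed by lowercase names.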
Remember to update any other parts of the code that might assume a two-character setup, such as any hardcoded references to "Rachel" or "Mike".
By following these steps, you can extend GroqCasters to support as many characters as you need. This allows for creating more diverse and dynamic podcast scripts with a wider range of voices and personalities.
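One way to avoid hardcoded references to a particular host is to make the fallback voice an explicit parameter. The sketch below reuses the preset names from the steps above. `resolve_voice` and its `default_speaker` parameter are hypothetical and shown only to illustrate the idea.

```python
VOICE_PRESETS = {
    "rachel": "v2/en_speaker_9",
    "mike": "v2/en_speaker_6",
    "alex": "v2/en_speaker_2",
    "sarah": "v2/en_speaker_4",
}


def resolve_voice(speaker, custom_voices, presets=VOICE_PRESETS, default_speaker=None):
    """Pick a custom voice first, then a preset, then a configurable default."""
    key = speaker.strip().lower()
    if custom_voices.get(key) is not None:
        return custom_voices[key]
    if key in presets:
        return presets[key]
    if default_speaker is None:
        # Failing loudly makes a misspelled speaker name easy to spot.
        raise KeyError(f"no voice configured for speaker {speaker!r}")
    return presets[default_speaker]


print(resolve_voice("Alex", custom_voices={}))  # prints v2/en_speaker_2
```

Raising an error for unknown speakers is a design choice. Passing `default_speaker="rachel"` restores the silent fallback used in the earlier snippet.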
For more information on using Bark and PocketGroq, refer to their respective documentation.
Contributing
Contributions to GroqCasters are welcome! If you encounter any problems, have feature suggestions, or want to improve the codebase, feel free to:
- Open issues on the GitHub repository.
- Submit pull requests with bug fixes or new features.
- Improve documentation or add examples.
When contributing, please:
- Follow the existing code style and conventions.
- Write clear commit messages.
- Add or update tests for new features or bug fixes.
- Update the README if you're adding new functionality.
License
This project is licensed under the MIT License. When using any of Gravelle's code in your projects, please include a mention of "J. Gravelle" in your code and/or documentation. He's kinda full of himself, but he'd appreciate the acknowledgment.
For issues and feature requests, please open an issue on the GitHub repository.
GroqCasters is proudly open-source and released under the MIT License.
Thank you for choosing GroqCasters. We are committed to redefining the boundaries of what AI can achieve.
Copyright (c) 2024 J. Gravelle
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
- The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
- Any modifications made to the Software must clearly indicate that they are derived from the original work, and the name of the original author (J. Gravelle) must remain intact.
- Redistributions of the Software in source code form must also include a prominent notice that the code has been modified from the original.
THE SOFTWARE IS PROVIDED "AS IS," WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.