A Streamlit-based professional workflow for ElevenLabs

SkelegonDK/ElevenTools

ElevenTools Alpha v0.1.0

ElevenTools is a comprehensive toolbox for ElevenLabs, providing a user-friendly interface for text-to-speech generation with advanced features and bulk processing capabilities.

Recent Updates

  • Phonetics processing moved to Ollama: Phonetics processing now runs through Ollama, a tool for running language models locally, for improved performance and privacy.
  • Bug fix: Resolved an issue with the speaker boost parameter to ensure it functions correctly.

Features

  • Dynamic voice and model selection from the ElevenLabs library
  • Text variable support for personalized audio generation
  • Random and fixed seed options for reproducible results
  • Customizable voice settings (stability, similarity, style, speaker boost)
  • Single and bulk audio generation
  • CSV support for batch processing
  • Review and playback of generated audio
  • Ollama integration for local language model processing, including phonetics
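
To make the seed and voice-setting features concrete, here is a minimal sketch of how a request body for the ElevenLabs text-to-speech endpoint could be assembled. It is not taken from the ElevenTools codebase: the helper name `build_tts_payload`, its default values, and the `MAX_SEED` bound are assumptions for illustration. The field names follow the public ElevenLabs API, where "similarity" corresponds to `similarity_boost` and "speaker boost" to `use_speaker_boost`.

```python
import random

# Assumed upper bound for the seed field (32-bit unsigned range).
MAX_SEED = 4_294_967_295


def build_tts_payload(text, model_id, stability=0.5, similarity_boost=0.75,
                      style=0.0, use_speaker_boost=True, seed=None):
    """Build a JSON-serializable body for a text-to-speech request.

    Passing a fixed seed aims at reproducible output; leaving it as None
    draws a random seed, which is still returned so it can be logged.
    """
    if seed is None:
        seed = random.randint(0, MAX_SEED)
    return {
        "text": text,
        "model_id": model_id,
        "seed": seed,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
            "style": style,
            "use_speaker_boost": use_speaker_boost,
        },
    }
```

Recording the seed alongside each generated file, even when it was chosen randomly, makes it possible to request the same take again later.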

Installation

  1. Ensure you have Python 3.10 or later installed.
  2. Clone this repository:
    git clone https://github.com/SkelegonDK/ElevenTools.git
    cd ElevenTools
  3. Install the required packages:
    pip install -r requirements.txt

Configuration

  1. Create a .streamlit/secrets.toml file in the root directory with your API key:
    ELEVENLABS_API_KEY = "your_elevenlabs_api_key"
  2. (Optional) Create a .streamlit/config.toml file to customize Streamlit's appearance and behavior.

Ollama Setup

ElevenTools integrates with Ollama for local language model processing, including phonetics. To use this feature, you need to install Ollama and download the appropriate model:

  1. Install Ollama:

    • For macOS and Linux:
      curl https://ollama.ai/install.sh | sh
    • For Windows: Download and install from Ollama's official website
  2. Download the required model: After installing Ollama, open a terminal and run:

    ollama pull llama3.1:8b

    This downloads the Llama 3.1 8B model (tag llama3.1:8b), the model ElevenTools currently uses.

  3. Ensure Ollama is running: Ollama should start automatically after installation. If it's not running, you can start it manually:

    • On macOS/Linux: ollama serve
    • On Windows: Run the Ollama application

For more information on Ollama, visit ollama.ai.
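
To verify the setup steps above, the following sketch asks a local Ollama server which models it has and checks for `llama3.1:8b`. It assumes Ollama is listening on its default address, `http://localhost:11434`, and uses the server's `/api/tags` endpoint, which lists locally available models. The function names are hypothetical and not part of ElevenTools.

```python
import json
import urllib.error
import urllib.request

# Default local address of the Ollama server (an assumption; adjust if changed).
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_payload):
    """Extract model names from a parsed /api/tags response."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload, wanted="llama3.1:8b"):
    """Return True if the wanted model tag is in the response."""
    return wanted in model_names(tags_payload)


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=3.0):
    """Return the parsed response, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None
```

Calling `fetch_tags()` returns `None` when Ollama is not running; otherwise passing its result to `has_model()` shows whether the `ollama pull` step still needs to be run.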

Usage

Run the Streamlit app:

streamlit run app.py

Navigate to the provided local URL to access the ElevenTools interface.

Bulk Generation

  1. Prepare a CSV file with columns: 'text', 'filename' (optional), and any variables used in the text.
  2. Use the Bulk Generation page to upload your CSV and generate multiple audio files.
  3. Choose random or fixed seed generation. Use a fixed seed when you need reproducible results across runs.
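
The CSV layout from step 1 can be illustrated with a small sketch that fills variables into each row's text. The `{name}` placeholder syntax, the fallback file naming, and the `render_rows` helper are assumptions made for this example; the placeholder syntax ElevenTools actually expects may differ.

```python
import csv
import io


def render_rows(csv_text):
    """Return (filename, rendered_text) pairs for each CSV row.

    Every column other than 'text' and 'filename' is treated as a
    variable and substituted into {placeholders} in the text column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    results = []
    for index, row in enumerate(reader, start=1):
        variables = {k: v for k, v in row.items()
                     if k not in ("text", "filename")}
        rendered = row["text"].format_map(variables)
        # 'filename' is optional: fall back to a numbered name if blank.
        filename = (row.get("filename") or "").strip() or f"audio_{index:03d}"
        results.append((filename, rendered))
    return results


SAMPLE_CSV = (
    "text,filename,name,city\n"
    '"Hello {name}, welcome to {city}!",greeting_anna,Anna,Oslo\n'
    '"Hi {name}, see you in {city}.",,Ben,Lima\n'
)
```

Note that text containing commas must be quoted in the CSV, as in the sample rows, and that a placeholder without a matching column raises a `KeyError` in this sketch.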

TODO List for ElevenTools

High Priority

  1. Implement automated testing

    • Unit tests for core functions
    • Integration tests for API interactions
    • End-to-end tests for user workflows
  2. Integrate Ollama

    • Implement Ollama integration in the codebase
    • Create tests for Ollama integration
    • Test and improve the enhancement process
  3. Enhance UI/UX

    • Implement progress bars for audio generation
    • Improve error messaging and user feedback
    • Create a more intuitive layout for voice settings
  4. Optimize performance

    • Implement caching for frequently used data
    • Optimize bulk generation for large datasets
  5. Security enhancements

    • Implement proper API key management
    • Add user authentication for multi-user support

Medium Priority

  1. Improve data management

    • Implement a database for storing generation history
    • Create export options for generated audio metadata
    • Develop a pronunciation memory system
  2. Expand features

    • Add search functionality for voice IDs
  3. Documentation

    • Create comprehensive API documentation
    • Develop a user guide with examples and best practices

Lower Priority

  1. Enhance Voice-to-Voice functionality

    • Add voice cleanup features
  2. UI/UX improvements (continued)

    • Implement a dark mode option

Grouped by Feature Area

Testing and Quality Assurance

  • Implement automated testing
    • Unit tests for core functions
    • Integration tests for API interactions
    • End-to-end tests for user workflows

Ollama Integration

  • Integrate Ollama
    • Research the Ollama API and integration requirements
    • Design integration architecture
    • Implement Ollama integration in the codebase
    • Create tests for Ollama integration

User Interface and Experience

  • Enhance UI/UX
    • Implement progress bars for audio generation
    • Improve error messaging and user feedback
    • Create a more intuitive layout for voice settings
    • Implement a dark mode option

Performance and Optimization

  • Optimize performance
    • Implement caching for frequently used data
    • Optimize bulk generation for large datasets

Security

  • Security enhancements
    • Implement proper API key management
    • Add user authentication for multi-user support

Data Management and Persistence

  • Improve data management
    • Implement a database for storing generation history
    • Create export options for generated audio metadata
    • Develop a pronunciation memory system

Feature Expansion

  • Expand features
    • Add search functionality for voice IDs
  • Enhance Voice-to-Voice functionality
    • Add voice cleanup features

Documentation

  • Documentation
    • Create comprehensive API documentation
    • Develop a user guide with examples and best practices

License

ElevenTools is open-source software released under a custom license.

  • Free for individual use and for companies with less than $10 million in annual revenue and fewer than 50 employees.
  • Commercial licensing required for larger companies.
  • Use for training AI models is prohibited without explicit permission.

Please see the full license for all terms and conditions.

For commercial licensing inquiries, please contact [your contact information].

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

If you encounter any problems or have any questions, please open an issue in this repository.