Hume AI | Expressive TTS Arena
An interactive platform for comparing and evaluating the expressiveness of different text-to-speech engines
Overview
Expressive TTS Arena is an open-source web application that enables users to compare text-to-speech outputs with a focus on expressiveness rather than just audio quality. Built with Gradio, it provides a seamless interface for generating and comparing speech synthesis from different providers, including Hume AI and ElevenLabs.
Features
- Text generation using Claude AI for creating expressive content.
- Direct text input or AI-assisted text generation.
- Comparative analysis of different TTS engines.
- Simple voting mechanism for preferred outputs.
- Random voice selection from multiple providers.
- Real-time speech synthesis comparison.
Prerequisites
- Python >=3.11.11
- pip >=25.0
- Virtual environment capability
- API keys for Hume AI, Anthropic, and ElevenLabs
- For a complete list of dependencies, see requirements.txt.
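As a quick pre-flight check, the Python version requirement listed above can be verified from a script. This is a minimal sketch and not part of the project; the helper name `python_is_supported` is hypothetical.

```python
import sys

# Minimum interpreter version taken from the prerequisites list.
MIN_PYTHON = (3, 11, 11)


def python_is_supported(version_info=None) -> bool:
    """Return True if the given (or running) interpreter meets MIN_PYTHON."""
    # Compare only (major, minor, micro); tuples compare element by element.
    current = tuple((version_info or sys.version_info)[:3])
    return current >= MIN_PYTHON


if __name__ == "__main__":
    if python_is_supported():
        print("Python version OK")
    else:
        found = ".".join(map(str, sys.version_info[:3]))
        print(f"Python >= 3.11.11 required, found {found}")
```

Running it inside the virtual environment confirms that the environment, not just the system interpreter, meets the requirement.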
Project Structure
Expressive TTS Arena/
├── src/
│   ├── integrations/
│   │   ├── __init__.py         # Makes integrations a package; exposes API clients
│   │   ├── anthropic_api.py    # Anthropic API integration
│   │   ├── elevenlabs_api.py   # ElevenLabs API integration
│   │   └── hume_api.py         # Hume API integration
│   ├── __init__.py             # Makes src a package; exposes key functionality
│   ├── app.py                  # Entry file
│   ├── config.py               # Global config and logger setup
│   ├── constants.py            # Global constants
│   ├── theme.py                # Custom Gradio Theme
│   └── utils.py                # Utility functions
├── .env.example
├── .gitignore
├── .pre-commit-config.yaml
└── requirements.txt
Installation
Create and activate the virtual environment:
Mac/Linux
python -m venv gradio-env
source gradio-env/bin/activate
Windows
python -m venv gradio-env
gradio-env\Scripts\activate
Install dependencies:
pip install -r requirements.txt
(Optional) If you plan to contribute, install the pre-commit hooks for automatic file formatting:
pre-commit install
Configure environment variables:
- Create a .env file based on .env.example
- Add your API keys:
HUME_API_KEY=YOUR_HUME_API_KEY
ANTHROPIC_API_KEY=YOUR_ANTHROPIC_API_KEY
ELEVENLABS_API_KEY=YOUR_ELEVENLABS_API_KEY
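Before starting the app, it can help to confirm that all three keys are actually set. The sketch below is illustrative and not taken from the project: the helper `missing_api_keys` is hypothetical, and it assumes the variables are already present in the process environment (how the project itself loads the .env file is not shown here).

```python
import os

# Variable names as listed in the API key configuration step.
REQUIRED_KEYS = ("HUME_API_KEY", "ANTHROPIC_API_KEY", "ELEVENLABS_API_KEY")


def missing_api_keys(env=None) -> list[str]:
    """Return the required keys that are unset, empty, or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_KEYS:
        value = env.get(name, "").strip()
        # Treat the template value (e.g. YOUR_HUME_API_KEY) as not configured.
        if not value or value == f"YOUR_{name}":
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys: " + ", ".join(missing))
    else:
        print("All API keys are set")
```

Accepting an optional mapping keeps the check easy to exercise without touching the real environment.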
Run the application:
watchfiles "python -m src.app"
Test the application by navigating to the localhost URL in your browser (e.g. localhost:7860 or http://127.0.0.1:7860).
User Flow
- Enter or Generate Text: Type directly in the Text box, or optionally enter a Prompt, click "Generate text", and edit if needed.
- Synthesize Speech: Click "Synthesize speech" to generate two audio outputs.
- Listen & Compare: Play back both options (A & B) to hear the differences.
- Vote for Your Favorite: Click "Vote for option A" or "Vote for option B" to choose your favorite.
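The blind comparison in the flow above can be sketched as a random assignment of providers to the labels A and B, followed by a tally of the vote. This is an assumption-laden illustration, not the app's actual logic: the function names are hypothetical, and it assumes each comparison pairs one Hume AI output with one ElevenLabs output.

```python
import random
from collections import Counter

# Providers named in the overview; the pairing rule is an assumption.
PROVIDERS = ["Hume AI", "ElevenLabs"]


def assign_options(rng: random.Random) -> dict[str, str]:
    """Randomly map the labels "A" and "B" to the two providers."""
    shuffled = PROVIDERS[:]
    rng.shuffle(shuffled)
    return {"A": shuffled[0], "B": shuffled[1]}


def record_vote(tally: Counter, options: dict[str, str], choice: str) -> str:
    """Credit the provider behind the chosen label and return its name."""
    if choice not in options:
        raise ValueError(f"choice must be one of {sorted(options)}")
    winner = options[choice]
    tally[winner] += 1
    return winner


if __name__ == "__main__":
    tally = Counter()
    options = assign_options(random.Random())
    winner = record_vote(tally, options, "A")
    print(f"Option A was {winner}; tally so far: {dict(tally)}")
```

Shuffling before labeling keeps the listener from inferring the provider from the option's position, which is the point of a blind vote.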
License
This project is licensed under the MIT License - see the LICENSE.txt file for details.