Hume AI | Expressive TTS Arena

An interactive platform for comparing and evaluating the expressiveness of different text-to-speech models

Overview

Expressive TTS Arena is an open-source web application that enables users to compare text-to-speech outputs with a focus on expressiveness rather than just audio quality. Built with Gradio, it provides a seamless interface for generating and comparing speech synthesis from different providers, including Hume AI and ElevenLabs.

Features

Text generation using Claude 3.5 Sonnet by Anthropic for creating expressive content.
Direct text input or AI-assisted text generation.
Comparative analysis of different TTS outputs.
Simple voting mechanism for preferred outputs.

Prerequisites

Python >=3.11.11
pip >=25.0
Virtual environment capability
API keys for Hume AI, Anthropic, and ElevenLabs
For a complete list of dependencies, see requirements.txt.
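
As a quick sanity check before installing, the interpreter version can be compared against the minimum listed above. This is a minimal sketch, not part of the project: the `MIN_PYTHON` constant and the `python_is_supported` helper are hypothetical names introduced here for illustration.

```python
import sys

# Minimum version taken from the Prerequisites list above.
MIN_PYTHON = (3, 11, 11)


def python_is_supported(version=None):
    """Return True if the given (major, minor, micro) tuple meets the minimum.

    Defaults to the version of the running interpreter.
    """
    if version is None:
        version = sys.version_info[:3]
    return tuple(version) >= MIN_PYTHON


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```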

Project Structure

Expressive TTS Arena/
├── src/
│   ├── integrations/
│   │   ├── __init__.py         # Makes integrations a package; exposes API clients
│   │   ├── anthropic_api.py    # Anthropic API integration
│   │   ├── elevenlabs_api.py   # ElevenLabs API integration
│   │   └── hume_api.py         # Hume API integration
│   ├── __init__.py             # Makes src a package; exposes key functionality
│   ├── app.py                  # Entry file
│   ├── config.py               # Global config and logger setup
│   ├── constants.py            # Global constants
│   ├── theme.py                # Custom Gradio Theme
│   └── utils.py                # Utility functions
├── .env.example
├── .gitignore
├── .pre-commit-config.yaml
└── requirements.txt

Installation

This project uses the uv package manager. Install uv by following its official installation instructions for your platform.

Configure environment variables:

Create a .env file based on .env.example
Add your API keys:

HUME_API_KEY=YOUR_HUME_API_KEY
ANTHROPIC_API_KEY=YOUR_ANTHROPIC_API_KEY
ELEVENLABS_API_KEY=YOUR_ELEVENLABS_API_KEY
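
Before starting the app, it can help to confirm that all three keys are set and that none still holds its placeholder value. The sketch below is an illustration only and makes two assumptions: the variables have already been loaded into the process environment (it does not parse the .env file itself), and the `missing_api_keys` helper is a hypothetical name, not something the project provides.

```python
import os

# Variable names as listed in the .env example above.
REQUIRED_KEYS = ("HUME_API_KEY", "ANTHROPIC_API_KEY", "ELEVENLABS_API_KEY")


def missing_api_keys(env=None):
    """Return the required key names that are unset or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_KEYS:
        value = env.get(name, "").strip()
        # Treat empty values and the YOUR_... placeholders as missing.
        if not value or value.startswith("YOUR_"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_api_keys()
    if problems:
        print("Missing or placeholder keys: " + ", ".join(problems))
    else:
        print("All API keys are set.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise without touching real credentials.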

Run the application:

Standard

uv run python -m src.app

With hot-reloading

uv run watchfiles "python -m src.app" src

Test the application by navigating to the localhost URL in your browser (e.g. localhost:7860 or http://127.0.0.1:7860).
(Optional) If contributing, install the pre-commit hook for automatic file formatting:
```
uv run pre-commit install
```

User Flow

Enter or Generate Text: Type directly in the Text box, or optionally enter a Character description, click "Generate text", and edit if needed.
Synthesize Speech: Click "Synthesize speech" to generate two audio outputs.
Listen & Compare: Play back both options (A & B) to hear the differences.
Vote for Your Favorite: Click "Vote for option A" or "Vote for option B" to choose your favorite.
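
The flow above amounts to a blind A/B comparison. This README does not describe how outputs are assigned to options A and B or how votes are recorded, so the following is only a hypothetical sketch of one way to do it: `assign_options` and `record_vote` are illustrative names, and the random assignment is an assumption rather than the app's documented behavior.

```python
import random
from collections import Counter


def assign_options(providers, rng=random):
    """Randomly map two provider names to the labels 'A' and 'B'."""
    if len(providers) != 2:
        raise ValueError("expected exactly two providers")
    shuffled = list(providers)
    rng.shuffle(shuffled)  # hide which provider is behind which label
    return {"A": shuffled[0], "B": shuffled[1]}


def record_vote(tally, options, choice):
    """Credit the provider behind the chosen label and return the tally."""
    if choice not in options:
        raise ValueError(f"unknown option: {choice!r}")
    tally[options[choice]] += 1
    return tally


# Example round: the listener prefers option A.
options = assign_options(["Hume AI", "ElevenLabs"], random.Random(0))
tally = record_vote(Counter(), options, "A")
```

Shuffling per round keeps the listener from learning which provider sits behind which label, so votes reflect the audio rather than the brand.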

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.