# 🗣️ Accent Identifier
This tool identifies the **speaker's accent** from a video or audio input. It supports uploads and URLs, including **direct `.mp4` links**, **Loom videos**, and **YouTube-style links**, and uses a deep learning model from [SpeechBrain](https://speechbrain.readthedocs.io/en/latest/index.html) for inference.
## 🚀 Demo
Try it out live on [Hugging Face Spaces](https://pheire-accent-detector.hf.space) *(replace with your actual link)*.
---
## 📦 Features
* 🎥 Accepts video/audio uploads (`.mp4`, `.wav`, `.mp3`)
* 🌐 Handles URLs (e.g. Loom share links, direct `.mp4` files, YouTube)
* 🧠 Classifies the accent using a pretrained `speechbrain` model
* 📊 Returns the top prediction and the top-3 probabilities
* ⚡ Fast and easy UI built with [Gradio](https://gradio.app)
---
## 🧪 Example Inputs
* `https://www.loom.com/share/abc123`
* `https://yourdomain.com/sample.mp4`
* Uploaded audio: `voice_sample.wav`
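
One way to route inputs like the ones above is to classify each by its host or file extension before downloading anything. The sketch below uses only the standard library; `classify_source`, its return labels, and the host checks are assumptions for illustration and may differ from what `app.py` actually does.

```python
from urllib.parse import urlparse

MEDIA_EXTENSIONS = (".mp4", ".wav", ".mp3")

def classify_source(source: str) -> str:
    """Label an input as 'loom', 'youtube', 'direct', 'upload', or 'unknown'."""
    parsed = urlparse(source)
    if parsed.scheme not in ("http", "https"):
        # Not a web URL: treat it as a local upload if the extension is supported.
        return "upload" if source.lower().endswith(MEDIA_EXTENSIONS) else "unknown"
    host = (parsed.hostname or "").lower()
    if host == "loom.com" or host.endswith(".loom.com"):
        return "loom"
    if host in ("youtu.be", "youtube.com") or host.endswith(".youtube.com"):
        return "youtube"
    if parsed.path.lower().endswith(MEDIA_EXTENSIONS):
        return "direct"
    return "unknown"

for example in (
    "https://www.loom.com/share/abc123",
    "https://yourdomain.com/sample.mp4",
    "voice_sample.wav",
):
    print(example, "->", classify_source(example))
```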
---
## 🛠️ Installation
```bash
git clone https://github.com/yourusername/accent-identifier.git
cd accent-identifier
# Create virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
# Install dependencies
pip install -r requirements.txt
```
### requirements.txt
```
speechbrain
gradio
torchaudio
torch
ffmpeg-python
yt-dlp
requests
```
Make sure `ffmpeg` is installed and available on your system `PATH`.
You can check with: `ffmpeg -version`
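
The same check can be done from Python before the app starts, so a missing `ffmpeg` fails early with a clear message. This is a small sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical.

```python
import shutil
import subprocess

def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on PATH and runs."""
    path = shutil.which("ffmpeg")
    if path is None:
        return False
    try:
        result = subprocess.run(
            [path, "-version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

if ffmpeg_available():
    print("ffmpeg found")
else:
    print("ffmpeg not found: install it and make sure it is on PATH")
```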
---
## ▢️ Run Locally
```bash
python app.py
```
This will launch a Gradio interface in your browser at `http://localhost:7860`.
---
## 🧠 Model Details
* **Model**: `Jzuluaga/accent-id-commonaccent_ecapa`
* **Framework**: [SpeechBrain](https://speechbrain.readthedocs.io/)
* **Classes**: US, UK, Australia, Canada, India, etc.
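
Loading the model and classifying a file might look like the sketch below, which follows SpeechBrain's `EncoderClassifier` interface (`from_hparams` and `classify_file`). The helper names `load_classifier` and `identify_accent`, the `savedir` path, and the example file name are assumptions, and the meaning of the returned score depends on the model's output layer. SpeechBrain is imported lazily, so nothing is downloaded until `load_classifier()` is called.

```python
def load_classifier(savedir="pretrained_models/accent-id-commonaccent_ecapa"):
    """Download (on first use) and load the pretrained accent classifier."""
    try:
        # SpeechBrain 1.0 and later.
        from speechbrain.inference.classifiers import EncoderClassifier
    except ImportError:
        # Older SpeechBrain releases.
        from speechbrain.pretrained import EncoderClassifier
    return EncoderClassifier.from_hparams(
        source="Jzuluaga/accent-id-commonaccent_ecapa",
        savedir=savedir,  # assumed local cache directory
    )

def identify_accent(classifier, wav_path):
    """Return (label, score) for the most likely accent in an audio file."""
    # classify_file returns per-class outputs, the best score,
    # the index of the best class, and its decoded text label.
    out_prob, score, index, text_lab = classifier.classify_file(wav_path)
    return text_lab[0], float(score[0])

# Example usage (requires speechbrain, torch, torchaudio and network access):
#   classifier = load_classifier()
#   label, score = identify_accent(classifier, "voice_sample.wav")
#   print(label, score)
```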
---
## 📂 Project Structure
```
accent-identifier/
├── app.py # Main Gradio app
├── requirements.txt # Dependencies
└── README.md # You are here
```
---
## 🧩 Notes
* Loom support relies on their internal API. It may break if they change the endpoint.
* Audio is extracted to `.wav` using `ffmpeg` with 16kHz mono format for model compatibility.
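
The 16 kHz mono extraction can be expressed as a plain `ffmpeg` command line. The sketch below builds and runs that command with `subprocess`; the helper names are hypothetical, and the app itself may use the `ffmpeg-python` wrapper listed in `requirements.txt` rather than calling the binary directly.

```python
import subprocess

def build_extract_command(src, dst, sample_rate=16000):
    """Build an ffmpeg command that writes mono PCM WAV audio from src to dst."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input video or audio file
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to a single (mono) channel
        "-ar", str(sample_rate),  # resample (16 kHz by default)
        "-acodec", "pcm_s16le",   # 16-bit PCM, the usual WAV encoding
        str(dst),
    ]

def extract_audio(src, dst):
    """Run ffmpeg; raises subprocess.CalledProcessError if conversion fails."""
    subprocess.run(build_extract_command(src, dst), check=True, capture_output=True)
    return dst

# Show the command without running it.
print(" ".join(build_extract_command("sample.mp4", "sample.wav")))
```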
---
title: Accent Detector
emoji: 🏢
colorFrom: blue
colorTo: blue
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference