metadata

sdk: streamlit

English Accent Detection Tool

Project Overview

This tool is a working proof-of-concept designed to evaluate spoken English in candidate video submissions. It automatically extracts audio from a public video or uploaded file, identifies whether the language spoken is English, and classifies the English accent (e.g., American, British, Australian). A confidence score is also provided to aid in candidate screening.

This submission was developed as part of the REM Waste hiring challenge, with emphasis on practicality, technical clarity, and clean design.

Features

Accepts public video URLs (e.g., Loom, MP4 links) or uploaded video/audio files.
Extracts audio using ffmpeg.
Detects the spoken language using SpeechBrain's language identification model.
If English is detected, simulates classification into common English accents.
Outputs include:
- Accent classification
- Confidence score (0–100%)
- Brief summary
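
As a rough illustration of the audio-extraction feature, the sketch below calls the ffmpeg command-line tool through `subprocess`. It is an assumption-laden example rather than the project's actual code: the function names, the `audio.wav` default, and the 16 kHz mono target are all hypothetical choices, and the project itself lists the ffmpeg-python wrapper instead of direct subprocess calls.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes a mono audio file at the given sample rate."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it already exists
        "-i", src,                 # input video or audio file
        "-vn",                     # drop the video stream
        "-ac", "1",                # downmix to a single channel
        "-ar", str(sample_rate),   # resample (16 kHz is an assumed target)
        dst,
    ]


def extract_audio(src: str, dst: str = "audio.wav", sample_rate: int = 16000) -> Path:
    """Run ffmpeg and return the path of the extracted audio file."""
    subprocess.run(
        build_ffmpeg_command(src, dst, sample_rate),
        check=True,            # raise CalledProcessError if ffmpeg fails
        capture_output=True,   # keep ffmpeg's log output out of the app console
    )
    return Path(dst)
```

Keeping the command builder separate from the function that runs it makes the arguments easy to inspect or unit-test without ffmpeg installed.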

Live Demo

Deployed Streamlit app (hosted on Streamlit Cloud):

[Live App URL – Insert Link Here]

Technology Stack

Python 3
Streamlit for the web interface
SpeechBrain for spoken language identification
Torchaudio for audio preprocessing
FFmpeg for audio extraction
Requests for I/O and Matplotlib for optional output handling
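
To show how SpeechBrain might fit into this stack, here is a minimal sketch of spoken language identification. The README does not name the specific checkpoint, so `speechbrain/lang-id-voxlingua107-ecapa` is an assumption, as are the helper names and the `savedir` path.

```python
def parse_language_label(label: str) -> str:
    """Return the language code from a label such as 'en: English'."""
    return label.split(":", 1)[0].strip().lower()


def detect_language(wav_path: str) -> tuple[str, float]:
    """Return (language_code, probability) for an audio file.

    Assumes the speechbrain/lang-id-voxlingua107-ecapa checkpoint;
    the project may use a different model.
    """
    # Imported lazily so parse_language_label works without SpeechBrain installed.
    # SpeechBrain releases before 1.0 expose this class under speechbrain.pretrained.
    from speechbrain.inference.classifiers import EncoderClassifier

    classifier = EncoderClassifier.from_hparams(
        source="speechbrain/lang-id-voxlingua107-ecapa",
        savedir="pretrained_models/lang-id",  # hypothetical cache directory
    )
    signal = classifier.load_audio(wav_path)
    # classify_batch returns (log-posteriors, best log-score, best index, text labels)
    _, score, _, labels = classifier.classify_batch(signal)
    return parse_language_label(labels[0]), float(score.exp()[0])
```

In a Streamlit app, the classifier would typically be loaded once and cached rather than rebuilt on every call.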

How It Works

The user inputs a video URL or uploads a file.
The audio is extracted and resampled to a suitable format.
The system determines whether the speaker is using English.
If English is detected, the tool assigns an accent label. In this prototype the accent step is simulated rather than produced by a trained classifier (see Notes).
The result includes:
- Accent label (e.g., British)
- Confidence score
- Explanation or notes
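
The flow above can be sketched as a small function that only runs the accent step when the language check returns English. Everything here is illustrative: `ScreeningResult`, `analyze`, the injected callables, and the choice to report 0% confidence for non-English audio are assumptions, not the project's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScreeningResult:
    is_english: bool
    accent: Optional[str]   # None when the speech is not English
    confidence: float       # percentage in the range 0-100
    summary: str


def analyze(
    wav_path: str,
    detect_language: Callable[[str], tuple[str, float]],
    classify_accent: Callable[[str], tuple[str, float]],
) -> ScreeningResult:
    """Run the language check first and classify the accent only for English."""
    language, lang_prob = detect_language(wav_path)
    if language != "en":
        # Assumed behavior: no accent label and zero confidence for non-English audio.
        return ScreeningResult(
            False, None, 0.0,
            f"Detected language '{language}', so no accent was assigned.",
        )
    accent, accent_prob = classify_accent(wav_path)
    confidence = round(100 * accent_prob, 1)
    return ScreeningResult(
        True, accent, confidence,
        f"English detected ({lang_prob:.0%}); accent: {accent} ({confidence}% confidence).",
    )
```

Passing the two steps in as callables keeps the gating logic independent of any particular model, so stubs can stand in for them during testing.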

Local Setup Instructions

Clone the repository:

git clone https://github.com/yourusername/english-accent-detector.git
cd english-accent-detector

Install dependencies:
```
pip install -r requirements.txt
```
Launch the app:
```
streamlit run app.py
```

Requirements

streamlit
torch
torchaudio
speechbrain
ffmpeg-python
requests
matplotlib
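
Note that the ffmpeg-python package is only a wrapper and does not bundle the ffmpeg executable, which must be installed separately and available on PATH. A quick check along these lines (the function name is hypothetical) can surface a missing binary before the app starts:

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True when an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("ffmpeg was not found on PATH; install it before launching the app.")
```

On hosted platforms, system packages such as ffmpeg are usually declared separately from `requirements.txt` (for example in a `packages.txt` file), though the exact mechanism depends on the host.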

Notes

Accent classification is simulated based on common accent features, due to the lack of an open-source, fine-grained English accent classifier.
The core English language detection is handled by a pre-trained SpeechBrain model.
This project was developed as a rapid prototype within the recommended 4–6 hour window. It could be expanded into a production-grade system given access to more detailed accent datasets and APIs.
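
To make the idea of a simulated accent step concrete, the sketch below shows one possible placeholder. It is not the project's actual logic, which is not shown in this README: it hashes the audio bytes so that the same input always receives the same label, and the accent list, function name, and 0.60 to 0.95 probability range are all assumptions chosen for illustration.

```python
import hashlib

ACCENTS = ["American", "British", "Australian"]


def simulate_accent(audio_bytes: bytes) -> tuple[str, float]:
    """Return a placeholder (accent, probability) pair derived from the audio bytes.

    This is not a trained classifier: hashing only makes the placeholder
    deterministic, so identical input always yields the same output.
    """
    digest = hashlib.sha256(audio_bytes).digest()
    accent = ACCENTS[digest[0] % len(ACCENTS)]
    # Map the second hash byte onto an assumed probability range of 0.60 to 0.95.
    probability = 0.60 + (digest[1] / 255) * 0.35
    return accent, round(probability, 3)
```

A placeholder like this carries no information about the speaker, so its output should be presented to users as simulated until a real accent model replaces it.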

Author

Developed by Edgar Muyale
For inquiries: edgarmuyale@gmail.com
Submission for the REM Waste Hiring Challenge