English Accent Detection Tool
Project Overview
This tool is a working proof-of-concept designed to evaluate spoken English in candidate video submissions. It automatically extracts audio from a public video or uploaded file, identifies whether the language spoken is English, and classifies the English accent (e.g., American, British, Australian). A confidence score is also provided to aid in candidate screening.
This submission was developed as part of the REM Waste hiring challenge, with emphasis on practicality, technical clarity, and clean design.
Features
Accepts public video URLs (e.g., Loom, MP4 links) or uploaded video/audio files.
Extracts audio using ffmpeg.
Detects the spoken language using SpeechBrain's language identification model.
If English is detected, simulates classification into common English accents.
Outputs include:
- Accent classification
- Confidence score (0–100%)
- Brief summary
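The three outputs above could be carried in a small result object. The following is a minimal sketch, not the app's actual code: the `AccentResult` name and its fields are hypothetical, and it assumes the confidence is held internally as a fraction and shown as a percentage.

```python
from dataclasses import dataclass


@dataclass
class AccentResult:
    """Hypothetical container for the three outputs listed above."""
    accent: str        # e.g. "British"
    confidence: float  # fraction, expected in [0.0, 1.0]
    summary: str       # brief human-readable explanation

    def confidence_percent(self) -> float:
        # Clamp first so the reported score always stays within 0 to 100.
        clamped = min(max(self.confidence, 0.0), 1.0)
        return round(clamped * 100, 1)

    def as_text(self) -> str:
        # One-line rendering suitable for a Streamlit text element.
        return f"{self.accent} ({self.confidence_percent():.1f}%): {self.summary}"
```

Clamping in one place keeps the displayed score inside the documented range even if an upstream step returns an out-of-range value.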
Live Demo
Deployed Streamlit app (hosted on Streamlit Cloud):
[Live App URL – Insert Link Here]
Technology Stack
- Python 3
- Streamlit for the web interface
- SpeechBrain for spoken language identification
- Torchaudio for audio preprocessing
- FFmpeg for audio extraction
- Requests, Matplotlib for I/O and optional output handling
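As an illustration of how SpeechBrain fits into this stack, here is a hedged sketch of language identification. It assumes the public `speechbrain/lang-id-voxlingua107-ecapa` checkpoint; this README does not say which model the app loads. The function names and the cache directory are hypothetical.

```python
def is_english_label(label: str) -> bool:
    """Return True if a label such as 'en: English' (or plain 'en') denotes English."""
    code = label.split(":", 1)[0].strip().lower()
    return code == "en"


def detect_language(wav_path: str):
    """Return (label, score) for the most likely language in wav_path.

    Assumption: the VoxLingua107 ECAPA checkpoint; swap in the model
    the app actually uses.
    """
    # Imported lazily so the helper above works without SpeechBrain installed.
    # SpeechBrain releases before 1.0 expose this class under speechbrain.pretrained.
    from speechbrain.inference.classifiers import EncoderClassifier

    classifier = EncoderClassifier.from_hparams(
        source="speechbrain/lang-id-voxlingua107-ecapa",
        savedir="pretrained_models/lang-id",  # hypothetical cache directory
    )
    signal = classifier.load_audio(wav_path)
    # classify_batch returns (scores per class, best score, best index, text labels).
    _, score, _, labels = classifier.classify_batch(signal)
    # The model card describes the scores as log-likelihoods, hence exp().
    return labels[0], float(score.exp()[0])
```

A caller would run `detect_language("audio.wav")` and pass the returned label to `is_english_label` before attempting any accent step.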
How It Works
The user inputs a video URL or uploads a file.
The audio is extracted and resampled to a suitable format.
The system determines whether the speaker is using English.
If English is detected, the tool assigns an accent label using a simulated classifier (see Notes).
The result includes:
- Accent label (e.g., British)
- Confidence score
- Explanation or notes
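The extraction and resampling step can be sketched as follows. This version calls the `ffmpeg` command-line tool directly through `subprocess` rather than through the ffmpeg-python wrapper listed under Requirements, and the 16 kHz mono target is an assumption (a common input format for speech models), not a value stated here. The function names are hypothetical.

```python
import subprocess


def build_ffmpeg_command(src: str, dst_wav: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes mono 16-bit PCM WAV audio from a media file."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", src,                 # input video or audio file
        "-vn",                     # drop the video stream
        "-ac", "1",                # downmix to mono
        "-ar", str(sample_rate),   # resample (16 kHz is an assumed target)
        "-acodec", "pcm_s16le",    # 16-bit PCM
        dst_wav,
    ]


def extract_audio(src: str, dst_wav: str) -> str:
    """Run ffmpeg and return the output path; raises CalledProcessError on failure."""
    subprocess.run(build_ffmpeg_command(src, dst_wav), check=True, capture_output=True)
    return dst_wav
```

Keeping the command builder separate from the call that runs it makes the arguments easy to inspect without having ffmpeg installed.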
Local Setup Instructions
Clone the repository:
git clone https://github.com/yourusername/english-accent-detector.git
cd english-accent-detector
Install dependencies:
pip install -r requirements.txt
Launch the app:
streamlit run app.py
Requirements
streamlit
torch
torchaudio
speechbrain
ffmpeg-python
requests
matplotlib
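Note that ffmpeg-python is only a wrapper: the `ffmpeg` executable itself must also be installed and on `PATH`. A small pre-flight check along these lines may help when setting up locally; it is a hypothetical helper script, not part of the repository.

```python
import importlib.util
import shutil


def missing_modules(names):
    """Return the module names from `names` that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def ffmpeg_available() -> bool:
    """True if an `ffmpeg` executable is on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    # ffmpeg-python installs as the importable module `ffmpeg`.
    required = ["streamlit", "torch", "torchaudio", "speechbrain",
                "ffmpeg", "requests", "matplotlib"]
    print("Missing Python modules:", missing_modules(required) or "none")
    print("ffmpeg binary found:", ffmpeg_available())
```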
Notes
- Accent classification is simulated based on common accent features, due to the lack of an open-source, fine-grained English accent classifier.
- The core English language detection is handled by a pre-trained SpeechBrain model.
- This project was developed as a rapid prototype within the recommended 4–6 hour window and can be expanded into a production-grade system with access to more detailed accent datasets and APIs.
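To make the "simulated" caveat concrete, the sketch below shows one way such a placeholder might look. It is an assumption for illustration only: this README does not describe how the app's simulation works, and the function name, accent list, and confidence range are all hypothetical.

```python
import hashlib
import random

ACCENTS = ["American", "British", "Australian"]  # examples named in the overview


def simulate_accent(audio_bytes: bytes):
    """Return a placeholder (accent, confidence_percent) pair.

    This is NOT a trained classifier: the label is a pseudo-random pick seeded
    by a hash of the audio, so the same input always yields the same output.
    """
    seed = int.from_bytes(hashlib.sha256(audio_bytes).digest()[:8], "big")
    rng = random.Random(seed)
    accent = rng.choice(ACCENTS)
    confidence = round(rng.uniform(60.0, 95.0), 1)  # arbitrary illustrative range
    return accent, confidence
```

Seeding from the audio keeps results repeatable across runs, which makes the placeholder easy to replace later with a real model behind the same interface.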
Author
Developed by Edgar Muyale
For inquiries: edgarmuyale@gmail.com
Submission for REM Waste Hiring Challenge