English Accent Detection Tool

Project Overview

This tool is a working proof-of-concept designed to evaluate spoken English in candidate video submissions. It automatically extracts audio from a public video or uploaded file, identifies whether the language spoken is English, and classifies the English accent (e.g., American, British, Australian). A confidence score is also provided to aid in candidate screening.

This submission was developed as part of the REM Waste hiring challenge, with emphasis on practicality, technical clarity, and clean design.


Features

  • Accepts public video URLs (e.g., Loom, MP4 links) or uploaded video/audio files.

  • Extracts audio using ffmpeg.

  • Detects the spoken language using SpeechBrain's language identification model.

  • If English is detected, simulates classification into common English accents.

  • Outputs include:

    • Accent classification
    • Confidence score (0–100%)
    • Brief summary
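
The audio extraction step could look roughly like the sketch below. It is an illustration, not the project's actual code: the function names, the 16 kHz mono WAV target, and the use of `subprocess` (rather than the `ffmpeg-python` wrapper listed in the requirements) are all assumptions. It also assumes the source is a local file or a direct media link; page-style links such as Loom share pages may need a separate download step that is not shown here.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: str, wav_path: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes a mono WAV from a video or audio source."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", source,             # input: local path or direct media URL
        "-vn",                    # drop the video stream, keep audio only
        "-ac", "1",               # downmix to a single (mono) channel
        "-ar", str(sample_rate),  # resample; 16 kHz is an assumption, not from the README
        wav_path,
    ]


def extract_audio(source: str, wav_path: str) -> Path:
    """Run ffmpeg (must be on PATH) and return the path of the extracted WAV."""
    subprocess.run(build_ffmpeg_command(source, wav_path), check=True, capture_output=True)
    return Path(wav_path)
```

Building the command separately from running it keeps the argument list easy to inspect and test without invoking ffmpeg.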

Live Demo

Deployed Streamlit app (hosted on Streamlit Cloud):

[Live App URL – Insert Link Here]


Technology Stack

  • Python 3
  • Streamlit for the web interface
  • SpeechBrain for spoken language identification
  • Torchaudio for audio preprocessing
  • ffmpeg for audio extraction
  • Requests and Matplotlib for I/O and optional output handling
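
As a rough sketch of how SpeechBrain language identification can be called: the README does not name the exact model, so the VoxLingua107 ECAPA checkpoint used here is an assumption, as are the function names and the cache directory. Labels from that model take the form `en: English`, and per its model card the returned score is a log-likelihood.

```python
def is_english(label: str) -> bool:
    """Return True for a label such as 'en: English' or a bare 'en' code."""
    code = label.split(":", 1)[0].strip().lower()
    return code == "en"


def detect_language(wav_path: str) -> tuple[str, float]:
    """Return (label, confidence in percent) for an audio file."""
    # Imported lazily so is_english() works without SpeechBrain installed.
    # SpeechBrain releases before 1.0 expose this class under speechbrain.pretrained.
    from speechbrain.inference.classifiers import EncoderClassifier

    classifier = EncoderClassifier.from_hparams(
        source="speechbrain/lang-id-voxlingua107-ecapa",  # assumed model, not named in the README
        savedir="pretrained_models/lang-id",              # hypothetical cache directory
    )
    signal = classifier.load_audio(wav_path)
    _out_prob, score, _index, text_lab = classifier.classify_batch(signal)
    # score holds the log-likelihood of the top class; exp() maps it to 0..1.
    return text_lab[0], float(score.exp().item()) * 100.0
```

A caller would then gate the accent step on `is_english(label)`.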

How It Works

  1. The user inputs a video URL or uploads a file.

  2. The audio is extracted and resampled to a suitable format.

  3. The system determines whether the speaker is using English.

  4. If English is detected, the tool assigns an accent label. In this prototype the accent step is simulated rather than produced by a trained classifier.

  5. The result includes:

    • Accent label (e.g., British)
    • Confidence score
    • Explanation or notes
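
The control flow of the steps above can be sketched as follows. All names are hypothetical, and the language and accent steps are passed in as callables so the flow can be shown without any model. The sketch assumes the language step returns a bare code such as `"en"`, and that the reported confidence is the accent confidence when English is detected; the README does not specify either.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Each step takes a WAV path and returns (label, confidence in percent).
Step = Callable[[str], Tuple[str, float]]


@dataclass
class ScreeningResult:
    is_english: bool
    accent: Optional[str]
    confidence: float
    summary: str


def screen_audio(wav_path: str, detect_language: Step, classify_accent: Step) -> ScreeningResult:
    """Run language detection, then accent classification only for English audio."""
    language, lang_conf = detect_language(wav_path)
    if language != "en":
        # Non-English audio: skip the accent step entirely.
        summary = f"Detected language '{language}', not English; accent step skipped."
        return ScreeningResult(False, None, lang_conf, summary)
    accent, accent_conf = classify_accent(wav_path)
    summary = f"English detected ({lang_conf:.0f}%); accent: {accent} ({accent_conf:.0f}%)."
    return ScreeningResult(True, accent, accent_conf, summary)
```

For example, `screen_audio("clip.wav", lambda p: ("en", 97.0), lambda p: ("British", 82.0))` yields a result with `accent == "British"` and `confidence == 82.0`.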

Local Setup Instructions

  1. Clone the repository:

    git clone https://github.com/yourusername/english-accent-detector.git
    cd english-accent-detector
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Launch the app:

    streamlit run app.py
    

Requirements

streamlit
torch
torchaudio
speechbrain
ffmpeg-python
requests
matplotlib
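
Note that `ffmpeg-python` is only a wrapper: the `ffmpeg` executable itself must be installed and on the PATH. A small preflight check along these lines can surface that early; the helper name is hypothetical and not part of the project.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names of command-line tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools(["ffmpeg"])` before launching the app returns an empty list when ffmpeg is available.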

Notes

  • Accent classification is simulated based on common accent features, due to the lack of an open-source, fine-grained English accent classifier.
  • The core English language detection is handled by a pre-trained SpeechBrain model.
  • This project was developed as a rapid prototype within the recommended 4–6 hour window and can be expanded into a production-grade system with access to more detailed accent datasets and APIs.
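
The README does not describe how the accent simulation is implemented. Purely as an illustration of what a simulated step might look like, the sketch below derives a repeatable placeholder label and confidence from a hash of the audio bytes. This is a stand-in technique, not the author's method: it does not analyze accent features, and the accent list, confidence range, and function name are assumptions.

```python
import hashlib
import random

# Accent labels taken from the examples in the project overview.
ACCENTS = ["American", "British", "Australian"]


def simulate_accent(audio_bytes: bytes) -> tuple[str, float]:
    """Return a placeholder (accent, confidence in percent) for the given audio.

    The output is deterministic for a given input but carries no linguistic meaning.
    """
    # Seed a private RNG from the audio content so repeated runs agree.
    seed = int.from_bytes(hashlib.sha256(audio_bytes).digest()[:8], "big")
    rng = random.Random(seed)
    accent = rng.choice(ACCENTS)
    confidence = round(rng.uniform(60.0, 95.0), 1)  # arbitrary illustrative range
    return accent, confidence
```

Seeding from the content means the same upload always gets the same label, which keeps a demo consistent while making no claim of accuracy.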

Author

Developed by Edgar Muyale

For inquiries: edgarmuyale@gmail.com

Submission for the REM Waste Hiring Challenge