Update README.md
README.md
CHANGED
@@ -1,10 +1,121 @@
1 |
---
|
2 |
title: REM_WASTE_INTERVIEW
|
3 |
emoji: 🎤
|
4 |
colorFrom: indigo
|
5 |
colorTo: red
|
6 |
sdk: streamlit
|
7 |
-
sdk_version: 1.
|
8 |
app_file: app.py
|
9 |
pinned: false
|
10 |
---
|
|
|
1 |
+
|
2 |
+
|
3 |
+
# English Accent Detection Tool
|
4 |
+
|
5 |
+
## Project Overview
|
6 |
+
|
7 |
+
This tool is a working proof-of-concept for evaluating spoken English in candidate video submissions. It extracts audio from a public video URL or an uploaded file, detects whether the spoken language is English, and assigns an English accent label (e.g., American, British, Australian). The accent step is currently simulated (see Notes). A confidence score accompanies each result to aid candidate screening.
|
8 |
+
|
9 |
+
This submission was developed as part of the REM Waste hiring challenge, with emphasis on practicality, technical clarity, and clean design.
|
10 |
+
|
11 |
+
---
|
12 |
+
|
13 |
+
## Features
|
14 |
+
|
15 |
+
* Accepts public video URLs (e.g., Loom, MP4 links) or uploaded video/audio files.
|
16 |
+
* Extracts audio using `ffmpeg`.
|
17 |
+
* Detects the spoken language using `SpeechBrain`'s language identification model.
|
18 |
+
* If English is detected, simulates classification into common English accents.
|
19 |
+
* Outputs include:
|
20 |
+
|
21 |
+
* Accent classification
|
22 |
+
* Confidence score (0–100%)
|
23 |
+
* Brief summary
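
The audio-extraction feature could be sketched as below. This is a minimal illustration, not the code in `app.py`: the function names `build_ffmpeg_command` and `extract_audio`, the mono downmix, and the 16 kHz sample rate are all assumptions. Only standard `ffmpeg` flags are used (`-i`, `-vn`, `-ac`, `-ar`, `-y`).

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: str, wav_path: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes a mono WAV file from `source`.

    Hypothetical helper: the 16 kHz mono target is an assumed format.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", source,             # input: local path or direct media URL
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the requested rate
        wav_path,
    ]


def extract_audio(source: str, wav_path: str) -> Path:
    """Run ffmpeg (must be on PATH) and return the extracted audio path."""
    subprocess.run(
        build_ffmpeg_command(source, wav_path),
        check=True,            # raise CalledProcessError if ffmpeg fails
        capture_output=True,   # keep ffmpeg's log out of the app output
    )
    return Path(wav_path)
```

Keeping the command builder separate from the `subprocess` call makes the flags easy to inspect without running `ffmpeg`.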
|
24 |
+
|
25 |
+
---
|
26 |
+
|
27 |
+
## Live Demo
|
28 |
+
|
29 |
+
Deployed Streamlit app (hosted on Streamlit Cloud):
|
30 |
+
|
31 |
+
**\[Live App URL – Insert Link Here]**
|
32 |
+
|
33 |
+
---
|
34 |
+
|
35 |
+
## Technology Stack
|
36 |
+
|
37 |
+
* **Python 3**
|
38 |
+
* **Streamlit** for the web interface
|
39 |
+
* **SpeechBrain** for spoken language identification
|
40 |
+
* **Torchaudio** for audio preprocessing
|
41 |
+
* **FFmpeg** for audio extraction
|
42 |
+
* **Requests** for HTTP I/O and **Matplotlib** for optional output handling
|
43 |
+
|
44 |
+
---
|
45 |
+
|
46 |
+
## How It Works
|
47 |
+
|
48 |
+
1. The user inputs a video URL or uploads a file.
|
49 |
+
2. The audio is extracted and resampled to a suitable format.
|
50 |
+
3. The system determines whether the speaker is using English.
|
51 |
+
4. If English is detected, the tool assigns an accent label. This step is currently simulated rather than backed by a trained accent classifier (see Notes).
|
52 |
+
5. The result includes:
|
53 |
+
|
54 |
+
* Accent label (e.g., British)
|
55 |
+
* Confidence score
|
56 |
+
* Explanation or notes
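
The flow above can be sketched as a small gating function. This is an illustration under stated assumptions, not the project's implementation: `AccentResult`, `simulate_accent`, and `analyze` are hypothetical names, the language detector is passed in as a plain callable (standing in for the SpeechBrain model), and the simulated accent step is replaced here by a random pick, since the README does not say how the simulation works.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

ACCENTS = ("American", "British", "Australian")


@dataclass
class AccentResult:
    is_english: bool
    accent: Optional[str]
    confidence: float  # percent, 0-100
    summary: str


def simulate_accent(rng: random.Random) -> Tuple[str, float]:
    """Assumed stand-in for the simulated step: random label and score."""
    return rng.choice(ACCENTS), round(rng.uniform(60.0, 95.0), 1)


def analyze(
    wav_path: str,
    detect_language: Callable[[str], Tuple[str, float]],
    rng: Optional[random.Random] = None,
) -> AccentResult:
    """Gate the accent step on the language check (steps 3 to 5)."""
    # detect_language is assumed to return (language code, score in 0..1).
    language, score = detect_language(wav_path)
    if language != "en":
        return AccentResult(
            is_english=False,
            accent=None,
            confidence=round(score * 100, 1),
            summary=f"Detected language '{language}'; accent step skipped.",
        )
    accent, confidence = simulate_accent(rng or random.Random())
    return AccentResult(
        is_english=True,
        accent=accent,
        confidence=confidence,
        summary=f"English detected; simulated accent label: {accent}.",
    )
```

Passing the detector in as a callable keeps the sketch runnable without downloading a model; a real detector would need to be adapted to the same `(code, score)` return shape.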
|
57 |
+
|
58 |
+
---
|
59 |
+
|
60 |
+
## Local Setup Instructions
|
61 |
+
|
62 |
+
1. Clone the repository:
|
63 |
+
|
64 |
+
```bash
git clone https://github.com/yourusername/english-accent-detector.git
cd english-accent-detector
```
|
68 |
+
|
69 |
+
2. Install dependencies:
|
70 |
+
|
71 |
+
```bash
pip install -r requirements.txt
```
|
74 |
+
|
75 |
+
3. Launch the app:
|
76 |
+
|
77 |
+
```bash
streamlit run app.py
```
|
80 |
+
|
81 |
+
---
|
82 |
+
|
83 |
+
## Requirements
|
84 |
+
|
85 |
+
```
streamlit
torch
torchaudio
speechbrain
ffmpeg-python
requests
matplotlib
```
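
Note that `ffmpeg-python` is only a Python wrapper around the `ffmpeg` command-line tool, so the binary itself must be installed separately and be on `PATH`. A startup check along these lines could surface a missing binary early; the helper name `ffmpeg_available` is hypothetical.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if not ffmpeg_available():
        print("ffmpeg was not found on PATH; audio extraction will fail.")
```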
|
94 |
+
|
95 |
+
---
|
96 |
+
|
97 |
+
## Notes
|
98 |
+
|
99 |
+
* Accent classification is simulated based on common accent features, due to the lack of an open-source, fine-grained English accent classifier.
|
100 |
+
* The core English language detection is handled by a pre-trained SpeechBrain model.
|
101 |
+
* This project was developed as a rapid prototype within the recommended 4–6 hour window. It could be expanded into a production-grade system given access to more detailed accent datasets and APIs.
|
102 |
+
|
103 |
+
---
|
104 |
+
|
105 |
+
## Author
|
106 |
+
|
107 |
+
Developed by Edgar Muyale
|
108 |
+
For inquiries: edgarmuyale@gmail.com
|
109 |
+
Submission for REM Waste Hiring Challenge
|
110 |
+
|
111 |
+
|
112 |
---
|
113 |
title: REM_WASTE_INTERVIEW
|
114 |
emoji: 🎤
|
115 |
colorFrom: indigo
|
116 |
colorTo: red
|
117 |
sdk: streamlit
|
118 |
+
sdk_version: 1.45.1
|
119 |
app_file: app.py
|
120 |
pinned: false
|
121 |
---
|