EdgarDataScientist committed on
Commit d074961 · verified · 1 Parent(s): 8cdbd03

Update README.md

Files changed (1)
  1. README.md +112 -1
README.md CHANGED
@@ -1,10 +1,121 @@
  ---
  title: REM_WASTE_INTERVIEW
  emoji: 🎤
  colorFrom: indigo
  colorTo: red
  sdk: streamlit
- sdk_version: 1.31.1
  app_file: app.py
  pinned: false
  ---
+
+
+ # English Accent Detection Tool
+
+ ## Project Overview
+
+ This tool is a working proof of concept for evaluating spoken English in candidate video submissions. It extracts audio from a public video URL or an uploaded file, checks whether the spoken language is English, and, if so, assigns an English accent label (e.g., American, British, Australian) with a confidence score to aid candidate screening. In this prototype the accent step is simulated; see Notes.
+
+ This submission was developed as part of the REM Waste hiring challenge, with emphasis on practicality, technical clarity, and clean design.
+
+ ---
+
+ ## Features
+
+ * Accepts public video URLs (e.g., Loom, MP4 links) or uploaded video/audio files.
+ * Extracts audio using `ffmpeg`.
+ * Detects the spoken language using `SpeechBrain`'s language identification model.
+ * If English is detected, simulates classification into common English accents.
+ * Outputs include:
+
+   * Accent classification
+   * Confidence score (0–100%)
+   * Brief summary
+
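The audio extraction feature listed above could be sketched as follows. This is a minimal illustration, not the code in `app.py`: the function name `build_ffmpeg_command`, the file names, and the 16 kHz mono target are assumptions. Only standard `ffmpeg` options (`-y`, `-i`, `-vn`, `-ac`, `-ar`) are used.

```python
from __future__ import annotations


def build_ffmpeg_command(source: str, wav_path: str, sample_rate: int = 16000) -> list[str]:
    """Return an ffmpeg argument list that extracts mono WAV audio from `source`.

    Hypothetical helper: the name and the 16 kHz default are assumptions.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", source,             # input video or audio (local path or URL)
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the requested rate
        wav_path,
    ]


if __name__ == "__main__":
    cmd = build_ffmpeg_command("interview.mp4", "interview.wav")
    print(" ".join(cmd))
    # To actually run it (requires the ffmpeg binary on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the command easy to inspect and test without invoking `ffmpeg`.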
+ ---
+
+ ## Live Demo
+
+ Deployed Streamlit app (hosted on Streamlit Cloud):
+
+ **\[Live App URL – Insert Link Here]**
+
+ ---
+
+ ## Technology Stack
+
+ * **Python 3**
+ * **Streamlit** for the web interface
+ * **SpeechBrain** for spoken language identification
+ * **Torchaudio** for audio preprocessing
+ * **FFmpeg** for audio extraction
+ * **Requests, Matplotlib** for I/O and optional output handling
+
+ ---
+
+ ## How It Works
+
+ 1. The user inputs a video URL or uploads a file.
+ 2. The audio is extracted and resampled to the format the language identification model expects.
+ 3. The system determines whether the speaker is using English.
+ 4. If English is detected, the tool assigns an accent label (simulated in this prototype; see Notes).
+ 5. The result includes:
+
+    * Accent label (e.g., British)
+    * Confidence score
+    * Explanation or notes
+
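The control flow of steps 3 to 5 can be sketched as below. This is an illustrative outline under stated assumptions, not the actual implementation: `ScreeningResult`, `screen_audio`, and the two injected callables are hypothetical names, and the `"en"` language code is assumed. The callables stand in for the SpeechBrain language check and the simulated accent step, so the sketch runs without any model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class ScreeningResult:
    is_english: bool
    accent: Optional[str]        # None when the speech is not English
    confidence: Optional[float]  # percentage in the range 0-100
    notes: str


def screen_audio(
    wav_path: str,
    detect_language: Callable[[str], str],
    classify_accent: Callable[[str], Tuple[str, float]],
) -> ScreeningResult:
    """Check the language first; label the accent only for English speech."""
    language = detect_language(wav_path)
    if language != "en":
        return ScreeningResult(False, None, None, f"Detected language: {language}")
    accent, confidence = classify_accent(wav_path)
    confidence = max(0.0, min(100.0, confidence))  # clamp to 0-100
    return ScreeningResult(True, accent, confidence, f"English speech, {accent} accent")


if __name__ == "__main__":
    # Stub callables stand in for the real models.
    result = screen_audio("sample.wav", lambda p: "en", lambda p: ("British", 87.5))
    print(result)
```

Passing the two models in as callables keeps the gating logic testable on its own and makes it straightforward to swap the simulated accent step for a trained classifier later.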
+ ---
+
+ ## Local Setup Instructions
+
+ 1. Clone the repository:
+
+ ```bash
+ git clone https://github.com/yourusername/english-accent-detector.git
+ cd english-accent-detector
+ ```
+
+ 2. Install dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Launch the app:
+
+ ```bash
+ streamlit run app.py
+ ```
+
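The `ffmpeg-python` package is a Python wrapper and does not ship the `ffmpeg` executable, so the binary has to be installed separately. A quick pre-flight check might look like the sketch below; the helper name `ffmpeg_available` is hypothetical.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found: install it with your system package manager")
```

Running this before `streamlit run app.py` gives a clear message early if audio extraction would otherwise fail.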
+ ---
+
+ ## Requirements
+
+ ```
+ streamlit
+ torch
+ torchaudio
+ speechbrain
+ ffmpeg-python
+ requests
+ matplotlib
+ ```
+
+ ---
+
+ ## Notes
+
+ * Accent classification is simulated based on common accent features, due to the lack of an open-source, fine-grained English accent classifier.
+ * The core English language detection is handled by a pre-trained SpeechBrain model.
+ * This project was developed as a rapid prototype within the recommended 4–6 hour window and can be expanded into a production-grade system with access to more detailed accent datasets and APIs.
+
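To make "simulated" concrete, here is one way a placeholder accent step could behave. This is not the author's method, which the README does not describe. It is a hypothetical stand-in that returns a stable label and confidence per input by seeding a random generator from a hash of the audio bytes. The function name, the accent list, and the 60-95 confidence range are all assumptions.

```python
from __future__ import annotations

import hashlib
import random

# Example accents named in the README; the real label set is not specified.
ACCENTS = ["American", "British", "Australian"]


def simulate_accent(audio_bytes: bytes) -> tuple[str, float]:
    """Return a placeholder (accent, confidence %) pair that is stable per input."""
    # Derive a deterministic seed from the audio content.
    seed = int.from_bytes(hashlib.sha256(audio_bytes).digest()[:8], "big")
    rng = random.Random(seed)
    accent = rng.choice(ACCENTS)
    confidence = round(rng.uniform(60.0, 95.0), 1)  # arbitrary placeholder range
    return accent, confidence


if __name__ == "__main__":
    print(simulate_accent(b"example audio bytes"))
```

A stand-in like this carries no information about the speaker, so its output should not be used for actual screening decisions.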
+ ---
+
+ ## Author
+
+ Developed by Edgar Muyale
+ For inquiries: edgarmuyale@gmail.com
+ Submission for REM Waste Hiring Challenge
+
+
  ---
  title: REM_WASTE_INTERVIEW
  emoji: 🎤
  colorFrom: indigo
  colorTo: red
  sdk: streamlit
+ sdk_version: 1.45.1
  app_file: app.py
  pinned: false
  ---