---
title: Accent Analyzer Agent
emoji: π’
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: Detection of various English accents
license: mit
---
# Accent Analyzer

This Streamlit-based web application analyzes the English accent of speech in videos. Users can provide a public video URL (MP4), receive a transcription generated with Whisper base, and ask follow-up questions about the transcript, answered by Gemma3:1b.
## What It Does

- Accepts a public **MP4 video URL**
- Extracts the audio and transcribes it using **OpenAI Whisper base**
- Detects the accent with the **Jzuluaga/accent-id-commonaccent_xlsr-en-english** model
- Lets users ask **follow-up questions** about the transcript using **Gemma3**
- Deploys easily on **Hugging Face Spaces** with CPU
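The transcription and accent-detection steps above can be sketched as follows. This is an illustrative outline, not the app's actual source: the function names and the `format_report` helper are hypothetical, while the model identifiers come from this README. The heavy models are loaded lazily inside the functions so nothing downloads at import time.

```python
def transcribe(audio_path: str) -> str:
    """Speech-to-text with OpenAI Whisper (base)."""
    import whisper  # provided by the openai-whisper package
    model = whisper.load_model("base")
    return model.transcribe(audio_path)["text"]

def classify_accent(audio_path: str):
    """English accent classification via SpeechBrain's foreign_class loader."""
    from speechbrain.pretrained.interfaces import foreign_class
    clf = foreign_class(
        source="Jzuluaga/accent-id-commonaccent_xlsr-en-english",
        pymodule_file="custom_interface.py",
        classname="CustomEncoderWav2vec2Classifier",
    )
    out_prob, score, index, label = clf.classify_file(audio_path)
    if isinstance(label, list):  # SpeechBrain may return the label in a list
        label = label[0]
    return str(label), float(score)

def format_report(accent: str, confidence: float) -> str:
    """Pure helper (hypothetical): render the result for the UI."""
    return f"Detected accent: {accent} ({confidence:.0%} confidence)"
```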
---
## Tech Stack

- **Streamlit**: UI framework
- **OpenAI Whisper (base)**: Speech-to-text transcription
- **Jzuluaga/accent-id-commonaccent_xlsr-en-english**: English accent classification
- **Gemma3:1b via Ollama**: Answers follow-up questions using context from the transcript
- **Docker**: Containerization for deployment
- **Hugging Face Spaces**: Hosting on CPU
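A follow-up question could be sent to the local Ollama server roughly like this. This is a hedged sketch, not the app's code: `build_prompt` and `ask_followup` are illustrative names, while `/api/generate` on port 11434 is Ollama's standard HTTP endpoint.

```python
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_prompt(transcript: str, question: str) -> str:
    """Pure helper (hypothetical): ground the question in the transcript."""
    return (
        "You are given a video transcript.\n\n"
        f"Transcript:\n{transcript}\n\n"
        f"Question: {question}\nAnswer:"
    )

def ask_followup(transcript: str, question: str) -> str:
    """POST a non-streaming generate request to the local Ollama server."""
    import requests  # imported lazily so the module loads without requests
    payload = {
        "model": "gemma3:1b",
        "prompt": build_prompt(transcript, question),
        "stream": False,
    }
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()["response"]
```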
---
## Project Structure

```
accent-analyzer/
├── Dockerfile               # Container setup
├── start.sh                 # Serves Ollama and launches the app
├── README.md                # Instructions for the app
├── requirements.txt         # Python dependencies
├── streamlit_app.py         # Main UI app
└── src/
    ├── custome_interface.py # SpeechBrain custom interface
    ├── tools/
    │   └── accent_tool.py   # Audio analysis tool
    └── app/
        └── main_agent.py    # Analysis + LLM agents
```
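The `start.sh` in the tree above boots both services. The repo's actual script may differ; a minimal sketch, assuming Ollama and Streamlit are installed in the image, could look like:

```shell
#!/usr/bin/env bash
# Illustrative start.sh: run Ollama in the background, pull the model,
# then serve the Streamlit UI on the port declared in the Space config.
set -e
ollama serve &
sleep 5                      # give the Ollama server time to come up
ollama pull gemma3:1b
streamlit run streamlit_app.py --server.port 8501 --server.address 0.0.0.0
```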
---
## Running Locally (GPU Required)

1. Clone the repo:
```bash
git clone https://huggingface.co/spaces/ash-171/accent-detection
cd accent-detection
```
2. Build the Docker image:
```bash
docker build -t accent-analyzer .
```
3. Run the container:
```bash
docker run --gpus all -p 8501:8501 accent-analyzer
```
4. Alternatively, run `streamlit run streamlit_app.py` to launch the app without Docker.
5. Visit: [http://localhost:8501](http://localhost:8501)

---
## Requirements

`requirements.txt` should include at least:

```
streamlit>=1.25.0
requests==2.31.0
pydub==0.25.1
torch==1.11.0
torchaudio==0.11.0
speechbrain==0.5.12
transformers==4.29.2
ffmpeg-python==0.2.0
openai-whisper==20230314
numpy==1.22.4
langchain>=0.1.0
langchain-community>=0.0.30
torchvision==0.12.0
langgraph>=0.0.20
```

---
## Notes

- Gemma3:1b is served by **Ollama** inside the Docker container; make sure the model is pulled during the image build.
- `custome_interface.py` is required by the accent model; it is downloaded automatically in the Dockerfile.
- Video URLs must be **direct links** to `.mp4` files.
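Since only direct `.mp4` links are accepted, a cheap pre-check can reject obvious non-direct URLs before any download is attempted. The helper below is hypothetical (not from the app's source); the definitive check would still be the `Content-Type` of the HTTP response.

```python
from urllib.parse import urlparse

def looks_like_direct_mp4(url: str) -> bool:
    """Heuristic (hypothetical helper): True if the URL is HTTP(S)
    and its path ends in .mp4, i.e. it looks like a direct file link."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and parsed.path.lower().endswith(".mp4")
```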
---
## Example Prompt

```
Analyze this video: https://www.learningcontainer.com/wp-content/uploads/2020/05/sample-mp4-file.mp4
```

Then follow up with:

```
Where is the speaker probably from?
What is the tone or emotion?
Summarize the video.
```

---
## Acknowledgments

This project uses the following models, frameworks, and tools:

- [OpenAI Whisper](https://github.com/openai/whisper): Automatic speech recognition model.
- [SpeechBrain](https://speechbrain.readthedocs.io/): Toolkit for building and fine-tuning speech processing models.
- [Accent-ID CommonAccent](https://huggingface.co/Jzuluaga/accent-id-commonaccent_xlsr-en-english): Fine-tuned wav2vec2 model hosted on Hugging Face for English accent classification.
- [CustomEncoderWav2vec2Classifier](https://huggingface.co/Jzuluaga/accent-id-commonaccent_xlsr-en-english/blob/main/custom_interface.py): Custom interface used to load and run the accent model.
- [Gemma3:1b](https://ollama.com/library/gemma3:1b) via [Ollama](https://ollama.com): Large language model used to answer follow-up questions about transcripts.
- [Streamlit](https://streamlit.io): Python framework for building web applications.
- [Hugging Face Spaces](https://huggingface.co/spaces): Platform used to deploy this application.

---
## Note

Because no GPU is available on the hosted Space, the app runs very slowly there. The output has been tested and verified on a local system.

---
## Author

- Developed by [Aswathi T S](https://github.com/ash-171)

---

## License

This project is licensed under the `MIT License`.