Spaces:
Sleeping
Sleeping
File size: 1,720 Bytes
63025d4 08f13f2 35dd02e 63025d4 d994d22 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 |
---
license: mit
title: Chatbot for Video Question Answering
sdk: gradio
emoji: π
pinned: false
short_description: A chatbot that can answer questions about a video.
python_version: 3.12.7
sdk_version: 5.35.0
---
# Chatbot for Video Question Answering Demo
AI chatbot that can answer questions about video content. This project leverages multi-modal LLM, multi-modal RAG pipeline to process video frames, transcribe audio, and retrieval information to provide accurate answers to questions about video content.
## Requirements
- Python 3.12+
- [uv](https://docs.astral.sh/uv/) for package and project manager
- [FFmpeg](https://ffmpeg.org/) installed and available in PATH
- [Google Gemini API key](https://aistudio.google.com/apikey) for the LLM functionality
## Installation
1. Clone this repository
```bash
git clone [repository-url]
cd VideoChatbot
```
2. Install dependencies using uv
```bash
uv sync
```
3. Create a `.env` file in the project root with your API key
```
GEMINI_API_KEY=your_api_key_here
```
## Usage
1. Start the application
```bash
python -m app.main
```
2. Access the UI through your browser (typically at http://127.0.0.1:7860)
3. Upload a video file or provide a YouTube URL and ask questions about it
4. The system will process the video (extract frames, transcribe audio), index the content, and then answer your questions
## Notes
This project is designed to be a demo and may require additional configuration for production use. The video processing and indexing can take time depending on the video length and complexity. Use a larger LLMs, embeddings, transcription models, and vector databases for better performance and accuracy. |