license: mit
title: Chatbot for Video Question Answering
sdk: gradio
emoji: π
pinned: false
short_description: A chatbot that can answer questions about a video.
python_version: 3.12.7
sdk_version: 5.35.0
Chatbot for Video Question Answering Demo
An AI chatbot that answers questions about video content. The project uses a multimodal LLM and a multimodal RAG pipeline to process video frames, transcribe audio, and retrieve the relevant information needed to answer questions about the video accurately.
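To illustrate the retrieval step of a RAG pipeline like the one described above, here is a minimal sketch in plain Python. It assumes that frames and transcript segments have already been turned into embedding vectors; the labels, the vectors, and the `top_k` helper are hypothetical and are not taken from this project's code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, indexed, k=3):
    """Return the labels of the k indexed items most similar to the query vector.

    `indexed` is a list of (label, vector) pairs, for example one entry per
    extracted frame or transcript segment.
    """
    ranked = sorted(indexed, key=lambda item: cosine(query, item[1]), reverse=True)
    return [label for label, _ in ranked[:k]]


# Hypothetical toy index: two-dimensional vectors stand in for real embeddings.
toy_index = [
    ("frame@00:05", [1.0, 0.0]),
    ("transcript@00:07", [0.9, 0.1]),
    ("frame@01:30", [0.0, 1.0]),
]
```

In a real pipeline the retrieved items would be passed to the LLM as context for the question; a vector database would normally replace the linear scan shown here.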
Requirements
- Python 3.12+
- uv for package and project management
- FFmpeg installed and available in PATH
- Google Gemini API key for the LLM functionality
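The requirements above can be checked before installing with a short script. This is a sketch, not part of the project: the `missing_requirements` helper is hypothetical, and it only verifies the Python version and that `uv` and `ffmpeg` are discoverable in PATH. The API key is not checked here.

```python
import shutil
import sys


def missing_requirements(which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    version = sys.version_info if version is None else version
    problems = []
    # The README asks for Python 3.12 or newer.
    if tuple(version[:2]) < (3, 12):
        problems.append("Python 3.12+ is required")
    # Both tools must be resolvable through PATH.
    for tool in ("uv", "ffmpeg"):
        if which(tool) is None:
            problems.append(f"{tool} was not found in PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print("missing:", problem)
```

The `which` and `version` parameters exist only so the check can be exercised without touching the real environment.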
Installation
Clone this repository
git clone [repository-url]
cd VideoChatbot
Install dependencies using uv
uv sync
Create a .env file in the project root with your API key
GEMINI_API_KEY=your_api_key_here
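For reference, a .env file of this shape can be read with a few lines of standard-library Python. How the application itself loads the file is not stated here, so treat this as an illustrative sketch; `parse_env_text` is a hypothetical helper, not a function from this project.

```python
def parse_env_text(text):
    """Parse KEY=value lines into a dict, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

A typical use would be `parse_env_text(Path(".env").read_text())` with `Path` imported from `pathlib`, followed by a check that the `GEMINI_API_KEY` entry is present and non-empty.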
Usage
Start the application
python -m app.main
Access the UI through your browser (typically at http://127.0.0.1:7860)
Upload a video file or provide a YouTube URL and ask questions about it
The system will process the video (extract frames, transcribe audio), index the content, and then answer your questions
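The frame extraction and audio preparation steps mentioned above can be sketched with FFmpeg driven from Python. The sampling rate of one frame per second, the output file names, and the 16 kHz mono audio format are assumptions chosen for illustration; the project's actual settings may differ.

```python
import subprocess
from pathlib import Path


def frame_extraction_cmd(video, out_dir, fps=1.0):
    """Build an ffmpeg command that samples `fps` frames per second as JPEG files."""
    pattern = str(Path(out_dir) / "frame_%05d.jpg")
    return ["ffmpeg", "-y", "-i", str(video), "-vf", f"fps={fps}", pattern]


def audio_extraction_cmd(video, wav_path, sample_rate=16000):
    """Build an ffmpeg command that drops the video stream and writes mono audio."""
    return [
        "ffmpeg", "-y", "-i", str(video),
        "-vn",                    # no video
        "-ac", "1",               # one audio channel
        "-ar", str(sample_rate),  # resample to the given rate
        str(wav_path),
    ]


def run(cmd):
    """Run a command and raise CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(cmd, check=True)
```

For example, `run(frame_extraction_cmd("talk.mp4", "frames"))` would write the sampled frames into an existing `frames` directory, and the resulting audio file could then be passed to a transcription model.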
Notes
This project is a demo and may require additional configuration for production use. Video processing and indexing can take time, depending on the video's length and complexity. For better performance and accuracy, use larger LLMs, embedding models, and transcription models, along with a more capable vector database.