---
title: MCP Video Analysis with Llama 3
emoji: 🎥
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
license: mit
short_description: AI-powered video analysis with Llama 3 and Modal
---

# 🎥 MCP Video Analysis with Llama 3

This application provides comprehensive video analysis, using the Model Context Protocol (MCP) to integrate multiple AI technologies.

## 🔧 Technology Stack

- **Modal Backend**: Scalable cloud compute for video processing
- **Whisper**: Speech-to-text transcription
- **Computer Vision Models**: Object detection, action recognition, and captioning
- **Meta Llama 3**: Advanced AI for intelligent content analysis, hosted on Modal
- **MCP**: Model Context Protocol for seamless integration

## 🎯 Features

- **Transcription**: Extract spoken content from videos
- **Visual Analysis**: Identify objects, actions, and scenes
- **Content Understanding**: AI-powered insights and summaries
- **Custom Queries**: Ask specific questions about video content

## 🚀 Usage

1. Enter a video URL (YouTube or direct link)
2. Optionally ask a specific question
3. Click "Analyze Video" to get comprehensive insights
4. Review both Llama 3's intelligent analysis and the raw data

## 🔒 Environment Variables Required

- `MODAL_LLAMA3_ENDPOINT_URL`: The URL for the deployed Llama 3 Modal service.
- `MODAL_VIDEO_ANALYSIS_ENDPOINT_URL`: The URL for the video processing Modal service (optional; has a default value).

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
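
As a minimal sketch of how `app.py` might read these variables (the helper name, the error message, and the placeholder default URL are illustrative assumptions, not the Space's actual implementation):

```python
import os

# Placeholder default for the optional video-analysis endpoint;
# the real Space ships its own default URL.
DEFAULT_VIDEO_ENDPOINT = "https://example.modal.run/video-analysis"


def load_endpoints(env=None):
    """Read the Modal endpoint URLs from the environment.

    MODAL_LLAMA3_ENDPOINT_URL is required; MODAL_VIDEO_ANALYSIS_ENDPOINT_URL
    is optional and falls back to a default.
    """
    env = os.environ if env is None else env
    llama3_url = env.get("MODAL_LLAMA3_ENDPOINT_URL")
    if not llama3_url:
        raise RuntimeError("MODAL_LLAMA3_ENDPOINT_URL must be set")
    video_url = env.get("MODAL_VIDEO_ANALYSIS_ENDPOINT_URL", DEFAULT_VIDEO_ENDPOINT)
    return {"llama3": llama3_url, "video": video_url}
```

Passing a dict instead of reading `os.environ` directly keeps the helper easy to test without mutating process state.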