---
title: CineGen AI
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: AI-powered cinematic pre-production
---

# CineGen AI Ultra+ 🎬✨

*Visionary Cinematic Pre-Production Powered by AI*

CineGen AI Ultra+ is a Streamlit web application designed to accelerate the creative pre-production process for films, animations, and other visual storytelling projects. It leverages the power of Large Language Models (Google's Gemini) and other AI tools to transform a simple story idea into a rich cinematic treatment, complete with scene breakdowns, visual style suggestions, AI-generated concept art/video clips, and a narrated animatic.

## Features

- **AI Creative Director:** Input a core story idea, genre, and mood.
- **Cinematic Treatment Generation:**
  - Gemini generates a detailed multi-scene treatment.
  - Each scene includes:
    - Title, Emotional Beat, Setting Description
    - Characters Involved, Character Focus Moment
    - Key Plot Beat, Suggested Dialogue Hook
    - Proactive Director's Suggestions (*gamdok*, Korean for "director"): Visual Style, Camera Work, Sound Design
    - Asset Generation Aids: Suggested Asset Type (Image/Video Clip), Video Motion Description & Duration, Image/Video Keywords, Pexels Search Queries
- **Visual Asset Generation:**
  - **Image Generation:** Uses DALL-E 3 (via the OpenAI API) to generate concept art for scenes from derived prompts.
  - **Stock Footage Fallback:** Uses the Pexels API for relevant stock images/videos if AI generation is disabled or fails.
  - **Video Clip Generation (Placeholder):** Integrated structure for text-to-video/image-to-video generation via the RunwayML API (you must implement the actual API calls in `core/visual_engine.py`). The placeholder generates dummy video clips.
- **Character Definition:** Define key characters with visual descriptions for more consistent AI-generated visuals.
- **Global Style Overrides:** Apply global stylistic keywords (e.g., "Hyper-Realistic Gritty Noir," "Vintage Analog Sci-Fi") to influence visual generation.
- **AI-Powered Narration:**
  - Gemini crafts a narration script based on the generated treatment.
  - The ElevenLabs API synthesizes the narration into natural-sounding audio.
  - Customizable voice ID and narration style.
- **Iterative Refinement:**
  - Edit scene treatments and regenerate them with AI assistance.
  - Refine DALL-E prompts based on feedback and regenerate visuals.
- **Cinematic Animatic Assembly:**
  - Combines generated visual assets (images/video clips) and the synthesized narration into a downloadable `.mp4` animatic.
  - Customizable per-scene duration for pacing control.
  - Ken Burns effect on still images, plus text overlays for scene context.
- **Secrets Management:** Securely loads API keys from Streamlit secrets or environment variables (a minimal loading sketch follows this list).
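
For illustration, such a lookup might be implemented like this (a minimal sketch; `get_api_key` is a hypothetical helper, not necessarily the app's actual function):

```python
import os

import streamlit as st

def get_api_key(name: str):
    """Return a key from st.secrets if present, else from the environment."""
    # st.secrets is dict-like, but raises if no secrets.toml exists, so guard it.
    try:
        if name in st.secrets:
            return st.secrets[name]
    except Exception:
        pass
    return os.getenv(name)

GEMINI_API_KEY = get_api_key("GEMINI_API_KEY")
```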

## Project Structure

```
CineGenAI/
├── .streamlit/
│   └── secrets.toml          # API keys and configuration (DO NOT COMMIT if public)
├── assets/
│   └── fonts/
│       └── arial.ttf         # Example font file (ensure it's available or update the path)
├── core/
│   ├── __init__.py
│   ├── gemini_handler.py     # Manages interactions with the Gemini API
│   ├── visual_engine.py      # Image/video generation (DALL-E, Pexels, RunwayML placeholder) and video assembly
│   └── prompt_engineering.py # Functions that craft detailed prompts for Gemini
├── temp_cinegen_media/       # Temporary directory for generated media (add to .gitignore)
├── app.py                    # Main Streamlit application script
├── Dockerfile                # For containerizing the application
├── Dockerfile.test           # (Optional) For testing
├── requirements.txt          # Python dependencies
├── README.md                 # This file
└── .gitattributes            # For Git LFS if handling large font files
```

## Setup and Installation

1. **Clone the repository:**

   ```bash
   git clone <repository_url>
   cd CineGenAI
   ```

2. **Create a virtual environment (recommended):**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

   **Note:** MoviePy may require `ffmpeg` to be installed on your system. On Debian/Ubuntu: `sudo apt-get install ffmpeg`. On macOS with Homebrew: `brew install ffmpeg`.
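
   To confirm MoviePy can find it, check that `ffmpeg` is on your `PATH`:

   ```bash
   ffmpeg -version
   ```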

4. **Set up API keys:** You need API keys for the following services:

   - Google Gemini API
   - OpenAI API (for DALL-E)
   - ElevenLabs API (and optionally a specific Voice ID)
   - Pexels API
   - RunwayML API (if implementing full video generation)

   Store these keys securely. You have two primary options:

   - **Streamlit secrets (recommended for Hugging Face Spaces / Streamlit Cloud):** Create a file `.streamlit/secrets.toml` (make sure this file is in your `.gitignore` if your repository is public!) with the following format:

     ```toml
     GEMINI_API_KEY = "your_gemini_api_key"
     OPENAI_API_KEY = "your_openai_api_key"
     ELEVENLABS_API_KEY = "your_elevenlabs_api_key"
     PEXELS_API_KEY = "your_pexels_api_key"
     ELEVENLABS_VOICE_ID = "your_elevenlabs_voice_id" # e.g., "Rachel" or a custom ID
     RUNWAY_API_KEY = "your_runwayml_api_key"
     ```
   - **Environment variables (for local development):** Export the variables in your terminal, or put them in a `.env` file loaded with a library such as `python-dotenv` (not included by default). The application falls back to these when Streamlit secrets are not found.

     ```bash
     export GEMINI_API_KEY="your_gemini_api_key"
     export OPENAI_API_KEY="your_openai_api_key"
     # ... and so on for the other keys
     ```
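
     If you take the `.env` route, a minimal loading sketch with `python-dotenv` looks like this (a hypothetical snippet, since the library is not in `requirements.txt`):

     ```python
     import os

     from dotenv import load_dotenv  # pip install python-dotenv

     load_dotenv()  # reads key=value pairs from a local .env file
     gemini_key = os.getenv("GEMINI_API_KEY")
     ```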
      
5. **Font:** Ensure the font file specified in `core/visual_engine.py` (e.g., `arial.ttf`) is accessible. The script tries common system paths, but you can place the file in `assets/fonts/` and adjust the path in `visual_engine.py` if needed. If using Docker, ensure the font is copied into the image (see `Dockerfile`).
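
   For illustration, a Pillow font-fallback strategy along these lines is common (`load_font` and the candidate paths are hypothetical, not the project's exact code):

   ```python
   from PIL import ImageFont

   # Candidate paths, most specific first; adjust to match your system.
   CANDIDATE_FONTS = [
       "assets/fonts/arial.ttf",
       "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",  # common Linux path
   ]

   def load_font(size: int = 24):
       for path in CANDIDATE_FONTS:
           try:
               return ImageFont.truetype(path, size)
           except OSError:
               continue
       return ImageFont.load_default()  # last-resort built-in bitmap font
   ```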

6. **RunwayML implementation (important):** The current RunwayML integration in `core/visual_engine.py` (method `_generate_video_clip_with_runwayml`) is a placeholder. You will need to:

   - Install the official RunwayML SDK, if available.
   - Implement the actual API calls to RunwayML for text-to-video or image-to-video generation.

   Until then, the placeholder creates a dummy video clip using MoviePy, as sketched below.
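
   For reference, such a fallback can be produced with MoviePy in a few lines (a sketch only; the real placeholder's name and signature may differ):

   ```python
   from moviepy.editor import ColorClip

   def generate_placeholder_clip(out_path: str, duration: float = 4.0):
       """Write a dark, silent stand-in clip so the assembly pipeline can run."""
       clip = ColorClip(size=(1280, 720), color=(16, 16, 16), duration=duration)
       clip.write_videofile(out_path, fps=24, codec="libx264", audio=False)
   ```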

## Running the Application

1. **Local development:**

   ```bash
   streamlit run app.py
   ```

   The application should open in your web browser.

2. **Using Docker (optional):**

   - Build the Docker image:

     ```bash
     docker build -t cinegen-ai .
     ```

   - Run the container, passing API keys as environment variables (or via mounted secrets) rather than baking them into the image:

     ```bash
     # Add further -e flags for ELEVENLABS_API_KEY, PEXELS_API_KEY, etc.
     docker run -p 8501:8501 \
       -e GEMINI_API_KEY="your_key" \
       -e OPENAI_API_KEY="your_key" \
       cinegen-ai
     ```

     Access the app at http://localhost:8501.
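
     Alternatively, keep the keys in a local `.env` file and pass them all at once with Docker's `--env-file` flag:

     ```bash
     docker run -p 8501:8501 --env-file .env cinegen-ai
     ```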

## How to Use

1. **Input Creative Seed:** Provide your core story idea, and select a genre, mood, number of scenes, and AI director style in the sidebar.
2. **Generate Treatment:** Click "🌌 Generate Cinematic Treatment". The AI produces a multi-scene breakdown.
3. **Review & Refine:**
   - Examine each scene's details, including AI-generated visuals (or placeholders).
   - Use the "✏️ Edit Scene X Treatment" and "🎨 Edit Scene X Visual Prompt" popovers to provide feedback and regenerate specific parts of the treatment or visuals.
   - Adjust the per-scene "Dominant Shot Type" and "Scene Duration" for the animatic.
4. **Fine-Tuning (Sidebar):**
   - Define characters with visual descriptions.
   - Apply global style overrides.
   - Set the narrator voice ID and narration style.
5. **Assemble Animatic:** Once you're satisfied with the treatment, visuals, and narration script (generated automatically), click "🎬 Assemble Narrated Cinematic Animatic".
6. **View & Download:** The generated animatic video appears in the app, ready to download.

## Key Technologies

- **Python**
- **Streamlit:** Web application framework.
- **Google Gemini API:** Core text generation (treatment, narration script, prompt refinement).
- **OpenAI API (DALL-E 3):** AI image generation.
- **ElevenLabs API:** Text-to-speech narration.
- **Pexels API:** Stock image/video fallbacks.
- **RunwayML API (placeholder):** AI video clip generation.
- **MoviePy:** Video processing and animatic assembly.
- **Pillow (PIL):** Image manipulation.
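
As an illustration of the MoviePy-based assembly, a Ken Burns-style push-in on a still image can be written in a few lines (a sketch under assumed parameters, not the project's actual implementation in `core/visual_engine.py`):

```python
from moviepy.editor import ImageClip

def ken_burns(image_path: str, duration: float = 5.0, zoom_rate: float = 0.04):
    """Return a clip that slowly zooms into a still image."""
    clip = ImageClip(image_path, duration=duration)
    # Scale the frame up over time: 1.0 at t=0, growing linearly with t.
    return clip.resize(lambda t: 1 + zoom_rate * t)
```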

## Future Enhancements / To-Do

- Implement full, robust RunwayML API integration.
- Option to upload custom seed images for image-to-video generation.
- More sophisticated control over the Ken Burns effect (pan direction, zoom intensity).
- Allow users to upload their own audio for narration or background music.
- Advanced shot-list generation and export.
- Integration with other AI video/image models.
- User accounts and project saving.
- More granular error handling and user feedback in the UI.
- Refine JSON cleaning of Gemini output to be even more robust.
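
On the last point, one common hardening step is to strip the Markdown code fences that LLMs often wrap around JSON before parsing (a sketch; the project's actual cleaning logic may differ):

```python
import json
import re

def parse_llm_json(raw: str):
    """Strip optional Markdown code fences, then parse as JSON."""
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    return json.loads(cleaned)
```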

## Contributing

Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request.

## License

This project is currently under [Specify License Here - e.g., MIT, Apache 2.0, or state as private/proprietary].

