---
title: CineGen AI
emoji: πŸš€
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: AI-powered cinematic pre-production
---
# CineGen AI Ultra+ 🎬✨
**Visionary Cinematic Pre-Production Powered by AI**
CineGen AI Ultra+ is a Streamlit web application designed to accelerate the creative pre-production process for films, animations, and other visual storytelling projects. It leverages the power of Large Language Models (Google's Gemini) and other AI tools to transform a simple story idea into a rich cinematic treatment, complete with scene breakdowns, visual style suggestions, AI-generated concept art/video clips, and a narrated animatic.
## Features
* **AI Creative Director:** Input a core story idea, genre, and mood.
* **Cinematic Treatment Generation:**
* Gemini generates a detailed multi-scene treatment.
* Each scene includes:
* Title, Emotional Beat, Setting Description
* Characters Involved, Character Focus Moment
* Key Plot Beat, Suggested Dialogue Hook
* **Proactive Director's Suggestions:** Visual Style, Camera Work, Sound Design.
* **Asset Generation Aids:** Suggested Asset Type (Image/Video Clip), Video Motion Description & Duration, Image/Video Keywords, Pexels Search Queries.
* **Visual Asset Generation:**
* **Image Generation:** Utilizes DALL-E 3 (via OpenAI API) to generate concept art for scenes based on derived prompts.
* **Stock Footage Fallback:** Uses Pexels API for relevant stock images/videos if AI generation is disabled or fails.
* **Video Clip Generation (Placeholder):** Integrated structure for text-to-video/image-to-video generation using RunwayML API (requires user to implement actual API calls in `core/visual_engine.py`). Placeholder generates dummy video clips.
* **Character Definition:** Define key characters with visual descriptions for more consistent AI-generated visuals.
* **Global Style Overrides:** Apply global stylistic keywords (e.g., "Hyper-Realistic Gritty Noir," "Vintage Analog Sci-Fi") to influence visual generation.
* **AI-Powered Narration:**
* Gemini crafts a narration script based on the generated treatment.
* ElevenLabs API synthesizes the narration into natural-sounding audio.
* Customizable voice ID and narration style.
* **Iterative Refinement:**
* Edit scene treatments and regenerate them with AI assistance.
* Refine DALL-E prompts based on feedback and regenerate visuals.
* **Cinematic Animatic Assembly:**
* Combines generated visual assets (images/video clips) and the synthesized narration into a downloadable `.mp4` animatic.
* Customizable per-scene duration for pacing control.
* Ken Burns effect on still images and text overlays for scene context.
* **Secrets Management:** Securely loads API keys from Streamlit secrets or environment variables.
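Regarding the Ken Burns effect on still images mentioned in the feature list above, a minimal sketch of a slow zoom with MoviePy looks like this (the function name and parameters are illustrative, not the exact code in `core/visual_engine.py`):

```python
from moviepy.editor import ImageClip  # moviepy 1.x API

def ken_burns_zoom(image_path: str, duration: float = 4.0, zoom_rate: float = 0.04) -> ImageClip:
    """Apply a slow zoom-in to a still image (a simple Ken Burns effect)."""
    clip = ImageClip(image_path).set_duration(duration)
    # resize with a time-dependent scale factor produces a gradual zoom
    return clip.resize(lambda t: 1 + zoom_rate * t)
```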
## Project Structure
```
CineGenAI/
├── .streamlit/
│   └── secrets.toml          # API keys and configuration (DO NOT COMMIT if public)
├── assets/
│   └── fonts/
│       └── arial.ttf         # Example font file (ensure it's available or update path)
├── core/
│   ├── __init__.py
│   ├── gemini_handler.py     # Manages interactions with the Gemini API
│   ├── visual_engine.py      # Image/video generation (DALL-E, Pexels, RunwayML placeholder) and video assembly
│   └── prompt_engineering.py # Crafts detailed prompts for Gemini
├── temp_cinegen_media/       # Temporary directory for generated media (add to .gitignore)
├── app.py                    # Main Streamlit application script
├── Dockerfile                # For containerizing the application
├── Dockerfile.test           # (Optional) For testing
├── requirements.txt          # Python dependencies
├── README.md                 # This file
└── .gitattributes            # For Git LFS if handling large font files
```
## Setup and Installation
1. **Clone the Repository:**
```bash
git clone <repository_url>
cd CineGenAI
```
2. **Create a Virtual Environment (Recommended):**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. **Install Dependencies:**
```bash
pip install -r requirements.txt
```
*Note: `MoviePy` might require `ffmpeg` to be installed on your system. On Debian/Ubuntu: `sudo apt-get install ffmpeg`. On macOS with Homebrew: `brew install ffmpeg`.*
4. **Set Up API Keys:**
You need API keys for the following services:
* Google Gemini API
* OpenAI API (for DALL-E)
* ElevenLabs API (and optionally a specific Voice ID)
* Pexels API
* RunwayML API (if implementing full video generation)
Store these keys securely. You have two primary options:
* **Streamlit Secrets (Recommended for Hugging Face Spaces / Streamlit Cloud):**
Create a file `.streamlit/secrets.toml` (make sure this file is in your `.gitignore` if your repository is public!) with the following format:
```toml
GEMINI_API_KEY = "your_gemini_api_key"
OPENAI_API_KEY = "your_openai_api_key"
ELEVENLABS_API_KEY = "your_elevenlabs_api_key"
PEXELS_API_KEY = "your_pexels_api_key"
ELEVENLABS_VOICE_ID = "your_elevenlabs_voice_id" # e.g., "Rachel" or a custom ID
RUNWAY_API_KEY = "your_runwayml_api_key"
```
* **Environment Variables (for local development):**
Set the variables directly in your shell, or use a `.env` file with a loader such as `python-dotenv` (not included in `requirements.txt` by default). The application falls back to environment variables when Streamlit secrets are not found.
```bash
export GEMINI_API_KEY="your_gemini_api_key"
export OPENAI_API_KEY="your_openai_api_key"
# ... and so on for other keys
```
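For reference, the lookup order described above (Streamlit secrets first, then environment variables) can be implemented with a small helper along these lines (a sketch, not necessarily the exact code in `app.py`):

```python
import os
import streamlit as st

def get_api_key(name: str) -> str | None:
    """Read a key from Streamlit secrets, falling back to environment variables."""
    try:
        if name in st.secrets:
            return st.secrets[name]
    except Exception:
        pass  # no secrets.toml available (e.g., plain local run)
    return os.getenv(name)

GEMINI_API_KEY = get_api_key("GEMINI_API_KEY")
```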
5. **Font:**
Ensure the font file specified in `core/visual_engine.py` (e.g., `arial.ttf`) is accessible. The script tries common system paths, but you can place it in `assets/fonts/` and adjust the path in `visual_engine.py` if needed. If using Docker, ensure the font is copied into the image (see `Dockerfile`).
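A typical fallback chain for font loading looks something like the following (paths are examples; match them to what `visual_engine.py` actually uses):

```python
from PIL import ImageFont

def load_font(size: int = 24):
    # Try the bundled font first, then a common system location
    candidates = [
        "assets/fonts/arial.ttf",
        "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",  # common on Debian/Ubuntu
    ]
    for path in candidates:
        try:
            return ImageFont.truetype(path, size)
        except OSError:
            continue
    return ImageFont.load_default()  # last resort: Pillow's built-in bitmap font
```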
6. **RunwayML Implementation (Important):**
The current integration for RunwayML in `core/visual_engine.py` (method `_generate_video_clip_with_runwayml`) is a **placeholder**. You will need to:
* Install the official RunwayML SDK if available.
* Implement the actual API calls to RunwayML for text-to-video or image-to-video generation.
* The placeholder currently creates a dummy video clip using MoviePy.
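For orientation, the placeholder's behavior is roughly equivalent to the sketch below (names and defaults are illustrative; the method of the same name in `core/visual_engine.py` may differ). A real implementation would replace the body with RunwayML API calls:

```python
from moviepy.editor import ColorClip

def _generate_video_clip_with_runwayml(prompt: str, duration: int = 4,
                                       size=(1280, 720), out_path="clip.mp4") -> str:
    # Placeholder: emits a dummy solid-color clip instead of calling RunwayML.
    clip = ColorClip(size=size, color=(20, 20, 30), duration=duration)
    clip.write_videofile(out_path, fps=24, codec="libx264", audio=False, logger=None)
    return out_path
```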
## Running the Application
1. **Local Development:**
```bash
streamlit run app.py
```
The application should open in your web browser.
2. **Using Docker (Optional):**
* Build the Docker image:
```bash
docker build -t cinegen-ai .
```
* Run the container, passing your API keys as environment variables (or mount a secrets file; don't bake keys into the image):
```bash
docker run -p 8501:8501 \
  -e GEMINI_API_KEY="your_key" \
  -e OPENAI_API_KEY="your_key" \
  -e ELEVENLABS_API_KEY="your_key" \
  -e PEXELS_API_KEY="your_key" \
  cinegen-ai   # add further -e flags (e.g., RUNWAY_API_KEY) as needed
```
Access the app at `http://localhost:8501`.
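For reference, a minimal `Dockerfile` for this setup might look roughly like the following (the repository's actual `Dockerfile` may differ; note the system-level `ffmpeg` install required by MoviePy and the font copy mentioned in step 5):

```dockerfile
FROM python:3.11-slim

# ffmpeg is required by MoviePy for video assembly
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the app, including assets/fonts/, so text overlays render in the container
COPY . .

EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```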
## How to Use
1. **Input Creative Seed:** Provide your core story idea, select a genre, mood, number of scenes, and AI director style in the sidebar.
2. **Generate Treatment:** Click "🌌 Generate Cinematic Treatment". The AI will produce a multi-scene breakdown.
3. **Review & Refine:**
* Examine each scene's details, including AI-generated visuals (or placeholders).
* Use the "✏️ Edit Scene X Treatment" and "🎨 Edit Scene X Visual Prompt" popovers to provide feedback and regenerate specific parts of the treatment or visuals.
* Adjust per-scene "Dominant Shot Type" and "Scene Duration" for the animatic.
4. **Fine-Tuning (Sidebar):**
* Define characters with visual descriptions.
* Apply global style overrides.
* Set narrator voice ID and narration style.
5. **Assemble Animatic:** Once you're satisfied with the treatment, visuals, and narration script (generated automatically), click "🎬 Assemble Narrated Cinematic Animatic".
6. **View & Download:** The generated animatic video will appear, and you can download it.
## Key Technologies
* **Python**
* **Streamlit:** Web application framework.
* **Google Gemini API:** For core text generation (treatment, narration script, prompt refinement).
* **OpenAI API (DALL-E 3):** For AI image generation.
* **ElevenLabs API:** For text-to-speech narration.
* **Pexels API:** For stock image/video fallbacks.
* **RunwayML API (Placeholder):** For AI video clip generation.
* **MoviePy:** For video processing and animatic assembly.
* **Pillow (PIL):** For image manipulation.
## Future Enhancements / To-Do
* Implement full, robust RunwayML API integration.
* Option to upload custom seed images for image-to-video generation.
* More sophisticated control over Ken Burns effect (pan direction, zoom intensity).
* Allow users to upload their own audio for narration or background music.
* Advanced shot list generation and export.
* Integration with other AI video/image models.
* User accounts and project saving.
* More granular error handling and user feedback in the UI.
* Refine JSON cleaning from Gemini to be even more robust.
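On the last point, a common starting approach for cleaning Gemini's JSON output is stripping the markdown code fences the model sometimes wraps around it (a sketch; the app's current cleaning logic may differ):

```python
import json
import re

def clean_gemini_json(raw: str) -> dict:
    """Strip markdown code fences Gemini sometimes wraps around JSON before parsing."""
    text = raw.strip()
    text = re.sub(r"^```(?:json)?\s*", "", text)
    text = re.sub(r"\s*```$", "", text)
    return json.loads(text)
```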
## Contributing
Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request.
## License
This project is currently under [Specify License Here - e.g., MIT, Apache 2.0, or state as private/proprietary].
---
*This README provides a comprehensive overview. Ensure all paths, commands, and API key instructions match your specific project setup.*