---
title: CineGen AI
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: AI-powered cinematic pre-production
---
# CineGen AI Ultra+ 🎬✨
**Visionary Cinematic Pre-Production Powered by AI**
CineGen AI Ultra+ is a Streamlit web application designed to accelerate the creative pre-production process for films, animations, and other visual storytelling projects. It leverages the power of Large Language Models (Google's Gemini) and other AI tools to transform a simple story idea into a rich cinematic treatment, complete with scene breakdowns, visual style suggestions, AI-generated concept art/video clips, and a narrated animatic.
## Features
* **AI Creative Director:** Input a core story idea, genre, and mood.
* **Cinematic Treatment Generation:**
* Gemini generates a detailed multi-scene treatment.
* Each scene includes:
* Title, Emotional Beat, Setting Description
* Characters Involved, Character Focus Moment
* Key Plot Beat, Suggested Dialogue Hook
* **Proactive Director's Suggestions (감독 - Gamdok/Director):** Visual Style, Camera Work, Sound Design.
* **Asset Generation Aids:** Suggested Asset Type (Image/Video Clip), Video Motion Description & Duration, Image/Video Keywords, Pexels Search Queries.
* **Visual Asset Generation:**
* **Image Generation:** Utilizes DALL-E 3 (via OpenAI API) to generate concept art for scenes based on derived prompts.
* **Stock Footage Fallback:** Uses Pexels API for relevant stock images/videos if AI generation is disabled or fails.
* **Video Clip Generation (Placeholder):** Integrated structure for text-to-video/image-to-video generation using RunwayML API (requires user to implement actual API calls in `core/visual_engine.py`). Placeholder generates dummy video clips.
* **Character Definition:** Define key characters with visual descriptions for more consistent AI-generated visuals.
* **Global Style Overrides:** Apply global stylistic keywords (e.g., "Hyper-Realistic Gritty Noir," "Vintage Analog Sci-Fi") to influence visual generation.
* **AI-Powered Narration:**
* Gemini crafts a narration script based on the generated treatment.
* ElevenLabs API synthesizes the narration into natural-sounding audio.
* Customizable voice ID and narration style.
* **Iterative Refinement:**
* Edit scene treatments and regenerate them with AI assistance.
* Refine DALL-E prompts based on feedback and regenerate visuals.
* **Cinematic Animatic Assembly:**
* Combines generated visual assets (images/video clips) and the synthesized narration into a downloadable `.mp4` animatic.
* Customizable per-scene duration for pacing control.
* Ken Burns effect on still images and text overlays for scene context.
* **Secrets Management:** Securely loads API keys from Streamlit secrets or environment variables.
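The secrets fallback amounts to only a few lines; below is a minimal sketch of the pattern (the `get_secret` helper name is illustrative, not necessarily the app's actual code):
```python
import os
import streamlit as st

def get_secret(name, default=None):
    """Return a key from Streamlit secrets, falling back to environment variables."""
    try:
        return st.secrets[name]  # raises if secrets.toml or the key is missing
    except Exception:
        return os.environ.get(name, default)

GEMINI_API_KEY = get_secret("GEMINI_API_KEY")
```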
## Project Structure
```
CineGenAI/
├── .streamlit/
│   └── secrets.toml           # API keys and configuration (DO NOT COMMIT if public)
├── assets/
│   └── fonts/
│       └── arial.ttf          # Example font file (ensure it's available or update path)
├── core/
│   ├── __init__.py
│   ├── gemini_handler.py      # Manages interactions with the Gemini API
│   ├── visual_engine.py       # Handles image/video generation (DALL-E, Pexels, RunwayML placeholder) and video assembly
│   └── prompt_engineering.py  # Contains functions to craft detailed prompts for Gemini
├── temp_cinegen_media/        # Temporary directory for generated media (add to .gitignore)
├── app.py                     # Main Streamlit application script
├── Dockerfile                 # For containerizing the application
├── Dockerfile.test            # (Optional) For testing
├── requirements.txt           # Python dependencies
├── README.md                  # This file
└── .gitattributes             # For Git LFS if handling large font files
```
## Setup and Installation
1. **Clone the Repository:**
```bash
git clone <repository_url>
cd CineGenAI
```
2. **Create a Virtual Environment (Recommended):**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. **Install Dependencies:**
```bash
pip install -r requirements.txt
```
*Note: `MoviePy` might require `ffmpeg` to be installed on your system. On Debian/Ubuntu: `sudo apt-get install ffmpeg`. On macOS with Homebrew: `brew install ffmpeg`.*
4. **Set Up API Keys:**
You need API keys for the following services:
* Google Gemini API
* OpenAI API (for DALL-E)
* ElevenLabs API (and optionally a specific Voice ID)
* Pexels API
* RunwayML API (if implementing full video generation)
Store these keys securely. You have two primary options:
* **Streamlit Secrets (Recommended for Hugging Face Spaces / Streamlit Cloud):**
Create a file `.streamlit/secrets.toml` (make sure this file is in your `.gitignore` if your repository is public!) with the following format:
```toml
GEMINI_API_KEY = "your_gemini_api_key"
OPENAI_API_KEY = "your_openai_api_key"
ELEVENLABS_API_KEY = "your_elevenlabs_api_key"
PEXELS_API_KEY = "your_pexels_api_key"
ELEVENLABS_VOICE_ID = "your_elevenlabs_voice_id" # e.g., "Rachel" or a custom ID
RUNWAY_API_KEY = "your_runwayml_api_key"
```
* **Environment Variables (for local development):**
Set the environment variables directly in your terminal, or place them in a `.env` file and load it with a library such as `python-dotenv` (not included in `requirements.txt` by default). The application falls back to these if Streamlit secrets are not found.
```bash
export GEMINI_API_KEY="your_gemini_api_key"
export OPENAI_API_KEY="your_openai_api_key"
# ... and so on for other keys
```
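If you use a `.env` file, loading it takes two lines; here is a minimal sketch (assumes `python-dotenv` has been installed separately, since it is not in `requirements.txt`):
```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads KEY=value pairs from a local .env file into os.environ
gemini_key = os.environ.get("GEMINI_API_KEY")
```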
5. **Font:**
Ensure the font file specified in `core/visual_engine.py` (e.g., `arial.ttf`) is accessible. The script tries common system paths, but you can place it in `assets/fonts/` and adjust the path in `visual_engine.py` if needed. If using Docker, ensure the font is copied into the image (see `Dockerfile`).
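The fallback logic is roughly as sketched below (the candidate paths and helper name are illustrative, not the project's exact code):
```python
from PIL import ImageFont

FONT_CANDIDATES = [
    "assets/fonts/arial.ttf",                           # bundled with the repo
    "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",  # common on Debian/Ubuntu
    "/Library/Fonts/Arial.ttf",                         # common on macOS
]

def load_font(size=40):
    """Try each candidate path in turn; fall back to PIL's built-in bitmap font."""
    for path in FONT_CANDIDATES:
        try:
            return ImageFont.truetype(path, size)
        except OSError:
            continue
    return ImageFont.load_default()
```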
6. **RunwayML Implementation (Important):**
The current integration for RunwayML in `core/visual_engine.py` (method `_generate_video_clip_with_runwayml`) is a **placeholder**. You will need to:
* Install the official RunwayML SDK if available.
* Implement the actual API calls to RunwayML for text-to-video or image-to-video generation.
* The placeholder currently creates a dummy video clip using MoviePy.
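For reference, a dummy clip of that kind can be produced with MoviePy in a few lines (resolution, color, and output path here are illustrative):
```python
from moviepy.editor import ColorClip

def make_placeholder_clip(path="temp_cinegen_media/placeholder.mp4", duration=4):
    """Write a flat dark frame as a stand-in for a real generated video clip."""
    clip = ColorClip(size=(1280, 720), color=(15, 15, 20), duration=duration)
    clip.write_videofile(path, fps=24, logger=None)
    return path
```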
## Running the Application
1. **Local Development:**
```bash
streamlit run app.py
```
The application should open in your web browser.
2. **Using Docker (Optional):**
* Build the Docker image:
```bash
docker build -t cinegen-ai .
```
* Run the Docker container, passing API keys as environment variables (or mount a secrets file; avoid baking keys into the image):
```bash
docker run -p 8501:8501 \
-e GEMINI_API_KEY="your_key" \
-e OPENAI_API_KEY="your_key" \
# ... other env vars ...
cinegen-ai
```
Access the app at `http://localhost:8501`.
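Alternatively, if your keys live in a local `.env` file, Docker can inject them all at once: `docker run -p 8501:8501 --env-file .env cinegen-ai`.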
## How to Use
1. **Input Creative Seed:** Provide your core story idea, select a genre, mood, number of scenes, and AI director style in the sidebar.
2. **Generate Treatment:** Click "🚀 Generate Cinematic Treatment". The AI will produce a multi-scene breakdown.
3. **Review & Refine:**
* Examine each scene's details, including AI-generated visuals (or placeholders).
* Use the "✏️ Edit Scene X Treatment" and "🎨 Edit Scene X Visual Prompt" popovers to provide feedback and regenerate specific parts of the treatment or visuals.
* Adjust per-scene "Dominant Shot Type" and "Scene Duration" for the animatic.
4. **Fine-Tuning (Sidebar):**
* Define characters with visual descriptions.
* Apply global style overrides.
* Set narrator voice ID and narration style.
5. **Assemble Animatic:** Once you're satisfied with the treatment, visuals, and narration script (generated automatically), click "🎬 Assemble Narrated Cinematic Animatic".
6. **View & Download:** The generated animatic video will appear, and you can download it.
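The Ken Burns effect applied to still images during assembly is, at its core, a slow programmatic zoom. A minimal MoviePy sketch of the idea (file name and zoom rate are illustrative, not the project's actual parameters):
```python
from moviepy.editor import ImageClip

def ken_burns(image_path, duration=5, zoom_per_second=0.04):
    """Slow push-in on a still image: scale the frame up gradually over the clip."""
    clip = ImageClip(image_path).set_duration(duration)
    return clip.resize(lambda t: 1 + zoom_per_second * t)

# Usage: ken_burns("temp_cinegen_media/scene_01.png").write_videofile("scene_01.mp4", fps=24)
```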
## Key Technologies
* **Python**
* **Streamlit:** Web application framework.
* **Google Gemini API:** For core text generation (treatment, narration script, prompt refinement).
* **OpenAI API (DALL-E 3):** For AI image generation.
* **ElevenLabs API:** For text-to-speech narration.
* **Pexels API:** For stock image/video fallbacks.
* **RunwayML API (Placeholder):** For AI video clip generation.
* **MoviePy:** For video processing and animatic assembly.
* **Pillow (PIL):** For image manipulation.
## Future Enhancements / To-Do
* Implement full, robust RunwayML API integration.
* Option to upload custom seed images for image-to-video generation.
* More sophisticated control over Ken Burns effect (pan direction, zoom intensity).
* Allow users to upload their own audio for narration or background music.
* Advanced shot list generation and export.
* Integration with other AI video/image models.
* User accounts and project saving.
* More granular error handling and user feedback in the UI.
* Refine JSON cleaning from Gemini to be even more robust.
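On the last point, the usual failure mode is Gemini wrapping its JSON in markdown fences or surrounding commentary; a minimal sketch of a cleaning step (the helper name is illustrative):
```python
import json

def clean_gemini_json(raw):
    """Extract and parse the first JSON object in raw LLM output,
    tolerating markdown fences and surrounding chatter."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in model output")
    return json.loads(raw[start:end + 1])
```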
## Contributing
Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request.
## License
This project is currently under [Specify License Here - e.g., MIT, Apache 2.0, or state as private/proprietary].
---
*This README provides a comprehensive overview. Ensure all paths, commands, and API key instructions match your specific project setup.*