|
--- |
|
title: CineeeeAi |
|
emoji: 🎬
|
colorFrom: red |
|
colorTo: red |
|
sdk: docker |
|
app_port: 8501 |
|
tags: |
|
- streamlit |
|
pinned: false |
|
short_description: AI cinematic pre-production with Gemini & DALL-E
|
--- |
|
|
|
# CineGen AI Ultra+ 🎬✨
|
|
|
**Visionary Cinematic Pre-Production Powered by AI** |
|
|
|
CineGen AI Ultra+ is a Streamlit web application that accelerates creative pre-production for films, animations, and other visual storytelling projects. It uses Large Language Models (Google's Gemini) together with other AI services to turn a simple story idea into a rich cinematic treatment, complete with scene breakdowns, visual style suggestions, AI-generated concept art and video clips, and a narrated animatic.
|
|
|
## Features |
|
|
|
* **AI Creative Director:** Input a core story idea, genre, and mood. |
|
* **Cinematic Treatment Generation:** |
|
* Gemini generates a detailed multi-scene treatment. |
|
* Each scene includes: |
|
* Title, Emotional Beat, Setting Description |
|
* Characters Involved, Character Focus Moment |
|
* Key Plot Beat, Suggested Dialogue Hook |
|
* **Proactive Director's Suggestions (감독 - Gamdok/Director):** Visual Style, Camera Work, Sound Design.
|
* **Asset Generation Aids:** Suggested Asset Type (Image/Video Clip), Video Motion Description & Duration, Image/Video Keywords, Pexels Search Queries. |
|
* **Visual Asset Generation:** |
|
* **Image Generation:** Utilizes DALL-E 3 (via OpenAI API) to generate concept art for scenes based on derived prompts. |
|
* **Stock Footage Fallback:** Uses Pexels API for relevant stock images/videos if AI generation is disabled or fails. |
|
* **Video Clip Generation (Placeholder):** Integrated structure for text-to-video/image-to-video generation using the RunwayML API (you must implement the actual API calls in `core/visual_engine.py`); the placeholder generates dummy video clips.
|
* **Character Definition:** Define key characters with visual descriptions for more consistent AI-generated visuals. |
|
* **Global Style Overrides:** Apply global stylistic keywords (e.g., "Hyper-Realistic Gritty Noir," "Vintage Analog Sci-Fi") to influence visual generation. |
|
* **AI-Powered Narration:** |
|
* Gemini crafts a narration script based on the generated treatment. |
|
* ElevenLabs API synthesizes the narration into natural-sounding audio. |
|
* Customizable voice ID and narration style. |
|
* **Iterative Refinement:** |
|
* Edit scene treatments and regenerate them with AI assistance. |
|
* Refine DALL-E prompts based on feedback and regenerate visuals. |
|
* **Cinematic Animatic Assembly:** |
|
* Combines generated visual assets (images/video clips) and the synthesized narration into a downloadable `.mp4` animatic. |
|
* Customizable per-scene duration for pacing control. |
|
* Ken Burns effect on still images and text overlays for scene context (see the sketch after this list).
|
* **Secrets Management:** Securely loads API keys from Streamlit secrets or environment variables. |
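
To illustrate the Ken Burns point above: the effect is essentially a slow push-in applied frame by frame. Below is a minimal MoviePy sketch of the idea; the function name and parameters are illustrative, not taken from `core/visual_engine.py`.

```python
from moviepy.editor import CompositeVideoClip, ImageClip

def ken_burns(image_path: str, duration: float = 4.0, zoom: float = 0.04) -> CompositeVideoClip:
    """Turn a still image into a clip with a slow push-in (Ken Burns effect)."""
    still = ImageClip(image_path).set_duration(duration)
    # Scale the frame up slightly over time; compositing at the original size
    # keeps the output dimensions fixed while the image appears to zoom in.
    zoomed = still.resize(lambda t: 1 + zoom * t).set_position("center")
    return CompositeVideoClip([zoomed], size=still.size)
```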
|
|
|
## Project Structure |
|
|
```text
CineGenAI/
├── .streamlit/
│   └── secrets.toml            # API keys and configuration (DO NOT COMMIT if public)
├── assets/
│   └── fonts/
│       └── arial.ttf           # Example font file (ensure it's available or update path)
├── core/
│   ├── __init__.py
│   ├── gemini_handler.py       # Manages interactions with the Gemini API
│   ├── visual_engine.py        # Image/video generation (DALL-E, Pexels, RunwayML placeholder) and video assembly
│   └── prompt_engineering.py   # Functions to craft detailed prompts for Gemini
├── temp_cinegen_media/         # Temporary directory for generated media (add to .gitignore)
├── app.py                      # Main Streamlit application script
├── Dockerfile                  # For containerizing the application
├── Dockerfile.test             # (Optional) For testing
├── requirements.txt            # Python dependencies
├── README.md                   # This file
└── .gitattributes              # For Git LFS if handling large font files
```
|
## Setup and Installation |
|
|
|
1. **Clone the Repository:** |
|
```bash |
|
git clone <repository_url> |
|
cd CineGenAI |
|
``` |
|
|
|
2. **Create a Virtual Environment (Recommended):** |
|
```bash |
|
python -m venv venv |
|
source venv/bin/activate # On Windows: venv\Scripts\activate |
|
``` |
|
|
|
3. **Install Dependencies:** |
|
```bash |
|
pip install -r requirements.txt |
|
``` |
|
*Note: `MoviePy` might require `ffmpeg` to be installed on your system. On Debian/Ubuntu: `sudo apt-get install ffmpeg`. On macOS with Homebrew: `brew install ffmpeg`.* |
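
The authoritative dependency list is `requirements.txt`; as a rough orientation, the stack described above implies packages along these lines (an illustrative subset, not a substitute for the actual file):

```text
streamlit
google-generativeai   # Gemini
openai                # DALL-E 3
elevenlabs            # narration TTS
moviepy               # animatic assembly (needs ffmpeg)
Pillow                # image manipulation
requests              # Pexels REST calls
```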
|
|
|
4. **Set Up API Keys:** |
|
You need API keys for the following services: |
|
* Google Gemini API |
|
* OpenAI API (for DALL-E) |
|
* ElevenLabs API (and optionally a specific Voice ID) |
|
* Pexels API |
|
* RunwayML API (if implementing full video generation) |
|
|
|
Store these keys securely. You have two primary options: |
|
|
|
* **Streamlit Secrets (Recommended for Hugging Face Spaces / Streamlit Cloud):** |
|
Create a file `.streamlit/secrets.toml` (make sure this file is in your `.gitignore` if your repository is public!) with the following format: |
|
```toml |
|
GEMINI_API_KEY = "your_gemini_api_key" |
|
OPENAI_API_KEY = "your_openai_api_key" |
|
ELEVENLABS_API_KEY = "your_elevenlabs_api_key" |
|
PEXELS_API_KEY = "your_pexels_api_key" |
|
ELEVENLABS_VOICE_ID = "your_elevenlabs_voice_id" # e.g., "Rachel" or a custom ID |
|
RUNWAY_API_KEY = "your_runwayml_api_key" |
|
``` |
|
|
|
* **Environment Variables (for local development):** |
|
Set the environment variables directly in your terminal, or put them in a `.env` file and load it with a library such as `python-dotenv` (not included by default). The application falls back to environment variables when Streamlit secrets are not found.
|
```bash |
|
export GEMINI_API_KEY="your_gemini_api_key" |
|
export OPENAI_API_KEY="your_openai_api_key" |
|
# ... and so on for other keys |
|
``` |
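
For reference, the lookup order described above (Streamlit secrets first, then the environment) can be expressed as a small helper. This is a minimal sketch of the pattern, assuming a hypothetical `get_api_key` helper rather than the exact code in `app.py`:

```python
import os

import streamlit as st

def get_api_key(name: str) -> str | None:
    """Return a key from Streamlit secrets if present, else from the environment."""
    try:
        if name in st.secrets:      # reads .streamlit/secrets.toml (or Space secrets)
            return st.secrets[name]
    except FileNotFoundError:
        pass                        # no secrets file at all: fall through
    return os.environ.get(name)     # environment-variable fallback

GEMINI_API_KEY = get_api_key("GEMINI_API_KEY")
```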
|
|
|
5. **Font:** |
|
Ensure the font file specified in `core/visual_engine.py` (e.g., `arial.ttf`) is accessible. The script tries common system paths, but you can place it in `assets/fonts/` and adjust the path in `visual_engine.py` if needed. If using Docker, ensure the font is copied into the image (see `Dockerfile`). |
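
As an illustration of that fallback behaviour (the candidate paths and their order here are assumptions, not the exact logic in `visual_engine.py`):

```python
from PIL import ImageFont

def load_font(size: int = 24):
    """Try the bundled font, then common system paths, then Pillow's built-in default."""
    candidates = [
        "assets/fonts/arial.ttf",                                # bundled with the repo
        "/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf",  # common on Debian/Ubuntu
        "/Library/Fonts/Arial.ttf",                              # common on macOS
    ]
    for path in candidates:
        try:
            return ImageFont.truetype(path, size)
        except OSError:              # file missing or unreadable: try the next path
            continue
    return ImageFont.load_default()  # bitmap fallback so text overlays never crash
```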
|
|
|
6. **RunwayML Implementation (Important):** |
|
The current integration for RunwayML in `core/visual_engine.py` (method `_generate_video_clip_with_runwayml`) is a **placeholder**. You will need to: |
|
* Install the official RunwayML SDK if available. |
|
* Implement the actual API calls to RunwayML for text-to-video or image-to-video generation. |
|
* The placeholder currently creates a dummy video clip using MoviePy. |
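
For orientation, the dummy-clip behaviour could look roughly like the sketch below; this is a hedged illustration of a MoviePy placeholder, not the actual body of `_generate_video_clip_with_runwayml`.

```python
from moviepy.editor import ColorClip

def make_placeholder_clip(out_path: str, duration: float = 4.0,
                          size: tuple[int, int] = (1280, 720)) -> str:
    """Write a solid-colour dummy clip standing in for a real RunwayML generation."""
    clip = ColorClip(size=size, color=(20, 20, 30), duration=duration)
    clip.write_videofile(out_path, fps=24, codec="libx264", audio=False, logger=None)
    clip.close()
    return out_path
```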
|
|
|
## Running the Application |
|
|
|
1. **Local Development:** |
|
```bash |
|
streamlit run app.py |
|
``` |
|
The application should open in your web browser. |
|
|
|
2. **Using Docker (Optional):** |
|
* Build the Docker image: |
|
```bash |
|
docker build -t cinegen-ai . |
|
``` |
|
* Run the Docker container, passing API keys as environment variables (or via mounted secrets; avoid baking keys into the image):
|
```bash |
|
docker run -p 8501:8501 \ |
|
-e GEMINI_API_KEY="your_key" \ |
|
-e OPENAI_API_KEY="your_key" \ |
|
# ... other env vars ... |
|
cinegen-ai |
|
``` |
|
Access the app at `http://localhost:8501`. |
|
|
|
## How to Use |
|
|
|
1. **Input Creative Seed:** Provide your core story idea, select a genre, mood, number of scenes, and AI director style in the sidebar. |
|
2. **Generate Treatment:** Click "Generate Cinematic Treatment". The AI will produce a multi-scene breakdown.
|
3. **Review & Refine:** |
|
* Examine each scene's details, including AI-generated visuals (or placeholders). |
|
* Use the "✏️ Edit Scene X Treatment" and "🎨 Edit Scene X Visual Prompt" popovers to provide feedback and regenerate specific parts of the treatment or visuals.
|
* Adjust per-scene "Dominant Shot Type" and "Scene Duration" for the animatic. |
|
4. **Fine-Tuning (Sidebar):** |
|
* Define characters with visual descriptions. |
|
* Apply global style overrides. |
|
* Set narrator voice ID and narration style. |
|
5. **Assemble Animatic:** Once you're satisfied with the treatment, visuals, and narration script (generated automatically), click "🎬 Assemble Narrated Cinematic Animatic".
|
6. **View & Download:** The generated animatic video will appear, and you can download it. |
|
|
|
## Key Technologies |
|
|
|
* **Python** |
|
* **Streamlit:** Web application framework. |
|
* **Google Gemini API:** For core text generation (treatment, narration script, prompt refinement). |
|
* **OpenAI API (DALL-E 3):** For AI image generation. |
|
* **ElevenLabs API:** For text-to-speech narration. |
|
* **Pexels API:** For stock image/video fallbacks. |
|
* **RunwayML API (Placeholder):** For AI video clip generation. |
|
* **MoviePy:** For video processing and animatic assembly. |
|
* **Pillow (PIL):** For image manipulation. |
|
|
|
## Future Enhancements / To-Do |
|
|
|
* Implement full, robust RunwayML API integration. |
|
* Option to upload custom seed images for image-to-video generation. |
|
* More sophisticated control over Ken Burns effect (pan direction, zoom intensity). |
|
* Allow users to upload their own audio for narration or background music. |
|
* Advanced shot list generation and export. |
|
* Integration with other AI video/image models. |
|
* User accounts and project saving. |
|
* More granular error handling and user feedback in the UI. |
|
* Refine JSON cleaning from Gemini to be even more robust. |
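
On the last point: LLMs often wrap JSON replies in markdown code fences, so the cleaning step is typically strip-then-parse. A generic sketch of that technique (an assumption about the failure mode, not the current code in `core/gemini_handler.py`):

```python
import json
import re

def parse_gemini_json(raw: str) -> dict:
    """Strip markdown code fences Gemini sometimes adds, then parse the JSON."""
    cleaned = raw.strip()
    cleaned = re.sub(r"^```(?:json)?\s*", "", cleaned)  # leading ```json or ``` fence
    cleaned = re.sub(r"\s*```$", "", cleaned)           # trailing ``` fence
    return json.loads(cleaned)
```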
|
|
|
## Contributing |
|
|
|
Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request. |
|
|
|
## License |
|
|
|
This project is currently under [Specify License Here - e.g., MIT, Apache 2.0, or state as private/proprietary]. |
|
|
|
--- |
|
|
|
*This README provides a comprehensive overview. Ensure all paths, commands, and API key instructions match your specific project setup.* |