---
title: CineeeeAi
emoji: πŸš€
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: AI-powered cinematic pre-production
---

# CineGen AI Ultra+ 🎬✨

**Visionary Cinematic Pre-Production Powered by AI**

CineGen AI Ultra+ is a Streamlit web application designed to accelerate the creative pre-production process for films, animations, and other visual storytelling projects. It leverages the power of Large Language Models (Google's Gemini) and other AI tools to transform a simple story idea into a rich cinematic treatment, complete with scene breakdowns, visual style suggestions, AI-generated concept art/video clips, and a narrated animatic.

## Features

*   **AI Creative Director:** Input a core story idea, genre, and mood.
*   **Cinematic Treatment Generation:**
    *   Gemini generates a detailed multi-scene treatment.
    *   Each scene includes:
        *   Title, Emotional Beat, Setting Description
        *   Characters Involved, Character Focus Moment
        *   Key Plot Beat, Suggested Dialogue Hook
        *   **Proactive Director's Suggestions:** Visual Style, Camera Work, Sound Design.
        *   **Asset Generation Aids:** Suggested Asset Type (Image/Video Clip), Video Motion Description & Duration, Image/Video Keywords, Pexels Search Queries.
*   **Visual Asset Generation:**
    *   **Image Generation:** Utilizes DALL-E 3 (via OpenAI API) to generate concept art for scenes based on derived prompts.
    *   **Stock Footage Fallback:** Uses Pexels API for relevant stock images/videos if AI generation is disabled or fails.
    *   **Video Clip Generation (Placeholder):** The code is structured for text-to-video/image-to-video generation via the RunwayML API, but the actual API calls must be implemented in `core/visual_engine.py`; until then, the placeholder generates dummy video clips.
*   **Character Definition:** Define key characters with visual descriptions for more consistent AI-generated visuals.
*   **Global Style Overrides:** Apply global stylistic keywords (e.g., "Hyper-Realistic Gritty Noir," "Vintage Analog Sci-Fi") to influence visual generation.
*   **AI-Powered Narration:**
    *   Gemini crafts a narration script based on the generated treatment.
    *   ElevenLabs API synthesizes the narration into natural-sounding audio.
    *   Customizable voice ID and narration style.
*   **Iterative Refinement:**
    *   Edit scene treatments and regenerate them with AI assistance.
    *   Refine DALL-E prompts based on feedback and regenerate visuals.
*   **Cinematic Animatic Assembly:**
    *   Combines generated visual assets (images/video clips) and the synthesized narration into a downloadable `.mp4` animatic.
    *   Customizable per-scene duration for pacing control.
    *   Ken Burns effect on still images and text overlays for scene context.
*   **Secrets Management:** Securely loads API keys from Streamlit secrets or environment variables.
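
The secrets lookup typically follows a "Streamlit secrets first, environment variables second" pattern. A minimal sketch of that fallback (a hypothetical helper, not necessarily the exact code in `app.py`):

```python
import os
from typing import Optional

import streamlit as st

def get_secret(name: str) -> Optional[str]:
    # Prefer Streamlit secrets; fall back to environment variables.
    try:
        if name in st.secrets:
            return st.secrets[name]
    except Exception:
        pass  # st.secrets raises if no secrets.toml exists at all
    return os.getenv(name)

GEMINI_API_KEY = get_secret("GEMINI_API_KEY")
```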

## Project Structure

```
CineGenAI/
β”œβ”€β”€ .streamlit/
β”‚   └── secrets.toml          # API keys and configuration (DO NOT COMMIT if public)
β”œβ”€β”€ assets/
β”‚   └── fonts/
β”‚       └── arial.ttf         # Example font file (ensure it's available or update path)
β”œβ”€β”€ core/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ gemini_handler.py     # Manages interactions with the Gemini API
β”‚   β”œβ”€β”€ visual_engine.py      # Image/video generation (DALL-E, Pexels, RunwayML placeholder) and video assembly
β”‚   └── prompt_engineering.py # Crafts detailed prompts for Gemini
β”œβ”€β”€ temp_cinegen_media/       # Temporary directory for generated media (add to .gitignore)
β”œβ”€β”€ app.py                    # Main Streamlit application script
β”œβ”€β”€ Dockerfile                # For containerizing the application
β”œβ”€β”€ Dockerfile.test           # (Optional) For testing
β”œβ”€β”€ requirements.txt          # Python dependencies
β”œβ”€β”€ README.md                 # This file
└── .gitattributes            # For Git LFS if handling large font files
```

## Setup and Installation

1.  **Clone the Repository:**
    ```bash
    git clone <repository_url>
    cd CineGenAI
    ```

2.  **Create a Virtual Environment (Recommended):**
    ```bash
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    ```

3.  **Install Dependencies:**
    ```bash
    pip install -r requirements.txt
    ```
    *Note: `MoviePy` might require `ffmpeg` to be installed on your system. On Debian/Ubuntu: `sudo apt-get install ffmpeg`. On macOS with Homebrew: `brew install ffmpeg`.*
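
    To confirm `ffmpeg` is available on your PATH before running the app:
    ```bash
    ffmpeg -version
    ```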

4.  **Set Up API Keys:**
    You need API keys for the following services:
    *   Google Gemini API
    *   OpenAI API (for DALL-E)
    *   ElevenLabs API (and optionally a specific Voice ID)
    *   Pexels API
    *   RunwayML API (if implementing full video generation)

    Store these keys securely. You have two primary options:

    *   **Streamlit Secrets (Recommended for Hugging Face Spaces / Streamlit Cloud):**
        Create a file `.streamlit/secrets.toml` (make sure this file is in your `.gitignore` if your repository is public!) with the following format:
        ```toml
        GEMINI_API_KEY = "your_gemini_api_key"
        OPENAI_API_KEY = "your_openai_api_key"
        ELEVENLABS_API_KEY = "your_elevenlabs_api_key"
        PEXELS_API_KEY = "your_pexels_api_key"
        ELEVENLABS_VOICE_ID = "your_elevenlabs_voice_id" # e.g., "Rachel" or a custom ID
        RUNWAY_API_KEY = "your_runwayml_api_key"
        ```

    *   **Environment Variables (for local development):**
        Set the environment variables directly in your terminal, or put them in a `.env` file (loaded with a library such as `python-dotenv`, which is not among the default dependencies). The application falls back to environment variables when Streamlit secrets are not found.
        ```bash
        export GEMINI_API_KEY="your_gemini_api_key"
        export OPENAI_API_KEY="your_openai_api_key"
        # ... and so on for other keys
        ```
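
        If you go the `.env` route, a minimal loading snippet (requires `pip install python-dotenv`):
        ```python
        # Load variables from a local .env file into the process environment
        # before the rest of the app reads them.
        from dotenv import load_dotenv
        load_dotenv()
        ```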

5.  **Font:**
    Ensure the font file specified in `core/visual_engine.py` (e.g., `arial.ttf`) is accessible. The script tries common system paths, but you can place it in `assets/fonts/` and adjust the path in `visual_engine.py` if needed. If using Docker, ensure the font is copied into the image (see `Dockerfile`).
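
    A minimal font-loading sketch with a fallback chain (hypothetical paths; adjust to your setup):
    ```python
    from PIL import ImageFont

    FONT_CANDIDATES = [
        "assets/fonts/arial.ttf",                           # project-local copy
        "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",  # common Linux path
    ]

    def load_font(size: int = 24):
        # Try each candidate path; fall back to Pillow's built-in bitmap font.
        for path in FONT_CANDIDATES:
            try:
                return ImageFont.truetype(path, size)
            except OSError:
                continue
        return ImageFont.load_default()
    ```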

6.  **RunwayML Implementation (Important):**
    The current integration for RunwayML in `core/visual_engine.py` (method `_generate_video_clip_with_runwayml`) is a **placeholder**. You will need to:
    *   Install the official RunwayML SDK if available.
    *   Implement the actual API calls to RunwayML for text-to-video or image-to-video generation.
    *   The placeholder currently creates a dummy video clip using MoviePy.
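
    As a rough illustration, the dummy-clip placeholder could look something like this (a sketch, not the exact code in `core/visual_engine.py`):
    ```python
    from moviepy.editor import ColorClip

    def _generate_video_clip_with_runwayml(prompt: str, duration: int = 4,
                                           size=(1280, 720),
                                           out_path: str = "clip.mp4") -> str:
        # Placeholder: writes a solid-color dummy clip until real RunwayML
        # API calls are wired in. `prompt` is deliberately ignored here.
        clip = ColorClip(size=size, color=(20, 20, 30), duration=duration)
        clip.write_videofile(out_path, fps=24, codec="libx264", audio=False)
        return out_path
    ```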

## Running the Application

1.  **Local Development:**
    ```bash
    streamlit run app.py
    ```
    The application should open in your web browser.

2.  **Using Docker (Optional):**
    *   Build the Docker image:
        ```bash
        docker build -t cinegen-ai .
        ```
    *   Run the Docker container, passing API keys as environment variables (or mounting a secrets file) rather than baking them into the image:
        ```bash
        docker run -p 8501:8501 \
          -e GEMINI_API_KEY="your_key" \
          -e OPENAI_API_KEY="your_key" \
          # ... other env vars ...
          cinegen-ai
        ```
        Access the app at `http://localhost:8501`.
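
        Alternatively, you can keep keys out of your shell history with Docker's `--env-file` flag (assuming a local `.env` file):
        ```bash
        docker run -p 8501:8501 --env-file .env cinegen-ai
        ```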

## How to Use

1.  **Input Creative Seed:** Provide your core story idea, select a genre, mood, number of scenes, and AI director style in the sidebar.
2.  **Generate Treatment:** Click "🌌 Generate Cinematic Treatment". The AI will produce a multi-scene breakdown.
3.  **Review & Refine:**
    *   Examine each scene's details, including AI-generated visuals (or placeholders).
    *   Use the "✏️ Edit Scene X Treatment" and "🎨 Edit Scene X Visual Prompt" popovers to provide feedback and regenerate specific parts of the treatment or visuals.
    *   Adjust per-scene "Dominant Shot Type" and "Scene Duration" for the animatic.
4.  **Fine-Tuning (Sidebar):**
    *   Define characters with visual descriptions.
    *   Apply global style overrides.
    *   Set narrator voice ID and narration style.
5.  **Assemble Animatic:** Once you're satisfied with the treatment, visuals, and narration script (generated automatically), click "🎬 Assemble Narrated Cinematic Animatic".
6.  **View & Download:** The generated animatic video will appear, and you can download it.
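
Under the hood, the assembly step boils down to concatenating per-scene clips and attaching the narration with MoviePy. A minimal sketch of that pattern (the real logic lives in `core/visual_engine.py`; the `assets` shape here is hypothetical):

```python
from moviepy.editor import (AudioFileClip, ImageClip, VideoFileClip,
                            concatenate_videoclips)

def assemble_animatic(assets, narration_path, out_path="animatic.mp4"):
    # assets: list of (path, duration_seconds, is_video) tuples (assumed shape).
    clips = []
    for path, duration, is_video in assets:
        if is_video:
            clips.append(VideoFileClip(path).subclip(0, duration))
        else:
            clips.append(ImageClip(path).set_duration(duration))
    video = concatenate_videoclips(clips, method="compose")
    video = video.set_audio(AudioFileClip(narration_path))
    video.write_videofile(out_path, fps=24, codec="libx264")
    return out_path
```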

## Key Technologies

*   **Python**
*   **Streamlit:** Web application framework.
*   **Google Gemini API:** For core text generation (treatment, narration script, prompt refinement).
*   **OpenAI API (DALL-E 3):** For AI image generation.
*   **ElevenLabs API:** For text-to-speech narration.
*   **Pexels API:** For stock image/video fallbacks.
*   **RunwayML API (Placeholder):** For AI video clip generation.
*   **MoviePy:** For video processing and animatic assembly.
*   **Pillow (PIL):** For image manipulation.
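
For instance, the Ken Burns effect on still images can be approximated with a time-varying MoviePy resize (a sketch of the simple zoom-in variant, not necessarily the app's exact implementation):

```python
from moviepy.editor import ImageClip

def ken_burns(image_path: str, duration: float = 4.0, zoom: float = 1.15):
    # Slow zoom-in on a still image: scale grows linearly from 1.0 to `zoom`.
    clip = ImageClip(image_path).set_duration(duration)
    return clip.resize(lambda t: 1 + (zoom - 1) * t / duration)
```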

## Future Enhancements / To-Do

*   Implement full, robust RunwayML API integration.
*   Option to upload custom seed images for image-to-video generation.
*   More sophisticated control over Ken Burns effect (pan direction, zoom intensity).
*   Allow users to upload their own audio for narration or background music.
*   Advanced shot list generation and export.
*   Integration with other AI video/image models.
*   User accounts and project saving.
*   More granular error handling and user feedback in the UI.
*   Refine JSON cleaning from Gemini to be even more robust.
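
On the JSON-cleaning point, a common minimal approach is to strip the markdown fences Gemini sometimes wraps around its JSON before parsing (a sketch, not the app's current code):

```python
import json
import re

def clean_gemini_json(text: str) -> dict:
    # Remove a leading ```json / ``` fence and a trailing ``` fence, then parse.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    return json.loads(text)
```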

## Contributing

Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request.

## License

This project is currently under [Specify License Here - e.g., MIT, Apache 2.0, or state as private/proprietary].

---

*This README provides a comprehensive overview. Ensure all paths, commands, and API key instructions match your specific project setup.*