---
title: Media Gen Api
emoji: 📊
colorFrom: indigo
colorTo: pink
sdk: docker
pinned: false
short_description: FastAPI backend for Text-to-Audio, Image, and Video generato
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🎙️ Media Generation API

A FastAPI-based backend that generates audio, images, video, and PPT from user inputs.
It supports BLEU/CLIP metrics and token-based authentication, and stores metadata in SQLite/Postgres.

A modular, RESTful FastAPI solution that converts text input into:
- 🎥 Video
- 🖼️ Image/Graphics
- 🔊 Audio


---

## 🚀 Features

- Text → Video: Tone, domain, and environment-aware video generation.
- Text → Audio: Context-aware voice synthesis with emotional tone and language support.
- Text → Graphics: Visual generation using parameter-based prompts.
- BLEU/CLIP metrics for prompt-output fidelity.
- Token-based authentication for secure API use.
- Dockerized for easy deployment.
- Optional Streamlit/React UI.
- Swagger UI: `http://localhost:8000/docs`

---

### 📁 Project Structure

    media-gen-api/
    ├── app/
    │   ├── api/v1/               # Versioned API endpoints
    │   ├── auth/                 # Token-based auth
    │   ├── services/             # Core media generation logic
    │   └── main.py               # FastAPI entry point
    ├── tests/                    # Unit/integration tests
    ├── requirements.txt
    └── README.md

---

## 📦 Installation

### 🚀 Run Locally

1. Clone the repo and create a virtual environment:

       git clone https://github.com/yourorg/media-gen-api.git
       cd media-gen-api
       python -m venv .venv
       source .venv/bin/activate  # On Windows: .venv\Scripts\activate

2. Install dependencies:

       pip install -r requirements.txt

3. Run the API:

       uvicorn app.main:app --reload

Access the docs at http://127.0.0.1:8000/docs

---
### 🔐 Authentication
Pass your token as `Bearer <your_token>`, either through the Authorize button in Swagger UI or in the `Authorization` request header.
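
As a rough sketch of the header form, the snippet below attaches a bearer token to a request using only Python's standard library. The base URL, the token value, and the `build_authorized_request` helper are placeholders for illustration and are not part of this project.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumption: API running locally


def build_authorized_request(path: str, token: str) -> urllib.request.Request:
    """Return a request for `path` carrying an `Authorization: Bearer` header."""
    request = urllib.request.Request(BASE_URL + path)
    request.add_header("Authorization", f"Bearer {token}")
    return request


# Hypothetical token; replace it with the one issued for your deployment.
request = build_authorized_request("/api/v1/download", "my-secret-token")
print(request.get_header("Authorization"))  # Bearer my-secret-token
```

The request is only built here, not sent; passing it to `urllib.request.urlopen` would perform the call against a running server.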

---
### 📑 API Endpoints Summary
| Endpoint                  | Method | Description               |
|--------------------------|--------|---------------------------|
| /api/v1/audio/generate   | POST   | Generate audio from text |
| /api/v1/image/generate   | POST   | Generate image from text |
| /api/v1/video/generate   | POST   | Generate video from text |
| /api/v1/download         | GET    | Download generated file  |
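
To illustrate how a client might call the generate endpoints in the table, here is a minimal sketch using only the standard library. The request body fields (`text`, `language`) are assumptions, since the actual schema is defined by the service and listed in Swagger UI, and `generate_request` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumption: API running locally

# Endpoint paths taken from the summary table.
GENERATE_ENDPOINTS = {
    "audio": "/api/v1/audio/generate",
    "image": "/api/v1/image/generate",
    "video": "/api/v1/video/generate",
}


def generate_request(media_type: str, payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one of the generate endpoints."""
    if media_type not in GENERATE_ENDPOINTS:
        raise ValueError(f"unsupported media type: {media_type!r}")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + GENERATE_ENDPOINTS[media_type],
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical payload: the field names are placeholders, not the real schema.
request = generate_request(
    "audio", {"text": "Hello, world", "language": "en"}, "my-secret-token"
)
print(request.get_method(), request.full_url)

# To send it against a running server:
#     with urllib.request.urlopen(request) as response:
#         print(response.status)
```

Keeping the paths in one dictionary means an unsupported media type fails early with a clear error instead of producing a 404 from the server.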

---
### 📦 Deployment (Streamlit/Optional UI)
Option 1: Run with Streamlit (for demo)

    streamlit run streamlit_ui.py

Option 2: Docker (Production-ready)

    docker build -t media-gen-api .
    docker run -p 8000:8000 media-gen-api

---
### 📊 Metrics Logging (Optional)
- BLEU score and CLIPScore (WIP)
- Latency, GPU/CPU tracking
- Log file: logs/generation.log
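
The latency logging listed above could look something like the following sketch. It relies on Python's `logging` module and `time.perf_counter`; the logger name, the log format, and the `make_generation_logger` and `log_latency` helpers are assumptions for illustration. Only the `logs/generation.log` path comes from this README, and GPU/CPU tracking is not covered.

```python
import logging
import time
from contextlib import contextmanager
from pathlib import Path


def make_generation_logger(
    log_path: str = "logs/generation.log",
    name: str = "media_gen.generation",  # hypothetical logger name
) -> logging.Logger:
    """Create a logger that appends to `log_path`, creating its folder if needed."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


@contextmanager
def log_latency(logger: logging.Logger, endpoint: str):
    """Log how long the wrapped block took, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("endpoint=%s latency_ms=%.1f", endpoint, elapsed_ms)


if __name__ == "__main__":
    generation_logger = make_generation_logger()
    with log_latency(generation_logger, "/api/v1/audio/generate"):
        time.sleep(0.05)  # stand-in for the real generation call
```

Because the timing is recorded in a `finally` clause, a latency line is written even when the wrapped generation call raises.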

---
#### 📋 Submission Checklist
- ✅ RESTful modular architecture
- ✅ Multi-format (MP4, PNG, WAV)
- ✅ Token Auth + Swagger UI
- ✅ Compatible with DD/PIB via API
- ✅ Streamlit demo app (optional)