---
title: CCA
emoji: 🎨
colorFrom: purple
colorTo: yellow
sdk: docker
pinned: false
---

# Multi-Model API Space

This Hugging Face Space provides API access to multiple Lyon28 models through both a REST API and a web interface.

## Available Models

- `Lyon28/Tinny-Llama` - Small language model
- `Lyon28/Pythia` - Pythia-based model
- `Lyon28/Bert-Tinny` - BERT variant
- `Lyon28/Albert-Base-V2` - ALBERT model
- `Lyon28/T5-Small` - T5 text-to-text model
- `Lyon28/GPT-2` - GPT-2 variant
- `Lyon28/GPT-Neo` - GPT-Neo model
- `Lyon28/Distilbert-Base-Uncased` - DistilBERT model
- `Lyon28/Distil_GPT-2` - Distilled GPT-2
- `Lyon28/GPT-2-Tinny` - Tiny GPT-2
- `Lyon28/Electra-Small` - ELECTRA model

## Features

### 🚀 REST API Endpoints

1. **GET /api/models** - List available and loaded models
2. **POST /api/load_model** - Load a specific model
3. **POST /api/generate** - Generate text using loaded models
4. **GET /health** - Health check
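
The four endpoints above are easy to wrap in a thin client. The sketch below is a hypothetical helper, not an official SDK: it assumes the JSON shapes shown in the API Usage section and uses a placeholder base URL.

```python
import requests

class MultiModelClient:
    """Thin wrapper around the Space's REST endpoints (illustrative sketch)."""

    def __init__(self, base_url):
        # Normalize so endpoint paths can be appended with a single slash.
        self.base_url = base_url.rstrip("/")

    def _url(self, path):
        return self.base_url + path

    def list_models(self):
        return requests.get(self._url("/api/models"), timeout=30).json()

    def load_model(self, model_name):
        return requests.post(self._url("/api/load_model"),
                             json={"model_name": model_name}, timeout=300).json()

    def generate(self, model_name, prompt, **params):
        payload = {"model_name": model_name, "prompt": prompt, **params}
        return requests.post(self._url("/api/generate"),
                             json=payload, timeout=300).json()

    def health(self):
        return requests.get(self._url("/health"), timeout=10).ok
```

With this in place, `MultiModelClient("https://your-space-url").generate("Lyon28/GPT-2", "Hello")` mirrors the curl examples below.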

### 🎯 Web Interface

- Model management interface
- Interactive text generation
- Parameter tuning (temperature, top_p, max_length)
- Real-time model loading status

## API Usage

### Load a Model

```bash
curl -X POST "https://your-space-url/api/load_model" \
  -H "Content-Type: application/json" \
  -d '{"model_name": "Lyon28/GPT-2"}'
```

### Generate Text

```bash
curl -X POST "https://your-space-url/api/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "Lyon28/GPT-2",
    "prompt": "Hello world",
    "max_length": 100,
    "temperature": 0.7,
    "top_p": 0.9
  }'
```

### Python Example

```python
import requests

BASE_URL = "https://your-space-url"

# Load the model (the first load can take a while)
response = requests.post(f"{BASE_URL}/api/load_model",
                         json={"model_name": "Lyon28/GPT-2"})
response.raise_for_status()

# Generate text with the loaded model
response = requests.post(f"{BASE_URL}/api/generate",
                         json={
                             "model_name": "Lyon28/GPT-2",
                             "prompt": "The future of AI is",
                             "max_length": 150,
                             "temperature": 0.8,
                         })
response.raise_for_status()

result = response.json()
print(result["generated_text"])
```

## Model Types

- **Causal LM**: GPT-2, GPT-Neo, Pythia, Tinny-Llama variants
- **Text-to-Text**: T5-Small
- **Feature Extraction**: BERT, ALBERT, DistilBERT, ELECTRA
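
One way a server can route each model to the right pipeline is a lookup table keyed by model ID. The table below simply restates the grouping above; the task strings are the standard `transformers` pipeline task names, which is an assumption about the implementation rather than a documented detail of this Space.

```python
# Map each model ID to a Transformers pipeline task, per the groups above.
MODEL_TASKS = {
    "Lyon28/GPT-2": "text-generation",
    "Lyon28/GPT-2-Tinny": "text-generation",
    "Lyon28/Distil_GPT-2": "text-generation",
    "Lyon28/GPT-Neo": "text-generation",
    "Lyon28/Pythia": "text-generation",
    "Lyon28/Tinny-Llama": "text-generation",
    "Lyon28/T5-Small": "text2text-generation",
    "Lyon28/Bert-Tinny": "feature-extraction",
    "Lyon28/Albert-Base-V2": "feature-extraction",
    "Lyon28/Distilbert-Base-Uncased": "feature-extraction",
    "Lyon28/Electra-Small": "feature-extraction",
}

def task_for(model_name):
    """Return the pipeline task for a known model, defaulting to causal LM."""
    return MODEL_TASKS.get(model_name, "text-generation")
```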

## Performance Notes

- Models are loaded on-demand to optimize memory usage
- GPU acceleration is used when available
- Models are cached for faster subsequent loads
- Both CPU and GPU inference are supported
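
The on-demand loading and caching described above can be sketched as a small registry. This is an illustrative model of the behavior, not the Space's actual code: `loader` is a stand-in for the real loading function (e.g. `transformers.pipeline`), and the one-hour idle unload is tracked with timestamps.

```python
import time

IDLE_TIMEOUT = 3600  # seconds; models idle longer than this are unloaded

class ModelCache:
    """On-demand model registry: load on first use, evict after inactivity."""

    def __init__(self, loader):
        self.loader = loader  # callable: model_name -> loaded model object
        self.models = {}      # model_name -> (model, last_used_timestamp)

    def get(self, name, now=None):
        now = time.time() if now is None else now
        self.evict_idle(now)
        if name not in self.models:
            # First use: load the model and record when it was touched.
            self.models[name] = (self.loader(name), now)
        else:
            # Cache hit: refresh the last-used timestamp.
            model, _ = self.models[name]
            self.models[name] = (model, now)
        return self.models[name][0]

    def evict_idle(self, now):
        stale = [n for n, (_, ts) in self.models.items()
                 if now - ts > IDLE_TIMEOUT]
        for n in stale:
            del self.models[n]
```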

## Rate Limits

This is a free public API. Please use it responsibly:

- Max 100 requests per minute per IP
- Max 500 tokens per generation
- Models auto-unload after 1 hour of inactivity
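
To stay inside these limits, a client can clamp its own parameters before sending a request. The helper below is a hypothetical convenience that just encodes the 500-token cap stated above; request throttling itself is assumed to be enforced server-side.

```python
MAX_TOKENS = 500  # per-generation cap documented above

def clamp_params(max_length, temperature=0.7, top_p=0.9):
    """Clip generation parameters to the Space's documented limits."""
    return {
        "max_length": max(1, min(max_length, MAX_TOKENS)),
        "temperature": max(0.0, temperature),   # negative temperatures are invalid
        "top_p": min(max(top_p, 0.0), 1.0),     # nucleus mass must be in [0, 1]
    }
```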

## Support

For issues or questions about specific Lyon28 models, please contact the model authors.

---

*Powered by Hugging Face Transformers and Gradio*