---
title: CCA
emoji: 🐨
colorFrom: purple
colorTo: yellow
sdk: docker
pinned: false
---

# Multi-Model API Space

This Hugging Face Space provides API access to multiple Lyon28 models through both a REST API and a web interface.

## Available Models

- `Lyon28/Tinny-Llama` - Small language model
- `Lyon28/Pythia` - Pythia-based model
- `Lyon28/Bert-Tinny` - BERT variant
- `Lyon28/Albert-Base-V2` - ALBERT model
- `Lyon28/T5-Small` - T5 text-to-text model
- `Lyon28/GPT-2` - GPT-2 variant
- `Lyon28/GPT-Neo` - GPT-Neo model
- `Lyon28/Distilbert-Base-Uncased` - DistilBERT model
- `Lyon28/Distil_GPT-2` - Distilled GPT-2
- `Lyon28/GPT-2-Tinny` - Tiny GPT-2
- `Lyon28/Electra-Small` - ELECTRA model

## Features

### 🚀 REST API Endpoints

1. **GET /api/models** - List available and loaded models
2. **POST /api/load_model** - Load a specific model
3. **POST /api/generate** - Generate text using loaded models
4. **GET /health** - Health check

### 🎯 Web Interface

- Model management interface
- Interactive text generation
- Parameter tuning (temperature, top_p, max_length)
- Real-time model loading status

## API Usage

### Load a Model

```bash
curl -X POST "https://your-space-url/api/load_model" \
  -H "Content-Type: application/json" \
  -d '{"model_name": "Lyon28/GPT-2"}'
```

### Generate Text

```bash
curl -X POST "https://your-space-url/api/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "Lyon28/GPT-2",
    "prompt": "Hello world",
    "max_length": 100,
    "temperature": 0.7,
    "top_p": 0.9
  }'
```

### Python Example

```python
import requests

# Load model
response = requests.post("https://your-space-url/api/load_model",
                         json={"model_name": "Lyon28/GPT-2"})

# Generate text
response = requests.post("https://your-space-url/api/generate", json={
    "model_name": "Lyon28/GPT-2",
    "prompt": "The future of AI is",
    "max_length": 150,
    "temperature": 0.8
})
result = response.json()
print(result["generated_text"])
```

## Model Types

- **Causal LM**: GPT-2, GPT-Neo, Pythia, Tinny-Llama variants
- **Text-to-Text**: T5-Small
- **Feature Extraction**: BERT, ALBERT, DistilBERT, ELECTRA

## Performance Notes

- Models are loaded on demand to optimize memory usage
- GPU acceleration is used when available
- Models are cached for faster subsequent loads
- Both CPU and GPU inference are supported

## Rate Limits

This is a free public API. Please use it responsibly:

- Max 100 requests per minute per IP
- Max 500 tokens per generation
- Models auto-unload after 1 hour of inactivity

## Support

For issues or questions about specific Lyon28 models, please contact the model authors.

---

*Powered by Hugging Face Transformers and Gradio*
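
## Appendix: Listing Models and Health Check

The curl and Python examples above cover `/api/load_model` and `/api/generate`; the `/api/models` and `/health` endpoints can be called the same way. Below is a minimal sketch, assuming only the endpoint paths listed under "REST API Endpoints" (the response fields are not documented here, so the script simply prints the raw JSON):

```python
import requests

BASE_URL = "https://your-space-url"  # replace with your Space URL

# Confirm the Space is up before loading models or generating text.
health = requests.get(f"{BASE_URL}/health", timeout=10)
health.raise_for_status()
print("health:", health.json())

# List the available models and which ones are currently loaded.
models = requests.get(f"{BASE_URL}/api/models", timeout=10)
models.raise_for_status()
print("models:", models.json())
```

Both calls are plain GET requests, so they also work directly in a browser or with `curl`.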