---
title: CCA
emoji: 🐨
colorFrom: purple
colorTo: yellow
sdk: docker
pinned: false
---

# Multi-Model API Space

This Hugging Face Space provides API access to multiple Lyon28 models through both a REST API and a web interface.

## Available Models

- Lyon28/Tinny-Llama - Small language model
- Lyon28/Pythia - Pythia-based model
- Lyon28/Bert-Tinny - BERT variant
- Lyon28/Albert-Base-V2 - ALBERT model
- Lyon28/T5-Small - T5 text-to-text model
- Lyon28/GPT-2 - GPT-2 variant
- Lyon28/GPT-Neo - GPT-Neo model
- Lyon28/Distilbert-Base-Uncased - DistilBERT model
- Lyon28/Distil_GPT-2 - Distilled GPT-2
- Lyon28/GPT-2-Tinny - Tiny GPT-2
- Lyon28/Electra-Small - ELECTRA model

## Features

### 🚀 REST API Endpoints

1. `GET /api/models` - List available and loaded models
2. `POST /api/load_model` - Load a specific model
3. `POST /api/generate` - Generate text using loaded models
4. `GET /health` - Health check (see the Python example below)
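
The snippet below exercises the two `GET` endpoints with the `requests` library. It is a minimal sketch based only on the endpoint list above; the response schema of `/api/models` is not documented here, so the payload is simply printed.

```python
import requests

# Replace with your Space's actual URL.
BASE_URL = "https://your-space-url"

# List available and currently loaded models; the exact response schema
# is not documented in this README, so just print whatever comes back.
models = requests.get(f"{BASE_URL}/api/models")
print(models.json())

# Health check: a 200 status code means the Space is up.
health = requests.get(f"{BASE_URL}/health")
print(health.status_code)
```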

### 🎯 Web Interface

- Model management interface
- Interactive text generation
- Parameter tuning (temperature, top_p, max_length)
- Real-time model loading status

## API Usage

### Load a Model

```bash
curl -X POST "https://your-space-url/api/load_model" \
  -H "Content-Type: application/json" \
  -d '{"model_name": "Lyon28/GPT-2"}'
```

### Generate Text

```bash
curl -X POST "https://your-space-url/api/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "Lyon28/GPT-2",
    "prompt": "Hello world",
    "max_length": 100,
    "temperature": 0.7,
    "top_p": 0.9
  }'
```

### Python Example

```python
import requests

# Load model
response = requests.post("https://your-space-url/api/load_model",
                         json={"model_name": "Lyon28/GPT-2"})

# Generate text
response = requests.post("https://your-space-url/api/generate",
                         json={
                             "model_name": "Lyon28/GPT-2",
                             "prompt": "The future of AI is",
                             "max_length": 150,
                             "temperature": 0.8
                         })

result = response.json()
print(result["generated_text"])
```

## Model Types

- Causal LM: GPT-2, GPT-Neo, Pythia, Tinny-Llama variants
- Text-to-Text: T5-Small
- Feature Extraction: BERT, ALBERT, DistilBERT, ELECTRA (see the loading sketch below)
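
The README does not show the Space's loader code, but the split above maps naturally onto the Hugging Face `transformers` auto classes. The sketch below is illustrative only: the model groupings come from the list above, and the `load_model` helper is a hypothetical name, not the Space's actual implementation.

```python
from transformers import (
    AutoModel,              # encoder-only models used for feature extraction
    AutoModelForCausalLM,   # decoder-only models (GPT-2 family, GPT-Neo, Pythia, Tinny-Llama)
    AutoModelForSeq2SeqLM,  # encoder-decoder, text-to-text models (T5)
    AutoTokenizer,
)

CAUSAL_LM = {
    "Lyon28/GPT-2", "Lyon28/GPT-Neo", "Lyon28/Pythia", "Lyon28/Tinny-Llama",
    "Lyon28/Distil_GPT-2", "Lyon28/GPT-2-Tinny",
}
TEXT_TO_TEXT = {"Lyon28/T5-Small"}

def load_model(model_name: str):
    """Hypothetical helper: pick the auto class that matches the model type."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    if model_name in CAUSAL_LM:
        model = AutoModelForCausalLM.from_pretrained(model_name)
    elif model_name in TEXT_TO_TEXT:
        model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    else:  # BERT, ALBERT, DistilBERT, ELECTRA -> feature extraction
        model = AutoModel.from_pretrained(model_name)
    return tokenizer, model
```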

## Performance Notes

- Models are loaded on-demand to optimize memory usage
- GPU acceleration is used when available
- Models are cached for faster subsequent loads
- Support for both CPU and GPU inference (see the caching sketch below)
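
As a rough illustration of on-demand loading with caching and automatic device selection, here is a minimal sketch. It is not the Space's actual code; the cache dictionary and the `get_model` helper are assumptions made for the example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

_model_cache = {}  # model_name -> (tokenizer, model); kept in memory after first load

def get_model(model_name: str):
    """Load a model the first time it is requested, then reuse the cached copy."""
    if model_name not in _model_cache:
        device = "cuda" if torch.cuda.is_available() else "cpu"  # GPU when available
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        # Causal LM shown for brevity; other model types would use their matching auto class.
        model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
        _model_cache[model_name] = (tokenizer, model)
    return _model_cache[model_name]
```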

## Rate Limits

This is a free public API. Please use it responsibly:

- Max 100 requests per minute per IP (see the retry sketch below)
- Max 500 tokens per generation
- Models auto-unload after 1 hour of inactivity
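
If you hit the per-minute limit, a client-side retry with backoff is usually enough. The sketch below assumes the API answers rate-limited requests with HTTP 429; that status code is not specified in this README, so adjust the check if your requests fail differently.

```python
import time
import requests

def generate_with_retry(payload, retries=3, backoff_seconds=30):
    """Call /api/generate, backing off and retrying if the rate limit is hit."""
    for attempt in range(retries):
        resp = requests.post("https://your-space-url/api/generate", json=payload)
        if resp.status_code != 429:  # 429 is an assumed rate-limit response
            resp.raise_for_status()
            return resp.json()
        time.sleep(backoff_seconds * (attempt + 1))  # wait longer on each retry
    raise RuntimeError("Rate limit still exceeded after retries")
```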

## Support

For issues or questions about specific Lyon28 models, please contact the model authors.


Powered by Hugging Face Transformers and Gradio