---
title: LLaMA 7B Server
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: 1.0.0
app_file: app.py
pinned: false
---

# LLaMA 7B Server

A FastAPI-based server for interacting with the LLaMA 7B model.

## Features

- Text generation
- Configurable model parameters
- REST API interface

## API Usage

Make a POST request to /generate with the following JSON body:

```json
{
    "prompt": "your prompt here",
    "max_length": 2048,
    "num_beams": 3,
    "early_stopping": true,
    "no_repeat_ngram_size": 3
}
```
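Only `prompt` is required; the other fields tune generation. A minimal sketch of how a server might merge a client body with defaults before generating (the defaults and the `parse_generate_request` helper are illustrative assumptions, not code from this repo's `app.py`):

```python
import json

# Hypothetical server-side defaults for /generate; the real app.py may differ.
DEFAULTS = {
    "max_length": 2048,
    "num_beams": 3,
    "early_stopping": True,
    "no_repeat_ngram_size": 3,
}

def parse_generate_request(body: str) -> dict:
    """Merge a client JSON body with defaults; 'prompt' is required."""
    payload = json.loads(body)
    if "prompt" not in payload:
        raise ValueError("'prompt' is required")
    # Client-supplied values override the defaults.
    return {**DEFAULTS, **payload}
```

Any field the client omits falls back to the server default, so a body containing only `prompt` is valid.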

Example using curl:

```bash
curl -X POST http://localhost:7860/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?"}'
```

Example using Python:

```python
import requests

url = "http://localhost:7860/generate"
data = {
    "prompt": "Hello, how are you?",
    "max_length": 2048,
    "num_beams": 3,
    "early_stopping": True,
    "no_repeat_ngram_size": 3,
}

response = requests.post(url, json=data)
result = response.json()
print(result["generated_text"])  # the generated text
```
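Beam-search generation on a 7B model can be slow, so in practice you may want a timeout and explicit status checking around the call. A small hedged wrapper (assumes the response JSON has a `generated_text` key, as in the example above; adjust if the actual `app.py` returns a different shape):

```python
import requests

def generate(prompt: str,
             url: str = "http://localhost:7860/generate",
             timeout: float = 120.0) -> str:
    """POST a prompt to the server and return the generated text.

    Raises requests.HTTPError on a 4xx/5xx response instead of
    failing later with a confusing JSON decode error.
    """
    resp = requests.post(url, json={"prompt": prompt}, timeout=timeout)
    resp.raise_for_status()
    return resp.json()["generated_text"]
```

Usage: `text = generate("Hello, how are you?")`.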

## Model Details

- Model: LLaMA 7B
- Parameters: 7 billion
- Language: Multilingual

## Technical Details

- Framework: FastAPI
- Python Version: 3.9+
- Dependencies: see `requirements.txt`