---
title: Ollama API Space
emoji: π
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---
# Ollama API Space

A Hugging Face Space that provides a REST API interface for Ollama models, allowing you to run local LLMs through a web API.
## Features

- **Model Management**: List and pull Ollama models
- **Text Generation**: Generate text using any available Ollama model
- **REST API**: Simple HTTP endpoints for easy integration
- **Health Monitoring**: Built-in health checks and status monitoring
- **OpenWebUI Integration**: Compatible with OpenWebUI for a full chat interface
## Quick Start

### 1. Deploy to Hugging Face Spaces

1. Fork this repository or create a new Space
2. Upload these files to your Space (for example via git, as sketched below)
3. **No environment variables needed** - Ollama runs inside the Space!
4. Wait for the build to complete (it may take 10-15 minutes because Ollama is installed during the build)
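Hugging Face Spaces are backed by git repositories, so one straightforward way to upload the files is a plain git push. A minimal sketch, assuming a Space at `your-username/ollama-space` (a placeholder - replace it with your own):

```bash
# Clone your Space repository (placeholder URL)
git clone https://huggingface.co/spaces/your-username/ollama-space
cd ollama-space

# Add the project files, then push to trigger the Docker build
cp /path/to/app.py /path/to/Dockerfile /path/to/requirements.txt .
git add app.py Dockerfile requirements.txt
git commit -m "Add Ollama API Space files"
git push
```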
### 2. Local Development

```bash
# Clone the repository
git clone <your-repo-url>
cd ollama-space

# Install dependencies
pip install -r requirements.txt

# Install Ollama locally
curl -fsSL https://ollama.ai/install.sh | sh

# Start Ollama in another terminal
ollama serve

# Run the application
python app.py
```
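Once `ollama serve` and `python app.py` are both running, you can sanity-check the API from another terminal. A minimal sketch, assuming the app listens on port 7860 as configured in the Space metadata:

```bash
# Confirm the app is up and can reach the local Ollama server
curl http://localhost:7860/health

# List the models Ollama currently has available
curl http://localhost:7860/api/models
```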
## API Endpoints

### GET `/api/models`

List all available Ollama models.

**Response:**

```json
{
  "status": "success",
  "models": ["llama2", "codellama", "neural-chat"],
  "count": 3
}
```
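For example, with `curl` (the host below is a placeholder for your deployed Space URL):

```bash
curl https://your-username-ollama-space.hf.space/api/models
```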
### POST `/api/models/pull`

Pull a model from Ollama.

**Request Body:**

```json
{
  "name": "llama2"
}
```

**Response:**

```json
{
  "status": "success",
  "model": "llama2"
}
```
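For example, with `curl` (same placeholder host as above):

```bash
curl -X POST https://your-username-ollama-space.hf.space/api/models/pull \
  -H "Content-Type: application/json" \
  -d '{"name": "llama2"}'
```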
### POST `/api/generate`

Generate text using a model.

**Request Body:**

```json
{
  "model": "llama2",
  "prompt": "Hello, how are you?",
  "temperature": 0.7,
  "max_tokens": 100
}
```

**Response:**

```json
{
  "status": "success",
  "response": "Hello! I'm doing well, thank you for asking...",
  "model": "llama2",
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 15,
    "total_tokens": 22
  }
}
```
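For example, with `curl` (placeholder host again; the request body mirrors the fields documented above):

```bash
curl -X POST https://your-username-ollama-space.hf.space/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2", "prompt": "Hello, how are you?", "temperature": 0.7, "max_tokens": 100}'
```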
### GET `/health`

Health check endpoint.

**Response:**

```json
{
  "status": "healthy",
  "ollama_connection": "connected",
  "available_models": 3
}
```
## Configuration

### Environment Variables

- `OLLAMA_BASE_URL`: URL of the Ollama instance to use (default: `http://localhost:11434`, the Ollama server running inside this Space)
- `MODELS_DIR`: Directory for storing models (default: `/models`)
- `ALLOWED_MODELS`: Comma-separated list of allowed models (default: all models)

**Note**: This Space includes Ollama installed directly inside it, so you don't need an external Ollama instance.
### Supported Models

By default, the following models are allowed:

- `llama2`
- `llama2:13b`
- `llama2:70b`
- `codellama`
- `neural-chat`

You can customize this list by setting the `ALLOWED_MODELS` environment variable, as shown below.
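For example, to allow only `llama2` and `codellama`, you could set the variable on the container (the same value can be added as a variable in the Space settings). This is a sketch that reuses the `ollama-space` image built in the Docker section below:

```bash
docker run -p 7860:7860 \
  -e ALLOWED_MODELS="llama2,codellama" \
  ollama-space
```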
## Integration with OpenWebUI

This Space is designed to work with OpenWebUI. You can:

1. Use this Space as a backend API for OpenWebUI
2. Configure OpenWebUI to connect to this Space's endpoints (see the example after this list)
3. Enjoy a full chat interface with your local Ollama models
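One way to try this is to run OpenWebUI locally and point its `OLLAMA_BASE_URL` at the Space. This is only a sketch: it assumes the Space exposes an Ollama-compatible endpoint at its root URL, and the Space hostname is a placeholder:

```bash
docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URL=https://your-username-ollama-space.hf.space \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```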
## Docker Support

The Space includes a Dockerfile for containerized deployment:

```bash
# Build the image
docker build -t ollama-space .

# Run the container
docker run -p 7860:7860 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 ollama-space
```
## Security Considerations

- The Space only allows access to models specified in `ALLOWED_MODELS`
- All API endpoints are publicly accessible (consider adding authentication for production use)
- The Space connects to your Ollama instance - ensure proper network security
## Troubleshooting

### Common Issues

1. **Connection to Ollama failed**: Check that Ollama is running and reachable (see the check after this list)
2. **Model not found**: Ensure the model is available in your Ollama instance
3. **Timeout errors**: Large models can take time to load - increase timeout values
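A quick way to check the first point is to query Ollama's native API directly, assuming it is listening on its default port 11434; an error here means the server is not reachable:

```bash
# Lists the models installed in the local Ollama instance
curl http://localhost:11434/api/tags
```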
### Health Check

Use the `/health` endpoint to monitor the Space's status and its connection to Ollama.
## License

This project is open source and available under the MIT License.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Support

If you encounter any issues or have questions, please open an issue on the repository.