# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview

This is a Hugging Face Spaces application that provides text embeddings using 15+ state-of-the-art embedding models, including Nomic, BGE, Snowflake Arctic, IBM Granite, and sentence-transformers models. It runs on CPU and provides both a web interface and API endpoints for generating text embeddings with model selection.
## Key Commands

### Local Development

```bash
# Install dependencies
pip install -r requirements.txt

# Run the application locally
python app.py
```

### Git Operations

```bash
# Push to Hugging Face Spaces (requires authentication)
git push origin main

# Note: may need to authenticate first with:
huggingface-cli login
```
## Architecture

The application consists of a single `app.py` file with:

- Model Configuration: Dictionary of 15+ embedding models with `trust_remote_code` settings (lines 10-26)
- Model Caching: Dynamic model loading with caching to avoid reloading (lines 32-42)
- FastAPI App: Direct HTTP endpoints at `/embed` and `/models` (lines 44, 57-102)
- Embedding Function: Multi-model wrapper that calls `model.encode()` (lines 49-53)
- Gradio Interface: UI with model dropdown selector and API endpoint (lines 106-135)
- Dual Server: FastAPI mounted alongside Gradio and served by uvicorn (lines 214-219)
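The model-caching behavior described above can be sketched as follows. This is a minimal, illustrative sketch, not the exact code from `app.py`: the function names are hypothetical, and `load_model` stands in for the real loader (which would construct a `SentenceTransformer` with the model's `trust_remote_code` setting).

```python
# Minimal sketch of the model-caching pattern: each model is loaded
# at most once and then served from an in-memory dict cache.
# Names here are illustrative, not taken from app.py.
_model_cache = {}

def load_model(name: str):
    """Placeholder for the real loader; app.py would construct a
    SentenceTransformer here. Returns a stand-in object."""
    return {"name": name}

def get_model(name: str):
    # Only hit the (expensive) loader on a cache miss
    if name not in _model_cache:
        _model_cache[name] = load_model(name)
    return _model_cache[name]
```

The cache avoids reloading multi-hundred-megabyte models on every request, which matters on a CPU-only Space where loading dominates latency.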
## Important Configuration Details

- Queue: Hugging Face Spaces enforces queuing at the infrastructure level, even without `.queue()` in the code
- CPU Mode: Explicitly set to CPU to avoid GPU requirements
- Trust Remote Code: Only the predefined models in the MODELS dict are allowed `trust_remote_code=True`
- Any HF Model: The API accepts any Hugging Face model name but uses `trust_remote_code=False` for unlisted models
- API Access: Direct HTTP access is available via the FastAPI endpoints
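The `trust_remote_code` gating above can be sketched as a small lookup helper. This is a hypothetical sketch: the MODELS entries shown are examples, not the actual 15+ entries in `app.py`.

```python
# Sketch of the trust_remote_code gating: only models explicitly
# listed in MODELS can opt in to remote code execution; anything
# else falls back to the safe default. Entries are illustrative.
MODELS = {
    "nomic-ai/nomic-embed-text-v1.5": {"trust_remote_code": True},
    "BAAI/bge-large-en-v1.5": {"trust_remote_code": False},
}

def trust_setting(model_name: str) -> bool:
    # Unlisted models always get trust_remote_code=False
    return MODELS.get(model_name, {}).get("trust_remote_code", False)
```

Defaulting unlisted models to `False` limits arbitrary-code execution to models the maintainer has explicitly vetted.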
## API Usage

Two options for API access:

- Direct FastAPI endpoints (no queue):

```bash
# List available models
curl https://ipepe-nomic-embeddings.hf.space/models

# Generate an embedding with a specific model
curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
  -H "Content-Type: application/json" \
  -d '{"text": "your text", "model": "mixedbread-ai/mxbai-embed-large-v1"}'
```

- Gradio client (handles the queue automatically):

```python
from gradio_client import Client

client = Client("ipepe/nomic-embeddings")
result = client.predict("text to embed", "model-name", api_name="/predict")
```
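For plain-Python scripting without the Gradio client, the `/embed` request can also be built with the standard library. This is a sketch under assumptions: the payload fields mirror the curl example above, but the response format is not documented here, so inspect it before relying on it; the actual network call is left commented out.

```python
import json
import urllib.request

def build_embed_request(text: str, model: str) -> urllib.request.Request:
    """Construct the POST request for the /embed endpoint.
    Payload fields match the curl example; the response shape
    is an assumption and should be verified against the API."""
    payload = json.dumps({"text": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        "https://ipepe-nomic-embeddings.hf.space/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (requires network access):
# with urllib.request.urlopen(build_embed_request("hello", "model-name")) as resp:
#     result = json.load(resp)
```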
## Deployment Notes

- Deployed on Hugging Face Spaces at https://huggingface.co/spaces/ipepe/nomic-embeddings
- Runs on port 7860
- Uses Gradio 4.36.1 (newer versions are available)
- PyTorch is configured for CPU-only via `--extra-index-url` in requirements.txt
## Development Constraints

- There is no local Python installation; everything must be deployed to Hugging Face Spaces to be tested