	# ColPali Hono Proxy Server

	A high-performance proxy server built with Hono that sits between your Next.js frontend and the ColPali/Vespa backend. It handles caching, rate limiting, and CORS, and exposes a clean API to the frontend.

	## Features

	- Image Retrieval: Serves base64 images from Vespa as actual image files with proper caching
	- Search Proxy: Forwards search requests with result caching
	- Chat SSE Proxy: Handles Server-Sent Events for streaming chat responses
	- Rate Limiting: Protects backend from overload
	- Caching: In-memory cache for search results and images
	- Health Checks: Kubernetes-ready health endpoints
	- CORS Handling: Configurable CORS for frontend integration
	- Request Logging: Detailed request/response logging with request IDs

	## Architecture

	```
	Next.js App (3000) → Hono Proxy (4000) → ColPali Backend (7860)
	↘ Vespa Cloud
	```

	## API Endpoints

	### Search
	- `POST /api/search` - Search documents
	```json
	{
	  "query": "annual report 2023",
	  "limit": 10,
	  "ranking": "hybrid"
	}
	```
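A request body like the one above is untrusted input, so a proxy would usually validate it before forwarding. The following is a minimal sketch, not the project's actual code: the `SearchRequest` interface, the `parseSearchRequest` helper, and the fallback values for `limit` and `ranking` are assumptions inferred from the example.

```typescript
// Hypothetical shape of the /api/search body, inferred from the example above.
interface SearchRequest {
  query: string;
  limit: number;
  ranking: string;
}

// Validate an untrusted JSON body; returns the parsed request or an error message.
function parseSearchRequest(body: unknown): SearchRequest | { error: string } {
  if (typeof body !== "object" || body === null) {
    return { error: "body must be a JSON object" };
  }
  // Assumption: limit falls back to 10 and ranking to "hybrid" when omitted.
  const { query, limit = 10, ranking = "hybrid" } = body as Record<string, unknown>;
  if (typeof query !== "string" || query.trim() === "") {
    return { error: "query must be a non-empty string" };
  }
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1) {
    return { error: "limit must be a positive integer" };
  }
  if (typeof ranking !== "string") {
    return { error: "ranking must be a string" };
  }
  return { query: query.trim(), limit, ranking };
}
```

A handler could answer `400 Bad Request` with the returned `error` string and forward only requests that pass validation.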

	### Image Retrieval
	- `GET /api/search/image/:docId/thumbnail` - Get thumbnail image
	- `GET /api/search/image/:docId/full` - Get full-size image
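Serving a base64 string as an actual image comes down to decoding it into bytes and attaching the right headers. The sketch below shows one way to do that with standard Web APIs. The function names and the `image/jpeg` default are assumptions; the 24-hour `max-age` mirrors the image cache lifetime stated in this README.

```typescript
// Decode a base64 string into raw bytes.
function decodeBase64(base64: string): Uint8Array {
  return Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
}

// Wrap decoded image bytes in an HTTP response with cache headers.
// Assumption: the content type defaults to JPEG; pass the real type if known.
function base64ImageResponse(base64: string, contentType = "image/jpeg"): Response {
  return new Response(decodeBase64(base64), {
    headers: {
      "Content-Type": contentType,
      // 86400 seconds = 24 hours.
      "Cache-Control": "public, max-age=86400",
    },
  });
}
```

Returning binary data this way lets browsers and CDNs cache the image directly instead of re-downloading and decoding a base64 payload on every view.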

	### Chat
	- `POST /api/chat` - Stream chat responses (SSE)
	```json
	{
	  "messages": [{"role": "user", "content": "Tell me about..."}],
	  "context": []
	}
	```
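Because the chat endpoint streams Server-Sent Events, a client has to reassemble events from arbitrary text chunks. The decoder below is a simplified sketch of the standard SSE framing (events separated by a blank line, payload carried on `data:` lines). The `SseDecoder` class is hypothetical, it ignores other SSE fields such as `event:` and `id:`, and it makes no assumption about what the payload contains.

```typescript
// Minimal SSE decoder: feed it text chunks, get back complete `data` payloads.
class SseDecoder {
  private buffer = "";

  // Append a chunk and return the data payload of every event completed so far.
  push(chunk: string): string[] {
    // Normalize CRLF on the whole buffer so a pair split across chunks is handled.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: string[] = [];
    let boundary: number;
    while ((boundary = this.buffer.indexOf("\n\n")) !== -1) {
      const rawEvent = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      const dataLines = rawEvent
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        // Drop the field name and the single optional space after the colon.
        .map((line) => line.slice(5).replace(/^ /, ""));
      if (dataLines.length > 0) events.push(dataLines.join("\n"));
    }
    return events;
  }
}
```

In a frontend, the chunks would typically come from reading `response.body` and decoding each piece with `TextDecoder` before passing it to `push`.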

	### Similarity Map
	- `POST /api/search/similarity-map` - Generate similarity visualization

	### Health
	- `GET /health` - Detailed health status
	- `GET /health/live` - Liveness probe
	- `GET /health/ready` - Readiness probe

	## Setup

	### Development

	1. Install dependencies:
	```bash
	npm install
	```

	2. Copy environment variables:
	```bash
	cp .env.example .env
	```

	3. Update `.env` with your configuration

	4. Run in development mode:
	```bash
	npm run dev
	```

	### Production

	1. Build:
	```bash
	npm run build
	```

	2. Run:
	```bash
	npm start
	```

	### Docker

	Build and run with Docker:
	```bash
	docker build -t colpali-hono-proxy .
	docker run -p 4000:4000 --env-file .env colpali-hono-proxy
	```

	Or use docker-compose:
	```bash
	docker-compose up
	```

	## Environment Variables

	| Variable | Description | Default |
	|----------|-------------|---------|
	| `PORT` | Server port | 4000 |
	| `BACKEND_URL` | ColPali backend URL | http://localhost:7860 |
	| `CORS_ORIGIN` | Allowed CORS origin | http://localhost:3000 |
	| `ENABLE_CACHE` | Enable caching | true |
	| `CACHE_TTL` | Cache TTL in seconds | 300 |
	| `RATE_LIMIT_WINDOW` | Rate limit window (ms) | 60000 |
	| `RATE_LIMIT_MAX` | Max requests per window | 100 |
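The table above maps naturally onto a typed configuration object. This is an illustrative sketch only: `ProxyConfig`, `intFromEnv`, and `loadConfig` are hypothetical names, and the rule for parsing `ENABLE_CACHE` is an assumption. The variable names and defaults are taken from the table.

```typescript
interface ProxyConfig {
  port: number;
  backendUrl: string;
  corsOrigin: string;
  enableCache: boolean;
  cacheTtlSeconds: number;
  rateLimitWindowMs: number;
  rateLimitMax: number;
}

type Env = Record<string, string | undefined>;

// Parse an integer variable, falling back to the default when unset or invalid.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const parsed = Number.parseInt(env[name] ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Build the configuration from an environment map, applying the documented defaults.
function loadConfig(env: Env): ProxyConfig {
  return {
    port: intFromEnv(env, "PORT", 4000),
    backendUrl: env.BACKEND_URL ?? "http://localhost:7860",
    corsOrigin: env.CORS_ORIGIN ?? "http://localhost:3000",
    // Assumption: any value other than the literal "false" leaves caching on.
    enableCache: (env.ENABLE_CACHE ?? "true").toLowerCase() !== "false",
    cacheTtlSeconds: intFromEnv(env, "CACHE_TTL", 300),
    rateLimitWindowMs: intFromEnv(env, "RATE_LIMIT_WINDOW", 60000),
    rateLimitMax: intFromEnv(env, "RATE_LIMIT_MAX", 100),
  };
}
```

On Node.js this would be called once at startup as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.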

	## Integration with Next.js

	Update your Next.js app to use the proxy:

	```typescript
	// .env.local (environment file, not TypeScript):
	//   NEXT_PUBLIC_API_URL=http://localhost:4000/api

	// API calls
	const response = await fetch(`${process.env.NEXT_PUBLIC_API_URL}/search`, {
	  method: 'POST',
	  headers: { 'Content-Type': 'application/json' },
	  body: JSON.stringify({ query, limit })
	});
	```

	## Caching Strategy

	- Search Results: Cached for 5 minutes by default (configurable via `CACHE_TTL`, in seconds)
	- Images: Cached for 24 hours
	- Cache Keys: Based on query parameters
	- Cache Headers: `X-Cache: HIT/MISS`
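The strategy above can be illustrated with a small in-memory cache whose entries expire after a fixed time. This is a sketch under assumptions, not the proxy's actual implementation: `TtlCache`, `searchCacheKey`, and `cached` are hypothetical names, and the clock is injectable only to make expiry testable.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value, or undefined if it is missing or expired.
  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Build a cache key from the parameters that affect the result.
function searchCacheKey(query: string, limit: number, ranking: string): string {
  return JSON.stringify([query, limit, ranking]);
}

// Look up a value, computing and storing it on a miss; reports the X-Cache status.
function cached<T>(
  cache: TtlCache<T>,
  key: string,
  compute: () => T,
): { value: T; xCache: "HIT" | "MISS" } {
  const hit = cache.get(key);
  if (hit !== undefined) return { value: hit, xCache: "HIT" };
  const value = compute();
  cache.set(key, value);
  return { value, xCache: "MISS" };
}
```

The `xCache` field corresponds to the `X-Cache` response header. Note that a plain `Map` grows without bound, so a production cache would also need a size limit or periodic cleanup.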

	## Rate Limiting

	- Default: 100 requests per minute per IP (`RATE_LIMIT_MAX` requests per `RATE_LIMIT_WINDOW` milliseconds)
	- Headers included:
	  - `X-RateLimit-Limit`
	  - `X-RateLimit-Remaining`
	  - `X-RateLimit-Reset`
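One common way to produce these headers is a fixed-window counter keyed by client IP. The sketch below assumes that algorithm; the README does not say which one the proxy uses. `FixedWindowRateLimiter` is a hypothetical name, and reporting the reset time as a Unix timestamp in seconds is also an assumption.

```typescript
interface RateLimitResult {
  allowed: boolean;
  headers: Record<string, string>;
}

class FixedWindowRateLimiter {
  private windows = new Map<string, { count: number; resetAt: number }>();
  private max: number;
  private windowMs: number;
  private now: () => number;

  constructor(max = 100, windowMs = 60000, now: () => number = Date.now) {
    this.max = max;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Count one request for the client and report whether it is within the limit.
  check(clientKey: string): RateLimitResult {
    const t = this.now();
    let win = this.windows.get(clientKey);
    // Start a fresh window on the first request or once the old one has elapsed.
    if (win === undefined || t >= win.resetAt) {
      win = { count: 0, resetAt: t + this.windowMs };
      this.windows.set(clientKey, win);
    }
    win.count += 1;
    return {
      allowed: win.count <= this.max,
      headers: {
        "X-RateLimit-Limit": String(this.max),
        "X-RateLimit-Remaining": String(Math.max(0, this.max - win.count)),
        // Assumption: reset is reported as a Unix timestamp in seconds.
        "X-RateLimit-Reset": String(Math.ceil(win.resetAt / 1000)),
      },
    };
  }
}
```

A handler would attach `headers` to every response and answer `429 Too Many Requests` when `allowed` is false.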

	## Monitoring

	The proxy includes:
	- Request logging with correlation IDs
	- Performance timing
	- Error tracking
	- Health endpoints for monitoring

	## Deployment Options

	### Railway/Fly.io
	```toml
	# fly.toml
	app = "colpali-proxy"
	primary_region = "ord"

	[http_service]
	internal_port = 4000
	force_https = true
	auto_stop_machines = true
	auto_start_machines = true
	```

	### Kubernetes
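	```yaml
	apiVersion: apps/v1
	kind: Deployment
	metadata:
	  name: colpali-proxy
	spec:
	  replicas: 3
	  # apps/v1 requires a selector that matches the pod template labels.
	  # The label value below is an illustrative choice.
	  selector:
	    matchLabels:
	      app: colpali-proxy
	  template:
	    metadata:
	      labels:
	        app: colpali-proxy
	    spec:
	      containers:
	        - name: proxy
	          image: colpali-proxy:latest
	          ports:
	            - containerPort: 4000
	          livenessProbe:
	            httpGet:
	              path: /health/live
	              port: 4000
	          readinessProbe:
	            httpGet:
	              path: /health/ready
	              port: 4000
	```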

	## Performance

	- Built on Hono, a lightweight web framework with low routing overhead
	- Efficient streaming for SSE
	- Connection pooling for backend requests
	- In-memory caching reduces backend load
	- Brotli/gzip compression enabled

	## Security

	- Rate limiting prevents abuse
	- Secure headers enabled
	- CORS restricted to the configured origin (`CORS_ORIGIN`)
	- Request ID tracking
	- No sensitive data logging