# ColPali Hono Proxy Server

A high-performance proxy server built with Hono that sits between your Next.js frontend and the ColPali/Vespa backend. This proxy handles caching, rate limiting, CORS, and provides a clean API interface.

## Features

- **Image Retrieval**: Serves base64 images from Vespa as actual image files with proper caching
- **Search Proxy**: Forwards search requests with result caching
- **Chat SSE Proxy**: Handles Server-Sent Events for streaming chat responses
- **Rate Limiting**: Protects backend from overload
- **Caching**: In-memory cache for search results and images
- **Health Checks**: Kubernetes-ready health endpoints
- **CORS Handling**: Configurable CORS for frontend integration
- **Request Logging**: Detailed request/response logging with request IDs

## Architecture

```
Next.js App (3000) → Hono Proxy (4000) → ColPali Backend (7860)
                                      ↘ Vespa Cloud
```

## API Endpoints

### Search

- `POST /api/search` - Search documents

```json
{
  "query": "annual report 2023",
  "limit": 10,
  "ranking": "hybrid"
}
```

### Image Retrieval

- `GET /api/search/image/:docId/thumbnail` - Get thumbnail image
- `GET /api/search/image/:docId/full` - Get full-size image

### Chat

- `POST /api/chat` - Stream chat responses (SSE)

```json
{
  "messages": [{"role": "user", "content": "Tell me about..."}],
  "context": []
}
```

### Similarity Map

- `POST /api/search/similarity-map` - Generate similarity visualization

### Health

- `GET /health` - Detailed health status
- `GET /health/live` - Liveness probe
- `GET /health/ready` - Readiness probe

## Setup

### Development

1. Install dependencies:
   ```bash
   npm install
   ```

2. Copy environment variables:
   ```bash
   cp .env.example .env
   ```

3. Update `.env` with your configuration

4. Run in development mode:
   ```bash
   npm run dev
   ```

### Production

1. Build:
   ```bash
   npm run build
   ```

2. Run:
   ```bash
   npm start
   ```

### Docker

Build and run with Docker:

```bash
docker build -t colpali-hono-proxy .
docker run -p 4000:4000 --env-file .env colpali-hono-proxy
```

Or use docker-compose:

```bash
docker-compose up
```

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `PORT` | Server port | 4000 |
| `BACKEND_URL` | ColPali backend URL | http://localhost:7860 |
| `CORS_ORIGIN` | Allowed CORS origin | http://localhost:3000 |
| `ENABLE_CACHE` | Enable caching | true |
| `CACHE_TTL` | Cache TTL in seconds | 300 |
| `RATE_LIMIT_WINDOW` | Rate limit window (ms) | 60000 |
| `RATE_LIMIT_MAX` | Max requests per window | 100 |

## Integration with Next.js

Update your Next.js app to use the proxy:

```typescript
// In .env.local, set:
//   NEXT_PUBLIC_API_URL=http://localhost:4000/api

// API calls
const response = await fetch(`${process.env.NEXT_PUBLIC_API_URL}/search`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query, limit })
});
```

## Caching Strategy

- **Search Results**: Cached for 5 minutes (configurable via `CACHE_TTL`)
- **Images**: Cached for 24 hours
- **Cache Keys**: Based on query parameters
- **Cache Headers**: `X-Cache: HIT/MISS`

## Rate Limiting

- Default: 100 requests per minute per IP
- Headers included:
  - `X-RateLimit-Limit`
  - `X-RateLimit-Remaining`
  - `X-RateLimit-Reset`

## Monitoring

The proxy includes:

- Request logging with correlation IDs
- Performance timing
- Error tracking
- Health endpoints for monitoring

## Deployment Options

### Railway/Fly.io

```toml
# fly.toml
app = "colpali-proxy"
primary_region = "ord"

[http_service]
  internal_port = 4000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
```

### Kubernetes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: colpali-proxy
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: proxy
        image: colpali-proxy:latest
        ports:
        - containerPort: 4000
        livenessProbe:
          httpGet:
            path: /health/live
            port: 4000
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 4000
```

## Performance

- Built with Hono for maximum performance
- Efficient streaming for SSE
- Connection pooling for backend requests
- In-memory caching reduces backend load
- Brotli/gzip compression enabled

## Security

- Rate limiting prevents abuse
- Secure headers enabled
- CORS properly configured
- Request ID tracking
- No sensitive data logging
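
To illustrate how per-IP rate limiting with the `X-RateLimit-*` headers could work, here is a minimal fixed-window sketch. It is not the proxy's actual implementation: the `FixedWindowLimiter` class and `RateLimitResult` type are hypothetical names, the fixed-window algorithm is an assumption (the proxy may use a different strategy), and reporting `X-RateLimit-Reset` as Unix epoch seconds is likewise assumed. The defaults mirror `RATE_LIMIT_WINDOW` and `RATE_LIMIT_MAX` from the table above.

```typescript
// Hypothetical sketch of a fixed-window, per-key rate limiter.
// Not taken from the proxy's source; names and header format are assumptions.

interface RateLimitResult {
  allowed: boolean;
  headers: Record<string, string>;
}

interface WindowEntry {
  count: number;   // requests seen in the current window
  resetAt: number; // timestamp (ms) at which the window ends
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowEntry>();
  private readonly windowMs: number;
  private readonly max: number;

  // Defaults mirror RATE_LIMIT_WINDOW (60000 ms) and RATE_LIMIT_MAX (100).
  constructor(windowMs: number = 60_000, max: number = 100) {
    this.windowMs = windowMs;
    this.max = max;
  }

  // `key` would typically be the client IP; `now` is injectable for testing.
  check(key: string, now: number = Date.now()): RateLimitResult {
    let entry = this.windows.get(key);

    // Start a fresh window if none exists or the previous one has expired.
    if (entry === undefined || now >= entry.resetAt) {
      entry = { count: 0, resetAt: now + this.windowMs };
      this.windows.set(key, entry);
    }

    entry.count += 1;
    const remaining = Math.max(0, this.max - entry.count);

    return {
      allowed: entry.count <= this.max,
      headers: {
        "X-RateLimit-Limit": String(this.max),
        "X-RateLimit-Remaining": String(remaining),
        // Assumption: reset time expressed as Unix epoch seconds.
        "X-RateLimit-Reset": String(Math.ceil(entry.resetAt / 1000)),
      },
    };
  }
}

// Usage: a limiter allowing 2 requests per second for a single client.
const demoLimiter = new FixedWindowLimiter(1_000, 2);
const demoFirst = demoLimiter.check("203.0.113.7");
const demoSecond = demoLimiter.check("203.0.113.7");
const demoThird = demoLimiter.check("203.0.113.7");
console.log(demoFirst.allowed, demoSecond.allowed, demoThird.allowed);
```

A handler would call `check` with the client address, copy the returned headers onto the response, and reply with HTTP 429 when `allowed` is `false`. A fixed window is the simplest option to reason about, though it permits short bursts at window boundaries; a sliding window or token bucket smooths those out at the cost of more state.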