# ColPali Hono Proxy Server

A high-performance proxy server built with Hono that sits between your Next.js frontend and the ColPali/Vespa backend. It handles caching, rate limiting, and CORS, and exposes a clean API interface.

## Features

- **Image Retrieval**: Serves base64 images from Vespa as actual image files with proper caching
- **Search Proxy**: Forwards search requests with result caching
- **Chat SSE Proxy**: Handles Server-Sent Events for streaming chat responses
- **Rate Limiting**: Protects backend from overload
- **Caching**: In-memory cache for search results and images
- **Health Checks**: Kubernetes-ready health endpoints
- **CORS Handling**: Configurable CORS for frontend integration
- **Request Logging**: Detailed request/response logging with request IDs

## Architecture

```
Next.js App (3000) → Hono Proxy (4000) → ColPali Backend (7860)
                                                ↓
                                           Vespa Cloud
```

## API Endpoints

### Search

- `POST /api/search` - Search documents

```json
{
  "query": "annual report 2023",
  "limit": 10,
  "ranking": "hybrid"
}
```
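
A proxy like this would typically validate the request body before forwarding it. The sketch below shows one way to do that in plain TypeScript. The field names come from the example body above; the `parseSearchRequest` helper, the defaults (`limit` 10, `ranking` `"hybrid"`), and the 1 to 100 bound on `limit` are assumptions for illustration, not the proxy's documented behavior.

```typescript
// Hypothetical request shape inferred from the example body above.
interface SearchRequest {
  query: string;
  limit: number;
  ranking: string;
}

type ParseResult =
  | { ok: true; value: SearchRequest }
  | { ok: false; error: string };

// Validates an already JSON-parsed body. Defaults and bounds are assumptions.
function parseSearchRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const raw = body as Record<string, unknown>;

  const query = typeof raw.query === "string" ? raw.query.trim() : "";
  if (query.length === 0) {
    return { ok: false, error: "query must be a non-empty string" };
  }

  const limit = raw.limit === undefined ? 10 : raw.limit;
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1 || limit > 100) {
    return { ok: false, error: "limit must be an integer between 1 and 100" };
  }

  const ranking = raw.ranking === undefined ? "hybrid" : raw.ranking;
  if (typeof ranking !== "string") {
    return { ok: false, error: "ranking must be a string" };
  }

  return { ok: true, value: { query, limit, ranking } };
}
```

Returning a tagged result instead of throwing lets a route handler map `ok: false` directly to a 400 response without a try/catch.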

### Image Retrieval

- `GET /api/search/image/:docId/thumbnail` - Get thumbnail image
- `GET /api/search/image/:docId/full` - Get full-size image
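
Serving a base64 payload as an image file means choosing a `Content-Type`. How the proxy does this is not stated here; the sketch below is one possible approach that sniffs the type from the first characters of the base64 string, which are fixed for a given file signature. `sniffImageMime` and `imageHeaders` are hypothetical helpers, and the `max-age=86400` value simply mirrors the 24-hour image cache described under Caching Strategy.

```typescript
// Base64 prefixes produced by well-known image file signatures.
const SIGNATURES: Array<[string, string]> = [
  ["iVBORw0KGgo", "image/png"], // bytes 89 50 4E 47 0D 0A 1A 0A
  ["/9j/", "image/jpeg"],       // bytes FF D8 FF
  ["R0lGOD", "image/gif"],      // ASCII "GIF8"
];

// Returns a MIME type for a base64 image, accepting an optional data-URL prefix.
function sniffImageMime(base64: string): string {
  const payload = base64.replace(/^data:[^;,]+;base64,/, "");
  for (const [prefix, mime] of SIGNATURES) {
    if (payload.startsWith(prefix)) return mime;
  }
  return "application/octet-stream"; // unknown format: let the client decide
}

// Response headers for an image route (assumed values, see lead-in).
function imageHeaders(base64: string): Record<string, string> {
  return {
    "Content-Type": sniffImageMime(base64),
    "Cache-Control": "public, max-age=86400",
  };
}
```

Sniffing the prefix avoids decoding the whole payload just to pick a header; the decoded bytes are only needed for the response body.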

### Chat

- `POST /api/chat` - Stream chat responses (SSE)

```json
{
  "messages": [{"role": "user", "content": "Tell me about..."}],
  "context": []
}
```
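
On the consuming side, an SSE stream arrives as text chunks that can split an event anywhere, so a client has to buffer until it sees the blank line that ends an event. The sketch below is a minimal incremental parser for the standard `event:` and `data:` fields. `SseParser` is a hypothetical name, and the event names and payload format emitted by `/api/chat` are not specified in this README, so treat the shape of `SseEvent` as an assumption.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no event field is present
  data: string;  // multiple data lines are joined with "\n"
}

// Parses one complete event block (the text between two blank lines).
function parseBlock(block: string): SseEvent | null {
  let event = "message";
  const dataLines: string[] = [];
  for (const line of block.split("\n")) {
    if (line === "" || line.startsWith(":")) continue; // comment or keep-alive
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // one leading space is stripped
    if (field === "event") event = value;
    else if (field === "data") dataLines.push(value);
  }
  return dataLines.length > 0 ? { event, data: dataLines.join("\n") } : null;
}

class SseParser {
  private buffer = "";

  // Feed a decoded text chunk; returns every event completed by this chunk.
  // Handles LF and CRLF line endings (bare CR is not supported in this sketch).
  push(chunk: string): SseEvent[] {
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const parsed = parseBlock(this.buffer.slice(0, boundary));
      this.buffer = this.buffer.slice(boundary + 2);
      if (parsed !== null) events.push(parsed);
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

In a browser or Next.js client, each chunk read from the `fetch` response body would be decoded to text and passed to `push`, and the returned events appended to the chat UI.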

### Similarity Map

- `POST /api/search/similarity-map` - Generate similarity visualization

### Health

- `GET /health` - Detailed health status
- `GET /health/live` - Liveness probe
- `GET /health/ready` - Readiness probe

## Setup

### Development

1. Install dependencies:
```bash
npm install
```
2. Copy environment variables:
```bash
cp .env.example .env
```
3. Update `.env` with your configuration
4. Run in development mode:
```bash
npm run dev
```

### Production

1. Build:
```bash
npm run build
```
2. Run:
```bash
npm start
```

### Docker

Build and run with Docker:

```bash
docker build -t colpali-hono-proxy .
docker run -p 4000:4000 --env-file .env colpali-hono-proxy
```

Or use docker-compose:

```bash
docker-compose up
```

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `PORT` | Server port | 4000 |
| `BACKEND_URL` | ColPali backend URL | http://localhost:7860 |
| `CORS_ORIGIN` | Allowed CORS origin | http://localhost:3000 |
| `ENABLE_CACHE` | Enable caching | true |
| `CACHE_TTL` | Cache TTL in seconds | 300 |
| `RATE_LIMIT_WINDOW` | Rate limit window (ms) | 60000 |
| `RATE_LIMIT_MAX` | Max requests per window | 100 |
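
The table above can be read as a typed configuration object. The following is a sketch of how such a loader might look, using the variable names and defaults from the table. The `ProxyConfig` shape, the `loadConfig` and `intFromEnv` helpers, and the rule that only the string `false` disables caching are assumptions, not the proxy's actual code.

```typescript
interface ProxyConfig {
  port: number;
  backendUrl: string;
  corsOrigin: string;
  enableCache: boolean;
  cacheTtlSeconds: number;
  rateLimitWindowMs: number;
  rateLimitMax: number;
}

type Env = Record<string, string | undefined>;

// Reads a non-negative integer, falling back to the documented default.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

// Builds the config from an environment map; defaults match the table above.
function loadConfig(env: Env): ProxyConfig {
  return {
    port: intFromEnv(env, "PORT", 4000),
    backendUrl: env.BACKEND_URL ?? "http://localhost:7860",
    corsOrigin: env.CORS_ORIGIN ?? "http://localhost:3000",
    // Assumption: anything other than "false" leaves caching enabled.
    enableCache: (env.ENABLE_CACHE ?? "true").toLowerCase() !== "false",
    cacheTtlSeconds: intFromEnv(env, "CACHE_TTL", 300),
    rateLimitWindowMs: intFromEnv(env, "RATE_LIMIT_WINDOW", 60000),
    rateLimitMax: intFromEnv(env, "RATE_LIMIT_MAX", 100),
  };
}
```

In a Node.js process this would be called once at startup as `loadConfig(process.env)`, so a malformed value fails fast instead of surfacing as a confusing runtime error later.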

## Integration with Next.js

Update your Next.js app to use the proxy:

```typescript
// .env.local (environment file, not TypeScript):
// NEXT_PUBLIC_API_URL=http://localhost:4000/api

// API calls
const response = await fetch(`${process.env.NEXT_PUBLIC_API_URL}/search`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query, limit })
});
```

## Caching Strategy

- **Search Results**: Cached for 5 minutes (configurable via `CACHE_TTL`)
- **Images**: Cached for 24 hours
- **Cache Keys**: Based on query parameters
- **Cache Headers**: Responses carry `X-Cache: HIT` or `X-Cache: MISS`
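
The strategy above can be sketched as a small in-memory TTL cache plus a key function that normalizes query parameters. This is an illustration under assumptions: `TtlCache`, `cacheKey`, and `cachedLookup` are hypothetical names, the key format is invented, and the lookup is synchronous for brevity where a real proxy would await the backend.

```typescript
interface Entry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
  private store = new Map<string, Entry<V>>();

  // `now` is injectable so expiry can be tested without waiting.
  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Sorts parameter names so logically identical requests share one key.
function cacheKey(route: string, params: Record<string, string | number>): string {
  const parts = Object.keys(params)
    .sort()
    .map((k) => `${k}=${encodeURIComponent(String(params[k]))}`);
  return `${route}?${parts.join("&")}`;
}

// Returns the value together with the X-Cache header value to send.
function cachedLookup<V>(
  cache: TtlCache<V>,
  key: string,
  compute: () => V
): { value: V; xCache: "HIT" | "MISS" } {
  const hit = cache.get(key);
  if (hit !== undefined) return { value: hit, xCache: "HIT" };
  const value = compute();
  cache.set(key, value);
  return { value, xCache: "MISS" };
}
```

With the defaults from this README, search results would use `new TtlCache(300 * 1000)` and images `new TtlCache(24 * 60 * 60 * 1000)`. Note that this sketch evicts only on read, so a long-running process would also want a size cap or a periodic sweep.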

## Rate Limiting

- Default: 100 requests per minute per IP (`RATE_LIMIT_MAX` requests per `RATE_LIMIT_WINDOW`)
- Headers included:
  - `X-RateLimit-Limit`
  - `X-RateLimit-Remaining`
  - `X-RateLimit-Reset`
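
To make the headers concrete, here is a minimal fixed-window limiter keyed by client IP. The README does not say which algorithm the proxy uses, so the fixed-window approach, the `FixedWindowLimiter` name, and the choice to report `X-RateLimit-Reset` as epoch seconds are all assumptions.

```typescript
interface RateLimitDecision {
  allowed: boolean;
  headers: Record<string, string>;
}

class FixedWindowLimiter {
  private windows = new Map<string, { count: number; resetAt: number }>();

  // Defaults mirror RATE_LIMIT_MAX and RATE_LIMIT_WINDOW; `now` is injectable for tests.
  constructor(
    private max = 100,
    private windowMs = 60000,
    private now: () => number = Date.now
  ) {}

  check(clientIp: string): RateLimitDecision {
    const t = this.now();
    let w = this.windows.get(clientIp);
    if (w === undefined || w.resetAt <= t) {
      // Start a fresh window on the first request or after the old one expires.
      w = { count: 0, resetAt: t + this.windowMs };
      this.windows.set(clientIp, w);
    }
    w.count += 1;
    return {
      allowed: w.count <= this.max,
      headers: {
        "X-RateLimit-Limit": String(this.max),
        "X-RateLimit-Remaining": String(Math.max(0, this.max - w.count)),
        // Assumption: reset time is expressed as epoch seconds.
        "X-RateLimit-Reset": String(Math.ceil(w.resetAt / 1000)),
      },
    };
  }
}
```

A request for which `allowed` is `false` would normally receive a 429 response with the same headers attached. Because the counters live in process memory, each replica enforces its own limit, and entries for idle IPs are never removed in this sketch.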

## Monitoring

The proxy includes:

- Request logging with correlation IDs
- Performance timing
- Error tracking
- Health endpoints for monitoring

## Deployment Options

### Railway/Fly.io

```toml
# fly.toml
app = "colpali-proxy"
primary_region = "ord"

[http_service]
  internal_port = 4000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
```

### Kubernetes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: colpali-proxy
spec:
  replicas: 3
  selector:                 # required by apps/v1; must match the template labels
    matchLabels:
      app: colpali-proxy    # illustrative label, adjust to your conventions
  template:
    metadata:
      labels:
        app: colpali-proxy
    spec:
      containers:
        - name: proxy
          image: colpali-proxy:latest
          ports:
            - containerPort: 4000
          livenessProbe:
            httpGet:
              path: /health/live
              port: 4000
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 4000
```

## Performance

- Built with Hono for maximum performance
- Efficient streaming for SSE
- Connection pooling for backend requests
- In-memory caching reduces backend load
- Brotli/gzip compression enabled

## Security

- Rate limiting prevents abuse
- Secure headers enabled
- CORS properly configured
- Request ID tracking
- No sensitive data logging