# 🌐 CDN Cache Optimizer: An OpenEnv RL Environment

An RL environment simulating edge CDN cache management, the kind of problem companies like Meta solve at planetary scale. The agent manages a cache of limited size, deciding which files to evict when new content arrives while balancing hit rate, bandwidth efficiency, and thrash avoidance.


## 🎯 Motivation

Content Delivery Networks serve billions of files daily. Edge servers have limited storage, so they must constantly decide which cached files to keep and which to evict. Standard algorithms like LRU aren't optimal, especially when traffic has viral bursts (a file suddenly gets 50x more requests for 20 minutes, then drops back to zero).

A smarter agent can:

- Predict viral spikes from queue previews
- Avoid evicting high-frequency files
- Prevent cache thrashing (evicting a file, then immediately re-requesting it)
- Maximize bandwidth saved for users

## 🔧 Environment Description

At each step, a file is requested from the network. If it's already in the cache → cache hit (reward). If not → cache miss, and the agent must decide whether to evict an existing file to make room.

### Traffic Model

- **Steady files:** consistent, cyclical demand
- **Viral files:** a bell-curve spike in popularity, then a fade back to baseline
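
As a concrete illustration of the viral model, request probability can be weighted with a Gaussian bump around a spike step. This is a sketch of the idea only; the spike center, width, and multiplier below are assumptions, not the environment's actual parameters:

```python
import math

def request_weight(step: int, base_weight: float, is_viral: bool,
                   spike_center: int = 75, spike_width: float = 10.0,
                   spike_multiplier: float = 50.0) -> float:
    """Relative probability weight of a file being requested at `step`.

    Steady files keep their base weight; viral files get a bell-curve
    boost around `spike_center` that fades back to baseline.
    """
    if not is_viral:
        return base_weight
    bump = math.exp(-((step - spike_center) ** 2) / (2 * spike_width ** 2))
    return base_weight * (1.0 + (spike_multiplier - 1.0) * bump)
```

At the spike center a viral file is requested 50x more than its baseline; far from the center its weight returns to the steady value.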

πŸ“ Action & Observation Space

### Observation Space

| Field | Type | Description |
|---|---|---|
| `step` | int | Current episode step |
| `cache_used_mb` | float | MB currently used |
| `cache_capacity_mb` | float | Total cache size |
| `cache_fill_ratio` | float | 0.0–1.0 fill level |
| `cached_files` | List[FileEntry] | All files in cache, with metadata |
| `incoming_file_id` | str | File being requested |
| `incoming_file_size_mb` | float | Size of incoming file |
| `incoming_file_is_viral` | bool | Is this file currently viral? |
| `cache_hit` | bool | Is incoming file already cached? |
| `recent_hit_rate` | float | Rolling hit rate (last 20 steps) |
| `time_of_day` | float | Normalized 0.0–1.0 daily cycle |
| `queue_preview` | List[str] | Next 3 file IDs (prefetch hint) |

### FileEntry Fields

| Field | Type | Description |
|---|---|---|
| `file_id` | str | Unique identifier |
| `size_mb` | float | File size in MB |
| `request_frequency` | float | Requests since the file was cached |
| `is_viral` | bool | Currently viral |
| `last_accessed` | int | Step number of the last access |

### Action Space

| Field | Type | Description |
|---|---|---|
| `evict_file_id` | str \| null | File to evict (null = no eviction) |
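
For concreteness, an action is a single-field JSON object; both legal forms follow directly from the table above:

```python
import json

# Evict a specific file to make room for the incoming request.
evict_action = {"evict_file_id": "file_001"}

# Decline to evict (e.g., on a cache hit or while space remains).
noop_action = {"evict_file_id": None}

print(json.dumps(noop_action))  # → {"evict_file_id": null}
```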

### Reward Function

| Component | Range | Description |
|---|---|---|
| `cache_hit_bonus` | +1.0 to +1.5 | Hit reward (viral hits = +1.5) |
| `bandwidth_saved` | +0.0 to +0.2 | Reward for bandwidth efficiency |
| `eviction_penalty` | -0.0 to -0.5 | Penalty for evicting popular files |
| `thrash_penalty` | 0.0 or -0.5 | Penalty for evicting the same file twice |
| `wasted_capacity_penalty` | -0.0 to -0.3 | Penalty for leaving cache capacity unused |
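
Combining the components, a per-step reward could be assembled as below. This is a hedged sketch of how the table's ranges fit together; the exact weighting in `env/cache.py` may differ:

```python
def step_reward(cache_hit: bool, hit_is_viral: bool, bandwidth_saved_frac: float,
                evicted_popularity: float, thrashed: bool,
                cache_fill_ratio: float) -> float:
    """Combine the reward components from the table above.

    All inputs are normalized to [0, 1]; the constants mirror the
    documented ranges but are assumptions, not the env's exact values.
    """
    reward = 0.0
    if cache_hit:
        reward += 1.5 if hit_is_viral else 1.0       # cache_hit_bonus
    reward += 0.2 * bandwidth_saved_frac             # bandwidth_saved
    reward -= 0.5 * evicted_popularity               # eviction_penalty
    if thrashed:
        reward -= 0.5                                # thrash_penalty
    reward -= 0.3 * (1.0 - cache_fill_ratio)         # wasted_capacity_penalty
    return reward
```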

## 📋 Tasks

### Task 1: Steady Traffic Cache (Easy)

- Cache: 100 MB | Files: 30 | Steps: 100
- No viral files; steady demand only
- Agent learns basic LRU-style eviction
- Target hit rate: ≥ 0.60 → score 1.0
- Baseline score: ~0.75

### Task 2: Mixed Traffic Cache (Medium)

- Cache: 80 MB | Files: 50 | Steps: 150
- 20% viral files mixed with steady demand
- Agent must handle spikes and prioritize popular content
- Score: 70% hit rate + 30% bandwidth
- Baseline score: ~0.60

### Task 3: Constrained Cache with Viral Bursts (Hard)

- Cache: 50 MB | Files: 80 | Steps: 200
- 35% viral files, tight capacity, large file sizes
- Agent must predict spikes and avoid thrashing
- Score: 50% hit rate + 25% bandwidth + 25% reward quality
- Baseline score: ~0.45
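
The Hard task's composite score reads as a weighted sum. A one-line sketch, assuming each component is already normalized to [0, 1]:

```python
def hard_task_score(hit_rate: float, bandwidth_frac: float,
                    reward_quality: float) -> float:
    """Composite score for Task 3: 50% hit rate + 25% bandwidth + 25% reward quality."""
    return 0.5 * hit_rate + 0.25 * bandwidth_frac + 0.25 * reward_quality
```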

## Code Repository

Full source: https://github.com/umar-sharif821/cdn-cache-env

### Files Included

- `env/cache.py` - `DriftCDNEnv` environment implementation
- `server/app.py` - OpenEnv FastAPI server
- `training/train.py` - Fine-tuning script
- `training_results_finetuned.png` - Training results chart
- `baseline_drift.png` - Baseline comparison chart

## 🚀 Setup & Usage

### Local Setup

```shell
git clone <repo>
cd cdn-cache-env
pip install -r requirements.txt
```

### Run API Server

```shell
uvicorn server.app:app --host 0.0.0.0 --port 7860
```

### Run Inference (Baseline Agent)

```shell
export API_BASE_URL="https://api.openai.com/v1"
export MODEL_NAME="gpt-4o-mini"
export HF_TOKEN="your_token_here"

python inference.py
```

### Docker

```shell
docker build -t cdn-cache-env .
docker run -p 7860:7860 \
  -e API_BASE_URL="https://api.openai.com/v1" \
  -e MODEL_NAME="gpt-4o-mini" \
  -e HF_TOKEN="your_token" \
  cdn-cache-env
```

## 🌐 API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Health check (returns 200) |
| GET | `/tasks` | List all tasks |
| POST | `/reset` | Start an episode: `{"task_id": "task_easy", "seed": 42}` |
| POST | `/step` | Take an action: `{"evict_file_id": "file_001"}` or `{"evict_file_id": null}` |
| GET | `/state` | Full environment state |

## 📊 Baseline Scores

Using the built-in `smart_policy` (non-LLM baseline):

| Task | Hit Rate | Score |
|---|---|---|
| Easy | ~0.72 | ~1.00 |
| Medium | ~0.61 | ~0.82 |
| Hard | ~0.48 | ~0.78 |
| **Overall** | | ~0.87 |

πŸ“ Log Format

`inference.py` emits structured JSON logs, one object per line:

```json
{"type": "START", "task_id": "task_easy", ...}
{"type": "STEP",  "step": 0, "action": {...}, "reward": 1.0, ...}
{"type": "END",   "total_reward": 87.3, "final_hit_rate": 0.72, "score": 1.0}
```
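
Each log line is a standalone JSON object (JSONL), so run analysis stays simple; a small sketch (the `summarize` helper is mine, not part of `inference.py`):

```python
import json

def summarize(log_lines: list[str]) -> dict:
    """Pull the episode summary out of a JSONL run log."""
    records = [json.loads(line) for line in log_lines]
    end = next(r for r in records if r["type"] == "END")
    steps = sum(1 for r in records if r["type"] == "STEP")
    return {"steps": steps, "score": end["score"],
            "hit_rate": end["final_hit_rate"]}
```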