# CDN Cache Optimizer – OpenEnv RL Environment

An RL environment simulating edge CDN cache management – the exact problem companies like Meta solve at planetary scale. An agent manages a cache of limited size, deciding which files to evict when new content arrives, balancing hit rate, bandwidth efficiency, and thrash avoidance.
## Motivation

Content Delivery Networks serve billions of files daily. Edge servers have limited storage, so they must constantly decide which cached files to keep and which to evict. Standard algorithms like LRU aren't optimal, especially when traffic has viral bursts (a file suddenly gets 50x more requests for 20 minutes, then drops back toward baseline).
A smarter agent can:
- Predict viral spikes from queue previews
- Avoid evicting high-frequency files
- Prevent cache thrashing (evicting then immediately re-requesting)
- Maximize bandwidth saved for users
## Environment Description

At each step, a file is requested from the network. If it's already in the cache, that's a cache hit (reward). If not, it's a cache miss, and the agent must decide whether to evict an existing file to make room.
### Traffic Model

- Steady files: consistent, cyclical demand
- Viral files: bell-curve spike in popularity, then fade back to baseline
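The viral pattern described above can be sketched as a Gaussian bump over a steady baseline. This is an illustrative model only; the parameters below (`peak_step`, `width`, `boost`) are made-up names and values, not the environment's actual constants from `env/cache.py`.

```python
import math

def viral_popularity(step: int, peak_step: int, width: float = 10.0,
                     baseline: float = 1.0, boost: float = 50.0) -> float:
    """Request weight for a viral file: a Gaussian spike over baseline demand.

    Far from the peak the weight decays back to `baseline`; at the peak the
    file is roughly (baseline + boost)x as hot as a steady file.
    """
    spike = boost * math.exp(-((step - peak_step) ** 2) / (2 * width ** 2))
    return baseline + spike
```

A steady file, by contrast, would keep a roughly constant weight with a mild daily cycle.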
## Action & Observation Space

### Observation Space

| Field | Type | Description |
|---|---|---|
| `step` | int | Current episode step |
| `cache_used_mb` | float | MB currently used |
| `cache_capacity_mb` | float | Total cache size |
| `cache_fill_ratio` | float | 0.0–1.0 fill level |
| `cached_files` | List[FileEntry] | All files in cache with metadata |
| `incoming_file_id` | str | File being requested |
| `incoming_file_size_mb` | float | Size of incoming file |
| `incoming_file_is_viral` | bool | Is this file currently viral? |
| `cache_hit` | bool | Is incoming file already cached? |
| `recent_hit_rate` | float | Rolling hit rate (last 20 steps) |
| `time_of_day` | float | Normalized 0.0–1.0 daily cycle |
| `queue_preview` | List[str] | Next 3 file IDs (prefetch hint) |
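For concreteness, an observation might look like the dictionary below. All values are made up; only the field names and types follow the table above.

```python
# Illustrative observation payload (values are invented for this example).
observation = {
    "step": 42,
    "cache_used_mb": 72.5,
    "cache_capacity_mb": 100.0,
    "cache_fill_ratio": 0.725,
    "cached_files": [
        {"file_id": "file_003", "size_mb": 12.0, "request_frequency": 7.0,
         "is_viral": False, "last_accessed": 39},
    ],
    "incoming_file_id": "file_017",
    "incoming_file_size_mb": 8.5,
    "incoming_file_is_viral": True,
    "cache_hit": False,
    "recent_hit_rate": 0.65,
    "time_of_day": 0.42,
    "queue_preview": ["file_017", "file_004", "file_009"],
}
```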
### FileEntry Fields

| Field | Type | Description |
|---|---|---|
| `file_id` | str | Unique identifier |
| `size_mb` | float | File size in MB |
| `request_frequency` | float | Requests since cached |
| `is_viral` | bool | Currently viral |
| `last_accessed` | int | Step number of last access |
### Action Space

| Field | Type | Description |
|---|---|---|
| `evict_file_id` | str \| null | File to evict (null = no eviction) |
### Reward Function

| Component | Range | Description |
|---|---|---|
| `cache_hit_bonus` | +1.0 to +1.5 | Hit reward (viral hits = +1.5) |
| `bandwidth_saved` | +0.0 to +0.2 | Reward for bandwidth efficiency |
| `eviction_penalty` | -0.0 to -0.5 | Penalty for evicting popular files |
| `thrash_penalty` | 0.0 or -0.5 | Penalty for evicting the same file twice |
| `wasted_capacity_penalty` | -0.0 to -0.3 | Penalty for leaving cache capacity unused |
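These components can be combined roughly as follows. The coefficients and thresholds below are illustrative guesses for the stated ranges; the authoritative reward lives in `env/cache.py` and may differ.

```python
from typing import Optional

def compute_reward(hit: bool, incoming_is_viral: bool, bytes_saved_mb: float,
                   evicted_freq: Optional[float], rethrash: bool,
                   fill_ratio: float) -> float:
    """Illustrative per-step reward combining the components in the table."""
    reward = 0.0
    if hit:
        reward += 1.5 if incoming_is_viral else 1.0       # cache_hit_bonus
        reward += min(0.2, bytes_saved_mb / 500.0)        # bandwidth_saved
    if evicted_freq is not None:
        reward -= min(0.5, 0.05 * evicted_freq)           # eviction_penalty
    if rethrash:
        reward -= 0.5                                     # thrash_penalty
    if fill_ratio < 0.5:                                  # wasted_capacity_penalty
        reward -= 0.3 * (0.5 - fill_ratio) / 0.5
    return reward
```

The shape matters more than the exact numbers: hits (especially viral ones) dominate, while evicting popular files and thrashing carve the reward back down.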
## Tasks

### Task 1: Steady Traffic Cache (Easy)

- Cache: 100MB | Files: 30 | Steps: 100
- No viral files – steady demand only
- Agent learns basic LRU-style eviction
- Target hit rate: ≥ 0.60 → score 1.0
- Baseline score: ~0.75
### Task 2: Mixed Traffic Cache (Medium)
- Cache: 80MB | Files: 50 | Steps: 150
- 20% viral files mixed with steady demand
- Agent must handle spikes and prioritize popular content
- Score: 70% hit rate + 30% bandwidth
- Baseline score: ~0.60
### Task 3: Constrained Cache with Viral Bursts (Hard)
- Cache: 50MB | Files: 80 | Steps: 200
- 35% viral files, tight capacity, large file sizes
- Agent must predict spikes, avoid thrashing
- Score: 50% hit rate + 25% bandwidth + 25% reward quality
- Baseline score: ~0.45
## Code Repository

Full source: https://github.com/umar-sharif821/cdn-cache-env

### Files Included

- `env/cache.py` – DriftCDNEnv environment implementation
- `server/app.py` – OpenEnv FastAPI server
- `training/train.py` – Fine-tuning script
- `training_results_finetuned.png` – Training results chart
- `baseline_drift.png` – Baseline comparison chart
## Setup & Usage

### Local Setup

```bash
git clone <repo>
cd cdn-cache-env
pip install -r requirements.txt
```

### Run API Server

```bash
uvicorn api.main:app --host 0.0.0.0 --port 7860
```

### Run Inference (Baseline Agent)

```bash
export API_BASE_URL="https://api.openai.com/v1"
export MODEL_NAME="gpt-4o-mini"
export HF_TOKEN="your_token_here"
python inference.py
```

### Docker

```bash
docker build -t cdn-cache-env .
docker run -p 7860:7860 \
  -e API_BASE_URL="https://api.openai.com/v1" \
  -e MODEL_NAME="gpt-4o-mini" \
  -e HF_TOKEN="your_token" \
  cdn-cache-env
```
## API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Health check (returns 200) |
| GET | `/tasks` | List all tasks |
| POST | `/reset` | Start an episode: `{"task_id": "task_easy", "seed": 42}` |
| POST | `/step` | Take an action: `{"evict_file_id": "file_001"}` or `{"evict_file_id": null}` |
| GET | `/state` | Full environment state |
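A minimal Python client for these endpoints, using only the standard library. The base URL assumes the local server started in the setup section; the request payloads follow the table above, and the response fields follow the observation schema.

```python
import json
from urllib import request

BASE = "http://localhost:7860"  # local server from "Run API Server"

def call(endpoint: str, payload: dict) -> dict:
    """POST a JSON payload to the environment server and decode the reply."""
    req = request.Request(
        f"{BASE}{endpoint}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    obs = call("/reset", {"task_id": "task_easy", "seed": 42})
    result = call("/step", {"evict_file_id": None})  # no eviction this step
    print(result)
```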
## Baseline Scores

Using the built-in `smart_policy` (non-LLM baseline):

| Task | Hit Rate | Score |
|---|---|---|
| Easy | ~0.72 | ~1.00 |
| Medium | ~0.61 | ~0.82 |
| Hard | ~0.48 | ~0.78 |
| **Overall** | | ~0.87 |
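The repo's `smart_policy` is not reproduced here, but its spirit can be sketched as a simple heuristic: never evict on a hit or when the incoming file fits, otherwise drop the least-recently-used non-viral file, falling back to plain LRU if everything cached is viral. This is an illustrative reimplementation, not the actual baseline code.

```python
from typing import Optional

def smart_evict(obs: dict) -> Optional[str]:
    """LRU-with-viral-protection heuristic over the observation schema above."""
    if obs["cache_hit"]:
        return None  # nothing to do on a hit
    free = obs["cache_capacity_mb"] - obs["cache_used_mb"]
    if obs["incoming_file_size_mb"] <= free:
        return None  # incoming file fits without eviction
    files = obs["cached_files"]
    # Prefer evicting non-viral files; fall back to all files if none exist.
    candidates = [f for f in files if not f["is_viral"]] or files
    victim = min(candidates, key=lambda f: f["last_accessed"])
    return victim["file_id"]
```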
## Log Format

`inference.py` emits structured JSON logs, one object per line:

```
{"type": "START", "task_id": "task_easy", ...}
{"type": "STEP", "step": 0, "action": {...}, "reward": 1.0, ...}
{"type": "END", "total_reward": 87.3, "final_hit_rate": 0.72, "score": 1.0}
```