Building a CDN Cache Optimizer with OpenEnv and RL
For this hackathon, I wanted to build something that felt close to a real infrastructure problem.
A lot of reinforcement learning demos are fun, but they often feel disconnected from systems that engineers actually run in production. I wanted my project to sit closer to that world: networking, latency, logs, cost, traffic spikes, and reliability.
So I built CDN Cache Optimizer.
It is an OpenEnv-compatible environment where an agent learns how to manage an edge CDN cache.
The project is live here:
- GitHub: https://github.com/umar-sharif821/cdn-cache-env-improvedone
- Hugging Face Space: https://huggingface.co/spaces/umar-sharif821/cdn-cache-env-improvedone
The Problem
A CDN edge server has limited storage.
Every time a file is requested, the system has to make a decision:
Should this object stay in cache, should we skip caching it, or should we evict something else to make room?
If the file is already cached, the user gets a fast edge response.
If the file is not cached, the request goes back to origin.
That is slower and more expensive.
At small scale, this looks simple.
At internet scale, it becomes a hard optimization problem.
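The per-request decision above can be sketched in a few lines. This is an illustrative toy, not the project's actual implementation; the class and field names are made up for this sketch:

```python
# Minimal sketch of the decision an edge cache faces on every request.
# Names (EdgeCache, capacity_bytes) are illustrative, not the project's API.

class EdgeCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.store = {}  # object_id -> size_bytes

    def request(self, object_id, size):
        if object_id in self.store:
            return "edge_hit"  # fast: served directly from the edge
        # Miss: the request goes back to origin (slower, more expensive),
        # and we must decide whether this object earns a cache slot.
        if self.used + size <= self.capacity:
            self.store[object_id] = size
            self.used += size
        return "origin_fetch"
```

The hard part is hidden in the miss branch: when the cache is full, something has to be evicted, and choosing what to evict is the whole optimization problem.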
Why This Matters
CDNs serve images, videos, scripts, documents, and application assets to users around the world.
A good cache policy can:
- reduce user latency
- reduce origin load
- save bandwidth
- improve reliability
- avoid unnecessary cache churn
A poor cache policy can do the opposite:
- evict useful files
- cache large files that are rarely requested
- miss viral traffic bursts
- keep sending users back to origin even when the edge cache could have served them
This is why cache optimization is an interesting RL problem.
Why Not Just Use LRU?
LRU is simple:
Evict the least recently used file.
That is a strong baseline, and it works well in many cases.
But it has blind spots.
For example:
- A file may be old but about to become popular again.
- A file may have been requested once recently but not be worth storing.
- A viral object may deserve protection even if it has not been in cache for long.
- A large file may consume too much cache space compared to its value.
That is where an agent can do better.
The agent does not just ask:
What was used least recently?
It can ask:
What is most valuable to keep right now?
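The recency-only rule is easy to see in code. Here is a tiny LRU cache keyed on object id, with sizes ignored for simplicity; this is a sketch of the baseline idea, and the project's actual baseline may differ in detail:

```python
from collections import OrderedDict

# Toy LRU cache: eviction looks ONLY at recency, never at value,
# size, or popularity trends. Illustrative, not the repo's code.

class LRUCache:
    def __init__(self, max_items):
        self.max_items = max_items
        self.items = OrderedDict()

    def request(self, object_id):
        if object_id in self.items:
            self.items.move_to_end(object_id)  # mark as most recently used
            return True   # hit
        self.items[object_id] = True
        if len(self.items) > self.max_items:
            self.items.popitem(last=False)     # evict least recently used
        return False      # miss
```

Everything the agent could exploit — object size, request frequency, burst patterns — is invisible to this policy.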
Project Goal
The goal of this project is not just to train a model.
The goal is to build a complete, benchmarkable environment around a realistic CDN caching problem.
That includes:
- an OpenEnv-style environment
- a baseline policy
- a fine-tuned agent policy
- a reward function grounded in latency and cost
- schema drift handling for CDN logs
- Colab reproducibility
- a live Hugging Face demo
- visual comparison between baseline and agent behavior
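To make the reward bullet concrete: a reward grounded in latency and cost could look like the sketch below. The constants and the function itself are assumptions for illustration, not the actual reward implemented in the repo:

```python
# Hedged sketch of a latency/cost reward for a cache decision.
# All constants are made-up placeholders, not the project's values.

EDGE_LATENCY_MS = 20       # assumed edge response time
ORIGIN_LATENCY_MS = 200    # assumed origin round trip
ORIGIN_COST_PER_MB = 0.01  # assumed bandwidth cost back to origin

def reward(hit, size_mb):
    latency_ms = EDGE_LATENCY_MS if hit else ORIGIN_LATENCY_MS
    cost = 0.0 if hit else size_mb * ORIGIN_COST_PER_MB
    # Lower latency and lower origin cost -> higher (less negative) reward.
    return -(latency_ms / 1000.0) - cost
```

A shape like this pushes the agent toward hits on exactly the objects where misses are most expensive: large, frequently requested files.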
Live Demo
The Hugging Face Space runs a Gradio app.
The UI lets the judge choose a CDN task and run a benchmark.
It compares:
- Baseline LRU
- Fine-tuned CDN agent
The output shows:
- total reward
- cache hit rate
- bandwidth saved
- hit-rate curve
- baseline-vs-agent comparison chart
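The cache hit rate reported above is just hits divided by total requests over a trace. The sketch below computes it with a deliberately dumb FIFO policy as a stand-in; the Space's benchmark uses its own traces and policies:

```python
# Toy hit-rate computation over a request trace.
# FIFO eviction keeps the sketch deterministic; it is NOT the
# baseline or agent policy the Space actually benchmarks.

def hit_rate(trace, cache_size):
    cached, hits = [], 0
    for obj in trace:
        if obj in cached:
            hits += 1
        else:
            cached.append(obj)
            if len(cached) > cache_size:
                cached.pop(0)  # evict oldest insertion (FIFO)
    return hits / len(trace)
```

Running both policies over the same trace and comparing their hit-rate curves is exactly the baseline-vs-agent comparison the demo charts.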
Space:
https://huggingface.co/spaces/umar-sharif821/cdn-cache-env-improvedone