Block: Balancing Load in LLM Serving with Context, Knowledge and Predictive Scheduling Paper • 2508.03611 • Published Aug 5, 2025
Aragog: Just-in-Time Model Routing for Scalable Serving of Agentic Workflows Paper • 2511.20975 • Published Nov 26, 2025