Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs Paper • 2403.00858 • Published Feb 29, 2024
Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding Paper • 2503.10135 • Published Mar 13, 2025
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding Paper • 2602.23881 • Published Feb 27