SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting Paper • 2604.10688 • Published 7 days ago • 25 • 3
TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas Paper • 2603.16448 • Published Mar 17 • 58
Unlocking On-Policy Distillation for Any Model Family • Running • 96 • Visualize on-policy distillation for any model family
When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning Paper • 2505.15400 • Published May 21, 2025 • 23