QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • Running • Featured • 70 • Who needs 1T parameters? Olympiad proofs with a 4B model
KV Cache Recycling to Expand Usable Context Capacity in Low Parameter LLMs Paper • 2512.11851 • Published Dec 4, 2025
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated 4 days ago • 565k • 2.54k