Glyph: Scaling Context Windows via Visual-Text Compression Paper • 2510.17800 • Published Oct 20 • 67
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published Oct 20 • 119
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22 • 112
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published Oct 22 • 59
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts Paper • 2510.19363 • Published Oct 22 • 60
SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark Paper • 2402.05138 • Published Feb 6, 2024 • 2
MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training Paper • 2510.12831 • Published Oct 12 • 4
Trace Anything: Representing Any Video in 4D via Trajectory Fields Paper • 2510.13802 • Published Oct 15 • 30
FlashWorld: High-quality 3D Scene Generation within Seconds Paper • 2510.13678 • Published Oct 15 • 70
The Role of Computing Resources in Publishing Foundation Model Research Paper • 2510.13621 • Published Oct 15 • 16
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 174
LongCodeZip: Compress Long Context for Code Language Models Paper • 2510.00446 • Published Oct 1 • 108
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation Paper • 2509.15194 • Published Sep 18 • 33