QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 176
StreamingVLM: Real-Time Understanding for Infinite Video Streams Paper • 2510.09608 • Published Oct 10, 2025 • 50
Round Attention: A Novel Round-Level Attention Mechanism to Accelerate LLM Inference Paper • 2502.15294 • Published Feb 21, 2025 • 1
From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval Paper • 2505.23059 • Published May 29, 2025 • 13
Multi Agent based Medical Assistant for Edge Devices Paper • 2503.05397 • Published Mar 7, 2025 • 8
kuleshov-group/bd3lm-owt-block_size16 Text Generation • 0.2B • Updated Apr 13, 2025 • 1.22k • 16