Einstein Fields: A Neural Perspective To Computational General Relativity Paper • 2507.11589 • Published 24 days ago • 7
A Systematic Analysis of Hybrid Linear Attention Paper • 2507.06457 • Published about 1 month ago • 22
view article Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • about 1 month ago • 639
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published May 30 • 267
Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published Feb 10 • 132
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 Paper • 2502.03544 • Published Feb 5 • 44
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 283
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published Dec 9, 2024 • 90
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 49
Adaptive Decoding via Latent Preference Optimization Paper • 2411.09661 • Published Nov 14, 2024 • 10
view article Article Fine-tuning LLMs with Singular Value Decomposition By fractalego • Jun 2, 2024 • 12
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse Paper • 2410.21333 • Published Oct 27, 2024 • 12
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation Paper • 2410.18565 • Published Oct 24, 2024 • 47
Law of the Weakest Link: Cross Capabilities of Large Language Models Paper • 2409.19951 • Published Sep 30, 2024 • 55