Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving Paper • 2507.23726 • Published 9 days ago • 103
Running on Zero 1.07k InfiniteYou-FLUX 📸 Flexible Photo Recrafting While Preserving Your Identity
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published Jun 23 • 28
Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning Paper • 2506.03136 • Published Jun 3 • 24
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning Paper • 2504.13914 • Published Apr 10 • 4
Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric Paper • 2502.17184 • Published Feb 24 • 1
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models Paper • 2505.07591 • Published May 12 • 11
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models Paper • 2303.10420 • Published Mar 18, 2023 • 1
Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective Paper • 2502.17262 • Published Feb 24 • 22
MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion Paper • 2502.04235 • Published Feb 6 • 22
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published Jan 20 • 107
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use Paper • 2501.02506 • Published Jan 5 • 11