Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better Paper • 2602.05393 • Published Feb 5 • 8
PISA: Piecewise Sparse Attention Is Wiser for Efficient Diffusion Transformers Paper • 2602.01077 • Published Feb 1 • 4
Article Backbone-Optimizer Coupling Bias: The Hidden Co-Design Principle Dec 20, 2025 • 4
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster Paper • 2503.09662 • Published Mar 12, 2025 • 33