Chevolier's Collections

LLM

updated 2 days ago

  • Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning

    Paper • 2510.03259 • Published 22 days ago • 54

  • Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

    Paper • 2510.07242 • Published 10 days ago • 30

  • First Try Matters: Revisiting the Role of Reflection in Reasoning Models

    Paper • 2510.08308 • Published 9 days ago • 24

  • Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

    Paper • 2510.03222 • Published 15 days ago • 43

  • Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States

    Paper • 2510.11052 • Published 5 days ago • 47

  • RLFR: Extending Reinforcement Learning for LLMs with Flow Environment

    Paper • 2510.10201 • Published 7 days ago • 35

  • Making Mathematical Reasoning Adaptive

    Paper • 2510.04617 • Published 12 days ago • 22

  • Demystifying Reinforcement Learning in Agentic Reasoning

    Paper • 2510.11701 • Published 5 days ago • 27

  • Are Large Reasoning Models Interruptible?

    Paper • 2510.11713 • Published 5 days ago • 1

  • QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

    Paper • 2510.11696 • Published 5 days ago • 154