dkkloimwieder's Collections
Paper

updated 2 days ago

  • THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning

    Paper • 2509.13761 • Published 29 days ago • 16

  • Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

    Paper • 2509.25849 • Published 16 days ago • 45

  • Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models

    Paper • 2510.03561 • Published 12 days ago • 23

  • Less is More: Recursive Reasoning with Tiny Networks

    Paper • 2510.04871 • Published 10 days ago • 385

  • Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning

    Paper • 2510.03259 • Published 20 days ago • 53

  • BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

    Paper • 2510.08697 • Published 7 days ago • 27