Hanning Zhang

HanningZhang

AI & ML interests

None yet

Recent Activity

upvoted a paper 16 days ago
CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents
upvoted a paper about 1 month ago
GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving
upvoted a paper about 1 month ago
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Organizations

RLHFlow, mytestdpo, ScaleBio Baseline, UIUC ScaleML Lab

authored a paper 7 months ago

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5 • 25
authored a paper 9 months ago

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 83