Yenson
yenson-lau
AI & ML interests
None yet
Recent Activity
updated a collection 5 days ago
Papers
updated a collection 2 months ago
Starred
updated a collection 5 months ago
Starred
Organizations
Papers
- Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
  Paper • 2506.06395 • Published • 135
- Magistral
  Paper • 2506.10910 • Published • 67
- Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
  Paper • 2506.07240 • Published • 7
- Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
  Paper • 2506.09991 • Published • 55
Starred
- Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
  Paper • 2508.10751 • Published • 29
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 265
- MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
  Paper • 2508.14704 • Published • 43
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 162
datasets 0
None public yet