Guido

ReinforceNow

https://www.reinforcenow.ai

AI & ML interests

Train AI Agents with RL

Organizations

None yet

upvoted a paper 3 months ago

APEX-Agents

Paper • 2601.14242 • Published Jan 20 • 2

upvoted an article 6 months ago

Article

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

May 24, 2023 • 179

upvoted an article 10 months ago

Article

The 4 Things Qwen-3’s Chat Template Teaches Us

Apr 30, 2025 • 87

upvoted 3 papers 11 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 448

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3, 2025 • 58

DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition

Paper • 2504.21801 • Published Apr 30, 2025 • 5