Lutalica
AI & ML interests
Computer vision, Image Processing
Recent Activity
commented on a paper 10 days ago
One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient
commented on a paper 6 months ago
Reinforcement Learning for Reasoning in Large Language Models with One Training Example