arxiv:2604.13740
Michal Valko
AI & ML interests
large language models, reasoning, fine-tuning, test-time computation, reinforcement learning with human feedback, world models
Recent Activity
updated a dataset 2 days ago: misovalko/my-research-papers
authored a paper 3 days ago: Spectral Thompson sampling
authored a paper 3 days ago: Covariance-adapting algorithm for semi-bandits with application to sparse rewards