arxiv:2604.03253

Self-Execution Simulation Improves Coding Models

Published on Mar 11 · Submitted by Gallil Maimon on Apr 7


Authors:

Gallil Maimon ,

Gal Cohen ,

Abstract

Code large language models can be trained to simulate program execution step-by-step, improving competitive programming performance through supervised fine-tuning and reinforcement learning with verifiable rewards.

AI-generated summary

A promising research direction in enabling LLMs to generate consistently correct code involves addressing their inability to properly estimate program execution, particularly for code they generate. In this work, we demonstrate that Code LLMs can be trained to simulate program execution in a step-by-step manner and that this capability can be leveraged to improve competitive programming performance. Our approach combines supervised fine-tuning on natural language execution traces, textual explanations grounded in true execution, with reinforcement learning using verifiable rewards. We introduce two complementary objectives: output prediction given code and inputs, and solving competitive programming tasks with either ground-truth or self-predicted execution feedback. These objectives enable models to perform self-verification over multiple candidate solutions, and iterative self-fixing by simulating test execution. Across multiple competitive programming benchmarks, our method yields consistent improvements over standard reasoning approaches. We further present ablations and analysis to elucidate the role of execution simulation and its limitations.
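The self-verification step described in the abstract (choosing among multiple candidate solutions by simulating test execution) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `simulate_output`, `count_predicted_passes`, and `select_candidate` are hypothetical names, and `simulate_output` simply runs the candidate for real as a stand-in for the model's predicted output, which in the paper's setting would come from the trained Code LLM.

```python
from typing import Callable, Sequence, Tuple

Candidate = Callable[[int], int]   # a candidate solution (toy: int -> int)
TestCase = Tuple[int, int]         # (test input, expected output)


def simulate_output(candidate: Candidate, test_input: int) -> int:
    # ASSUMPTION: stand-in for the model's *predicted* output.
    # Here we actually execute the candidate; a trained model would
    # instead predict this value by simulating execution in text.
    return candidate(test_input)


def count_predicted_passes(
    candidate: Candidate,
    tests: Sequence[TestCase],
    simulate: Callable[[Candidate, int], int] = simulate_output,
) -> int:
    # Number of tests whose simulated output matches the expected output.
    return sum(
        1 for test_input, expected in tests
        if simulate(candidate, test_input) == expected
    )


def select_candidate(
    candidates: Sequence[Candidate],
    tests: Sequence[TestCase],
    simulate: Callable[[Candidate, int], int] = simulate_output,
) -> Candidate:
    # Pick the candidate with the most predicted passes
    # (the first such candidate wins ties).
    return max(
        candidates,
        key=lambda c: count_predicted_passes(c, tests, simulate),
    )


def double(x: int) -> int:
    return 2 * x


def square(x: int) -> int:
    return x * x


# Toy task: "return the square of x", with two candidate solutions.
tests = [(2, 4), (3, 9)]
best = select_candidate([double, square], tests)
```

In this toy run, `double` matches only the first test while `square` matches both, so `square` is selected. The quality of such a selection depends entirely on how accurate the simulated outputs are.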


Community

gallilmaimon

Paper author Paper submitter about 14 hours ago

🚨New paper🚨 Self-Execution Simulation Improves Coding Models

Current reasoning CodeLMs think before providing an answer to programming tasks.

We show that CodeLMs can be post-trained to explicitly simulate execution of tests in order to verify and fix their proposed solutions, leading to additional gains!
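As a rough illustration of the verify-and-fix idea in this post, the loop below alternates between simulating tests and revising the solution. It is a sketch under stated assumptions, not the paper's method: `self_fix`, `simulate_output`, and `propose_fix` are hypothetical names, real execution stands in for the model's simulated execution, and a fixed list of drafts stands in for model-generated revisions.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Candidate = Callable[[int], int]    # a candidate solution (toy: int -> int)
TestCase = Tuple[int, int]          # (test input, expected output)
Failure = Tuple[int, int, int]      # (test input, expected, simulated output)


def simulate_output(candidate: Candidate, test_input: int) -> int:
    # ASSUMPTION: stand-in for the model's predicted output;
    # here the candidate is simply executed.
    return candidate(test_input)


def self_fix(
    initial: Candidate,
    propose_fix: Callable[[Candidate, List[Failure]], Candidate],
    tests: Sequence[TestCase],
    max_rounds: int = 3,
) -> Tuple[Candidate, int]:
    # Repeatedly simulate the tests and ask for a revision while any test
    # is predicted to fail. Returns the final candidate and the number of
    # revisions that were requested.
    candidate = initial
    revisions = 0
    for _ in range(max_rounds):
        failures = [
            (x, expected, simulate_output(candidate, x))
            for x, expected in tests
            if simulate_output(candidate, x) != expected
        ]
        if not failures:
            break
        candidate = propose_fix(candidate, failures)
        revisions += 1
    return candidate, revisions


def add_one(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return 2 * x


def square(x: int) -> int:
    return x * x


# ASSUMPTION: a canned sequence of revisions replaces the model,
# which would normally generate a fix from the failure feedback.
drafts: Iterator[Candidate] = iter([double, square])


def propose_fix(candidate: Candidate, failures: List[Failure]) -> Candidate:
    return next(drafts)


tests = [(2, 4), (3, 9)]
fixed, revisions = self_fix(add_one, propose_fix, tests)
```

Here `add_one` fails both tests, the first revision `double` still fails on input 3, and the second revision `square` passes, so the loop stops after two revisions.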