The Scaling Properties of Implicit Deductive Reasoning in Transformers
Abstract
Deep Transformers with a bidirectional prefix mask exhibit implicit deductive reasoning that approaches explicit chain-of-thought performance across graph topologies and problem widths, though depth extrapolation still requires CoT.
We investigate the scaling properties of implicit deductive reasoning over Horn clauses in depth-bounded Transformers. By systematically decorrelating provability from spurious features and enforcing algorithmic alignment, we find that in sufficiently deep models with a bidirectional prefix mask, implicit reasoning approaches explicit CoT performance across graph topologies and problem widths, though CoT remains necessary for depth extrapolation.
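For a concrete picture of the bidirectional prefix mask the abstract refers to, here is a minimal PyTorch sketch (my own illustration, not the paper's released code): positions in the prefix, which would hold the Horn-clause problem statement, attend to one another bidirectionally, while any generated suffix stays causal. The function name and example sizes are assumptions for illustration.

import torch

def prefix_lm_mask(seq_len: int, prefix_len: int) -> torch.Tensor:
    # Boolean attention mask: entry (i, j) is True when query i may attend to key j.
    # Start from a standard causal (lower-triangular) mask: j <= i.
    mask = torch.tril(torch.ones(seq_len, seq_len)).bool()
    # Make the prefix block fully bidirectional: every prefix query sees
    # every prefix key. Suffix queries already see the whole prefix through
    # the causal part and remain causal among themselves.
    mask[:prefix_len, :prefix_len] = True
    return mask

# Example: 6 tokens, the first 4 forming the bidirectional prefix.
print(prefix_lm_mask(6, 4).int())

A boolean mask with this convention (True = may attend) can be passed directly as attn_mask to torch.nn.functional.scaled_dot_product_attention; per the abstract, this prefix-bidirectional variant is the setting in which sufficiently deep models bring implicit reasoning close to explicit CoT.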
Community
Code, datasets, and models, although reproducible from the paper, will be made public upon publication. For joint research, contact {enrico.vompa}@gmail.com; I'm open to collaboration.
An interesting breakdown of this paper is available on arXivLens: https://arxivlens.com/PaperView/Details/the-scaling-properties-of-implicit-deductive-reasoning-in-transformers-274-3e1290bd
It covers the executive summary, detailed methodology, and practical applications.
Get this paper in your agent:
hf papers read 2605.04330
Don't have the latest CLI? Install it with:
curl -LsSf https://hf.co/cli/install.sh | bash