TERMINATOR: Learning Optimal Exit Points for Early Stopping in Chain-of-Thought Reasoning
Abstract
TERMINATOR is an early-exit method for large reasoning models that cuts unnecessary computation by predicting the optimal point at which to stop chain-of-thought generation.
Large Reasoning Models (LRMs) achieve impressive performance on complex reasoning tasks via Chain-of-Thought (CoT) reasoning, which enables them to generate intermediate thinking tokens before arriving at the final answer. However, LRMs often suffer from significant overthinking, continuing to spend compute long after the final answer has already appeared in the generation. Prior work has identified the existence of an optimal reasoning length such that truncating reasoning at this point significantly shortens CoT outputs with virtually no change in performance. However, determining optimal CoT lengths for practical datasets is highly non-trivial, as they are both task- and model-dependent. In this paper, we address precisely this problem and design TERMINATOR, an early-exit strategy for LRMs at inference time that mitigates overthinking. The central idea underpinning TERMINATOR is that the first arrival of an LRM's final answer is often predictable, and we leverage these first-answer positions to create a novel dataset of optimal reasoning lengths on which TERMINATOR is trained. Powered by this approach, TERMINATOR reduces CoT length by 14%-55% on average across four challenging practical datasets: MATH-500, AIME 2025, HumanEval, and GPQA, whilst outperforming current state-of-the-art methods.
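The early-exit idea described above can be sketched as a simple generation loop: consume reasoning chunks, score each prefix with an exit predictor, and stop once the score clears a threshold. All names here (`stability_score`, `generate_with_early_exit`, the answer-stability heuristic standing in for the paper's learned predictor) are illustrative assumptions, not the authors' actual implementation.

```python
import re
from typing import Callable, Iterable, List, Tuple

def extract_answer(cot_so_far: str) -> str:
    """Toy extractor: the last number mentioned stands in for the
    model's current final answer."""
    nums = re.findall(r"-?\d+", cot_so_far)
    return nums[-1] if nums else ""

def stability_score(chunks: List[str], window: int = 2) -> float:
    """Stand-in for a learned exit predictor: the fraction of the last
    `window` chunk boundaries at which the extracted answer already
    equalled the current one (a proxy for 'first-answer arrival')."""
    if len(chunks) <= window:
        return 0.0
    current = extract_answer(" ".join(chunks))
    if not current:
        return 0.0
    agree = 0
    for k in range(len(chunks) - window, len(chunks)):
        if extract_answer(" ".join(chunks[:k])) == current:
            agree += 1
    return agree / window

def generate_with_early_exit(
    chunk_stream: Iterable[str],
    scorer: Callable[[List[str]], float],
    threshold: float = 1.0,
) -> Tuple[str, int]:
    """Consume reasoning chunks and stop as soon as the exit score
    clears the threshold, truncating the CoT near the first-answer point."""
    chunks: List[str] = []
    for chunk in chunk_stream:
        chunks.append(chunk)
        if scorer(chunks) >= threshold:
            break
    return " ".join(chunks), len(chunks)
```

With a stream where the answer stabilizes early, the loop exits before the "overthinking" tail is generated; in a real system the chunk stream would come from an LRM decoder and the scorer from the trained exit model.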
Community
Project Page: https://terminator-llm.github.io
Hugging Face Collection: https://huggingface.co/collections/acnagle/terminator
I like the name.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- NEAT: Neuron-Based Early Exit for Large Reasoning Models (2026)
- ESTAR: Early-Stopping Token-Aware Reasoning For Efficient Inference (2026)
- ConPress: Learning Efficient Reasoning from Multi-Question Contextual Pressure (2026)
- One-Token Verification for Reasoning Correctness Estimation (2026)
- BFS-PO: Best-First Search for Large Reasoning Models (2026)
- Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning (2026)
- Learning from Partial Chain-of-Thought via Truncated-Reasoning Self-Distillation (2026)
Models citing this paper 2