---
license: apache-2.0
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
---

# Model Card for ParaThinker-1.5B

ParaThinker-1.5B is a 1.5-billion-parameter language model designed for efficient mathematical reasoning through native parallel thinking. Built upon the DeepSeek-R1-Distill-Qwen-1.5B base model, it is trained to generate up to 8 parallel reasoning paths, leveraging KV-cache reuse via PagedAttention in vLLM.

This model is detailed in our paper: [ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute](https://arxiv.org/pdf/2509.04475)

## Model Details

### Model Description

ParaThinker-1.5B enhances small-scale LLMs by enabling parallel reasoning paths with minimal latency overhead. It uses path-specific control tokens to boost thought diversity and a summarization template to merge the paths into a coherent final answer. The model excels at mathematical reasoning, achieving substantially higher accuracy than sequential decoding baselines on benchmarks such as AIME.

- **Developed by:** Hao Wen, Yifan Su et al.
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)

### Model Sources

- **GitHub Repository:** https://github.com/MobileLLM/ParaThinker
- **Paper:** https://arxiv.org/pdf/2509.04475

## Uses

### Direct Use

ParaThinker-1.5B is intended for mathematical reasoning tasks, such as solving problems from the AIME, AMC, or MATH-500 datasets. It can be used directly with the vLLM-based ParaThinker inference engine to generate diverse reasoning paths and a summarized final answer. See the [ParaThinker repository](https://github.com/MobileLLM/ParaThinker) for setup and usage instructions; an illustrative loading sketch is also provided at the end of this card.

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

Evaluated on:

- AIME 2024
- AIME 2025
- AMC 2023
- MATH-500

#### Factors

Performance was disaggregated by:

- Number of parallel paths (2, 4, 8)
- Token budget (16K per path for ParaThinker)
- Task complexity

#### Metrics

- **Pass@1**: Accuracy of the first generated answer.

### Results

| Benchmark   | Sequential (32K) | ParaThinker-1.5B (2×16K) | ParaThinker-1.5B (4×16K) | ParaThinker-1.5B (8×16K) |
| ----------- | ---------------- | ------------------------ | ------------------------ | ------------------------ |
| AIME 2024   | 28.3%            | 34.8%                    | 43.3%                    | 48.1%                    |
| AIME 2025   | 20.5%            | 24.2%                    | 26.7%                    | 31.9%                    |
| AMC 2023    | 72.5%            | 73.1%                    | 80.8%                    | 83.1%                    |
| MATH-500    | 85.0%            | 87.5%                    | 88.7%                    | 89.7%                    |
| **Average** | **50.9%**        | **54.9%**                | **59.9%**                | **63.2%**                |

See Section 5 of the [paper](https://arxiv.org/abs/2509.04475) for more details.

## Model Card Contact

📧 Send an email to 8208220105@csu.edu.cn.
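
## How to Get Started with the Model

Below is a minimal sketch of loading the checkpoint with Hugging Face `transformers` and running a single (sequential) reasoning path; native parallel thinking with KV-cache reuse requires the vLLM-based ParaThinker engine from the GitHub repository. The Hub repo id, the example prompt, and the generation settings are illustrative placeholders, not confirmed defaults.

```python
# Minimal sketch: single-path (sequential) inference with transformers.
# Parallel-path decoding and answer summarization use the ParaThinker engine
# from https://github.com/MobileLLM/ParaThinker; this only loads the weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MobileLLM/ParaThinker-1.5B"  # placeholder: replace with the actual Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Find the sum of all positive integers n such that n^2 + 6n is a perfect square."
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Long reasoning traces need a generous budget (the paper evaluates with 16K tokens per path).
outputs = model.generate(input_ids, max_new_tokens=16384, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For the parallel-path results reported above (2×16K, 4×16K, 8×16K), follow the inference instructions in the ParaThinker repository rather than this single-path sketch.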