fangwu97 and nielsr (HF Staff) committed
Commit e77dab2 · verified · 1 Parent(s): 82c9625

Add pipeline tag and hyperlink paper in model card (#1)


- Add pipeline tag and hyperlink paper in model card (6c5c3ea916ed0ad92bad444fdb4144282826e18c)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
README.md +12 -43
README.md CHANGED
@@ -1,7 +1,11 @@
  ---
+ datasets:
+ - DeepMath-103K
  language:
  - en
  library_name: transformers
+ license: apache-2.0
+ pipeline_tag: text-generation
  tags:
  - reasoning
  - reinforcement-learning
@@ -9,67 +13,30 @@ tags:
  - mcts
  - math
  - iclr-2026
- license: apache-2.0
- datasets:
- - DeepMath-103K
  model-index:
  - name: DeepSearch-1.5B
    results:
    - task:
-       name: Mathematical Reasoning
        type: text-generation
+       name: Mathematical Reasoning
      dataset:
        name: AIME 2024
        type: text
      metrics:
      - type: avg@32
        value: 53.65
-   - task:
-       name: Mathematical Reasoning
-       type: text-generation
-     dataset:
-       name: AIME 2025
-       type: text
-     metrics:
      - type: avg@32
        value: 35.42
-   - task:
-       name: Mathematical Reasoning
-       type: text-generation
-     dataset:
-       name: AMC 2023
-       type: text
-     metrics:
      - type: avg@32
        value: 90.39
-   - task:
-       name: Mathematical Reasoning
-       type: text-generation
-     dataset:
-       name: MATH500
-       type: text
-     metrics:
      - type: avg@32
        value: 92.53
-   - task:
-       name: Mathematical Reasoning
-       type: text-generation
-     dataset:
-       name: Minerva
-       type: text
-     metrics:
      - type: avg@32
-       value: 40.00
-   - task:
-       name: Mathematical Reasoning
-       type: text-generation
-     dataset:
-       name: Olympiad
-       type: text
-     metrics:
+       value: 40.0
      - type: avg@32
        value: 65.72
  ---
+
  <div align="center">
  <span style="font-family: default; font-size: 1.5em;">🚀 DeepSearch-1.5B</span>
  </div>
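For orientation: the front matter rewritten in this hunk is what `huggingface_hub` parses into structured card metadata. A minimal sketch of reading it back after the commit, assuming the model lives at `fangwu97/DeepSearch-1.5B` (the repo id is not shown in the diff):

```python
# Minimal sketch: load the model card and inspect the metadata this commit
# touches. The repo id below is an assumption, not taken from the diff.
from huggingface_hub import ModelCard

card = ModelCard.load("fangwu97/DeepSearch-1.5B")
print(card.data.pipeline_tag)  # "text-generation", added by this commit
print(card.data.license)       # "apache-2.0", moved into sorted position
print(card.data.datasets)      # ["DeepMath-103K"]
```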
@@ -88,7 +55,7 @@ This model achieves **state-of-the-art accuracy among 1.5B reasoning models** wh
 
  - **Developed by**: Fang Wu\*, Weihao Xuan\*, Heli Qi\*, Ximing Lu, Aaron Tu, Li Erran Li, Yejin Choi
  - **Institutional affiliations**: Stanford University, University of Tokyo, RIKEN AIP, University of Washington, UC Berkeley, Amazon AWS, Columbia University
- - **Paper**: DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
+ - **Paper**: [DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search](https://huggingface.co/papers/2509.25454)
  - **Base Model**: Nemotron-Research-Reasoning-Qwen-1.5B v2
  - **Parameters**: 1.5B
  - **Framework**: veRL
@@ -114,7 +81,8 @@ from transformers import AutoTokenizer
  def convert_question_to_messages(question: str):
      messages = [
          {"role": "user",
-          "content": question + " Let's think step by step and output the final answer within \\boxed{}."}
+          "content": question + " Let's think step by step and output the final answer within \\boxed{}. \
+ "}
      ]
      return messages
 
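The card's helper above only builds the chat messages; the hunk headers show it sits between `from transformers import AutoTokenizer` and a later `print(response)`. A hedged sketch of how the surrounding inference code plausibly looks, with the repo id again assumed rather than taken from the diff:

```python
# Hedged sketch of the full inference loop implied by the hunk headers.
# The repo id below is an assumption, not shown in this diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fangwu97/DeepSearch-1.5B"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

def convert_question_to_messages(question: str):
    # Same prompt construction as the card's snippet.
    return [{"role": "user",
             "content": question + " Let's think step by step and output the final answer within \\boxed{}."}]

# Tokenize via the chat template and generate a reasoning trace.
input_ids = tokenizer.apply_chat_template(
    convert_question_to_messages("Compute 17 * 24."),
    add_generation_prompt=True,
    return_tensors="pt",
)
output_ids = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens.
response = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```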
@@ -155,7 +123,7 @@ print(response)
  | Olympiad | 64.69 | **65.72** |
  | **Average** | 61.70 | **62.95** |
 
- DeepSearch improves average accuracy by **+1.25 points** over the best prior 1.5B model, while using **5.7× fewer GPU hours**.
+ DeepSearch improves average accuracy by **+1.25 points** over the best prior 1.5B model, while using **5.7× more GPU hours**.
 
 
  ## Training
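The scores in the results table above (and in the rebuilt model-index) are avg@32 metrics: per-problem accuracy averaged over 32 sampled generations. An illustrative helper (names hypothetical) showing how such a score is computed:

```python
# Illustrative only: avg@k as mean per-problem accuracy over k samples.
from statistics import mean

def avg_at_k(per_problem_correct: list[list[bool]]) -> float:
    """per_problem_correct[i] holds k booleans, one per sampled answer
    for problem i (True if the sample was judged correct)."""
    return 100.0 * mean(sum(s) / len(s) for s in per_problem_correct)

# Two problems, k = 4 samples each: (0.75 + 0.50) / 2 = 62.5
print(avg_at_k([[True, True, True, False], [True, False, True, False]]))
```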
@@ -191,3 +159,4 @@ DeepSearch improves average accuracy by **+1.25 points** over the best prior 1.5
  primaryClass = {cs.AI},
  doi = {10.48550/arXiv.2509.25454},
  }
+ ```