|
--- |
|
base_model: |
|
- Qwen/Qwen2.5-7B-Instruct |
|
datasets: |
|
- liuwenhan/reasonrank_data_sft |
|
- liuwenhan/reasonrank_data_rl |
|
- liuwenhan/reasonrank_data_13k |
|
language: |
|
- en |
|
license: mit |
|
pipeline_tag: text-ranking |
|
library_name: transformers |
|
tags: |
|
- qwen |
|
- reranker |
|
- passage-ranking |
|
--- |
|
|
|
# ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability |
|
|
|
<p align="center"> |
|
<a href="https://arxiv.org/pdf/2508.07050" target="_blank"><img src="https://img.shields.io/badge/Paper-arXiv-b5212f.svg?logo=arxiv"></a> |
|
<a href="https://github.com/8421BCD/ReasonRank" target="_blank"><img src="https://img.shields.io/badge/GitHub-Repo-181717.svg?logo=github"></a> |
|
<a href="https://brightbenchmark.github.io/" target="_blank"><img src="https://img.shields.io/badge/Project%20Page-BRIGHT-blue.svg"></a> |
|
<a href="https://opensource.org/licenses/MIT"><img alt="License" src="https://img.shields.io/badge/LICENSE-MIT-green.svg"></a> |
|
</p> |
|
|
|
<p align="center">

🤗 <a href="https://huggingface.co/liuwenhan/reasonrank-7B" target="_blank">reasonrank-7B</a> |

🤗 <a href="https://huggingface.co/liuwenhan/reasonrank-32B" target="_blank">reasonrank-32B</a>

</p>

<p align="center">

🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k" target="_blank">reasonrank_data_13k</a> |

🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft" target="_blank">reasonrank_data_sft</a> |

🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl" target="_blank">reasonrank_data_rl</a>

</p>
|
<h5 align="center"> If you like our project, please give us a star ⭐ on GitHub.</h5>
|
|
|
## 📣 Latest News
|
- **[Aug 9, 2025]**: 🏆 Our ReasonRank (32B) has achieved **SOTA performance (40.8)** on the **[BRIGHT leaderboard](https://brightbenchmark.github.io/)**!

- **[Aug 9, 2025]**: 📝 We uploaded our paper to **[arXiv](https://arxiv.org/pdf/2508.07050)** and **[Hugging Face](https://huggingface.co/papers/2508.07050)**.

- **[Aug 9, 2025]**: 🔥 We released our **[🤗 full ReasonRank training data (13k)](https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k)**, **[🤗 cold-start SFT data](https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft)**, and **[🤗 RL data](https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl)**.

- **[Aug 9, 2025]**: 🔥 We released our reasoning-intensive rerankers **[🤗 reasonrank-7B](https://huggingface.co/liuwenhan/reasonrank-7B)** and **[🤗 reasonrank-32B](https://huggingface.co/liuwenhan/reasonrank-32B)**.

- **[Aug 9, 2025]**: 🚀 We released our full codebase, including inference, SFT training, and RL training.
|
|
|
## 1. ReasonRank |
|
|
|
### 💡 1.1 Overview
|
|
|
**ReasonRank** is a **reasoning-intensive passage reranker** tailored for reasoning-intensive ranking tasks. To train it, we first design an automated reasoning-intensive training data synthesis framework and use it to synthesize 13k high-quality training samples.
|
|
|
<p align="center"> |
|
<img width="80%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809002302377.png" /> |
|
</p> |
|
|
|
Based on this training data, we design a two-stage training approach, consisting of **cold-start SFT** and **multi-view ranking reward RL**, to inject listwise ranking ability into ReasonRank.
|
|
|
<p align="center"> |
|
<img width="80%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809002546838.png" /> |
|
</p> |
|
|
|
### 📊 1.2 Overall Performance
|
|
|
When using ReasonIR as the initial passage retriever, ReasonRank demonstrates strong overall ranking performance on the BRIGHT benchmark, while showing superior efficiency compared with the pointwise reasoning-intensive reranker Rank1.
|
|
|
<p align="center"> |
|
<img width="50%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809003636871.png" /> |
|
</p> |
|
|
|
Moreover, when using higher-quality retrieval results (a RaDeR + BM25 hybrid, provided by [RaDeR](https://github.com/Debrup-61/RaDeR/blob/main/BRIGHT_score_files/RaDeR-gte-Qwen2-LLMq_CoT_lexical/aops/hybrid_BM25_Rader.json)), ReasonRank (32B) achieves SOTA performance (**40.8**) on the [BRIGHT leaderboard](https://brightbenchmark.github.io/).
|
|
|
## 📚 2. The ReasonRank Training Data
|
|
|
An important contribution of our work is our reasoning-intensive training dataset ([reasonrank_data_13k](https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k)). The fields of ``training_data_all.jsonl`` are as follows:
|
|
|
#### **Dataset Fields & Descriptions** |
|
|
|
1. **`dataset`** *(str)* |
|
- The name of the source dataset for each example (e.g., `"math-qa"`).
|
2. **`qid`** *(str)* |
|
- The query ID. The query text is provided in the ``id_query/`` directory.
|
3. **`initial_list`** *(List[str])* |
|
- The initial list of passage IDs before DeepSeek-R1 reranking. The text of each passage is provided in the ``id_doc/`` directory.
|
4. **`final_list`** *(List[str])* |
|
- The re-ranked list of passage IDs after listwise reranking with DeepSeek-R1.
|
- Reflects the improved ranking based on reasoning-enhanced relevance scoring. |
|
5. **`reasoning`** *(str)* |
|
- A **step-by-step reasoning chain** produced by DeepSeek-R1 while performing the listwise reranking.
|
6. **`relevant_docids`** *(List[str])* |
|
- The IDs of the relevant passages in ``initial_list``, as mined by DeepSeek-R1. The remaining passage IDs in ``initial_list`` are treated as irrelevant.

- Note that the **`relevant_docids`** are not necessarily ranked at the top of **`final_list`** by DeepSeek-R1, which may stem from inconsistencies in DeepSeek-R1's judgments. To address this, you can apply the **self-consistency data filtering** technique proposed in our paper to select higher-quality data.
|
|
|
The statistics of the dataset are shown in the figure below:
|
<p align="center"> |
|
<img width="80%" alt="image" src="https://github.com/user-attachments/assets/c04b9d1a-2f21-46f1-b23d-ad1f50d22fb8" /> |
|
</p> |
|
|
|
#### **Example Entry** |
|
|
|
```json |
|
{ |
|
"dataset": "math-qa", |
|
"qid": "math_1001", |
|
"initial_list": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", ...], |
|
"final_list": ["math_test_intermediate_algebra_808", "math_test_intermediate_algebra_1678", ...], |
|
"reasoning": "Okay, I need to rank the 20 passages based on their relevance...", |
|
"relevant_docids": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", "math_train_intermediate_algebra_993"] |
|
} |
|
``` |
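
For a quick look at the data, the JSONL file can be inspected with the standard library. A minimal sketch, assuming ``training_data_all.jsonl`` has already been downloaded from the ``reasonrank_data_13k`` repository:

```python
import json

# Read the training records line by line (each line is one JSON object).
records = []
with open("training_data_all.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        records.append(json.loads(line))

print(f"Loaded {len(records)} records")
example = records[0]
print("dataset:", example["dataset"], "| qid:", example["qid"])
print("initial_list size:", len(example["initial_list"]))
print("relevant_docids:", example["relevant_docids"])
```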
|
|
|
#### **Application** |
|
|
|
1. **Training a passage reranker**: given the reranked passage list (``final_list``), one can use our data to train a listwise reranker.

2. **Training a passage retriever**: using the **`relevant_docids`** and the remaining irrelevant IDs, one can train a passage retriever (see the sketch below).
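
For the retriever use case, one simple recipe is to treat each ID in **`relevant_docids`** as a positive and sample negatives from the remaining IDs in **`initial_list`**. The sketch below illustrates this; the ``query_text`` and ``docid_to_text`` lookups (loaded from the ``id_query/`` and ``id_doc/`` directories) and the random negative sampling are illustrative assumptions, not the exact pipeline used in the paper.

```python
import random

def build_retriever_triples(record, query_text, docid_to_text, num_negatives=4):
    """Build (query, positive, negatives) training triples from one record.

    `query_text` and `docid_to_text` are assumed to be loaded from the
    id_query/ and id_doc/ directories shipped with the dataset.
    """
    positives = set(record["relevant_docids"])
    negative_pool = [d for d in record["initial_list"] if d not in positives]
    triples = []
    for pos_id in positives:
        neg_ids = random.sample(negative_pool, min(num_negatives, len(negative_pool)))
        triples.append({
            "query": query_text,
            "positive": docid_to_text[pos_id],
            "negatives": [docid_to_text[d] for d in neg_ids],
        })
    return triples
```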
|
|
|
## ⚡ 3. Quick Start
|
|
|
This section provides a general guide on how to use ReasonRank for inference. For detailed environment setup, specific inference commands (including usage with ReasonIR or custom retrieval results), and in-depth training procedures (cold-start SFT and multi-view ranking reward RL), please refer to the [official GitHub repository](https://github.com/8421BCD/ReasonRank).
|
|
|
## Sample Usage |
|
|
|
This model can be loaded and used with the `transformers` library. Below is a basic example demonstrating how to use the model for passage re-ranking. The model expects a specific chat-like format for input, including a system prompt and a user query with listed passages. |
|
|
|
```python |
|
import torch |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
model_name = "liuwenhan/reasonrank-7B" # Or "liuwenhan/reasonrank-32B" |
|
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
model = AutoModelForCausalLM.from_pretrained( |
|
model_name, |
|
torch_dtype=torch.bfloat16, |
|
device_map="auto", |
|
trust_remote_code=True |
|
).eval() |
|
|
|
# Example system prompt for reasoning-intensive listwise ranking;
# see the GitHub repo for the exact prompts used in the paper.
|
system_prompt = ( |
|
"You are a helpful and harmless AI assistant. You will be provided with a search query and a list of passages, " |
|
"and your task is to re-rank the passages based on their relevance to the query. " |
|
"You should follow a chain of thought to determine the most relevant passages. " |
|
"Your final answer should be a list of the re-ranked passage numbers, separated by commas. " |
|
"Do not include any other information or explanation in your final answer." |
|
) |
|
|
|
query = "What is the capital of France?" |
|
passages = [ |
|
"Paris is the capital and most populous city of France.", |
|
"The Eiffel Tower is a famous landmark in Paris.", |
|
"France is a country located in Western Europe.", |
|
"London is the capital of the United Kingdom." |
|
] |
|
|
|
# Construct the user message with query and passages |
|
user_content = f"Search Query: {query} |
|
" |
|
for i, passage in enumerate(passages): |
|
user_content += f"[{i+1}] {passage} |
|
" |
|
user_content += "Please re-rank the passages based on their relevance to the query. Provide a chain of thought and then the final re-ranked list." |
|
|
|
messages = [ |
|
{"role": "system", "content": system_prompt}, |
|
{"role": "user", "content": user_content} |
|
] |
|
|
|
# Apply chat template |
|
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
|
|
|
# Tokenize input |
|
input_ids = tokenizer.encode(input_text, return_tensors="pt").to(model.device) |
|
|
|
# Generate response |
|
output_ids = model.generate( |
|
input_ids, |
|
    max_new_tokens=1024,  # reasoning chains can be long; adjust as needed
    do_sample=False,  # deterministic decoding; temperature is not used when sampling is disabled
    repetition_penalty=1.05,
    eos_token_id=tokenizer.eos_token_id
|
) |
|
|
|
# Decode output |
|
generated_text = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True) |
|
print(generated_text) |
|
``` |
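
The system prompt above asks the model to finish with a comma-separated list of passage numbers, so the ranking can be recovered with a small amount of post-processing. A minimal sketch, assuming that output format (the exact format may vary with the prompt you use):

```python
import re

def parse_ranking(generated_text: str, num_passages: int) -> list[int]:
    """Extract a ranked list of passage numbers from the model output.

    Assumes the final answer is a comma-separated list such as "1, 3, 2, 4".
    Passages the model omits are appended in their original order.
    """
    lines_with_digits = [l for l in generated_text.strip().splitlines() if re.search(r"\d", l)]
    final_line = lines_with_digits[-1] if lines_with_digits else ""
    ranking = []
    for n in (int(m) for m in re.findall(r"\d+", final_line)):
        if 1 <= n <= num_passages and n not in ranking:
            ranking.append(n)
    # Fall back to the original order for anything the model did not mention.
    ranking += [i for i in range(1, num_passages + 1) if i not in ranking]
    return ranking

print(parse_ranking(generated_text, len(passages)))  # e.g. [1, 3, 2, 4]
```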
|
|
|
## Citation |
|
|
|
If you find this work helpful, please cite our paper:
|
|
|
```bibtex |
|
@misc{liu2025reasonrankempoweringpassageranking, |
|
title={ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability}, |
|
author={Wenhan Liu and Xinyu Ma and Weiwei Sun and Yutao Zhu and Yuchen Li and Dawei Yin and Zhicheng Dou}, |
|
year={2025}, |
|
eprint={2508.07050}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.IR}, |
|
url={https://arxiv.org/abs/2508.07050}, |
|
} |
|
``` |
|
|
|
## 🤝 Acknowledgements
|
|
|
The inference code and training implementation build upon [RankLLM](https://github.com/castorini/rank_llm), [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory), and [verl](https://github.com/volcengine/verl). Our work is based on the [Qwen2.5](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) model series, and we sincerely thank the Qwen team for their outstanding contributions to the open-source community.
|
|
|
## 📄 License
|
|
|
This project is released under the [MIT License](LICENSE). |
|
|
|
## 📧 Contact
|
|
|
For any questions or feedback, please reach out to us at [lwh@ruc.edu.cn](mailto:lwh@ruc.edu.cn).
|
|
|
## Star History |
|
|
|
[](https://www.star-history.com/#8421bcd/reasonrank&Date) |