File size: 7,995 Bytes
59c7ef4 cf63462 59c7ef4 a8875b9 59c7ef4 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 |
---
license: apache-2.0
---
### Important Links
📖[Github](https://github.com/XGenerationLab/XiYanSQL-QwenCoder) |
🤖[ModelScope](https://modelscope.cn/collections/XiYanSQL-Models-4483337b614241) |
🌐[XiYan-SQL](https://github.com/XGenerationLab/XiYan-SQL) |
🌕[析言GBI](https://bailian.console.aliyun.com/xiyan) |
💻[ModelScope Space](https://www.modelscope.cn/studios/XGenerationLab/XiYanSQL-QwenCoder-32B)
## Introduction
We are excited to update our new XiYanSQL-QwenCoder series model, demonstrating improvements over its predecessor in some key features.
- The new XiYanSQL-QwenCoder model applies the merits of GRPO training strategy without thinking process, maintaining high efficiency and accuracy in SQL generation.
- The new XiYanSQL-QwenCoder model keeps its great performance in various benchmarks, including BIRD, Spider and DW benchmarks which will be released in the future.
- The new XiYanSQL-QwenCoder model demonstrates better generalization than its predecessor, especially in different dialects and out-of-domain datasets.
## Model Downloads
| **Model** | **Download Latest** |
|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|XiYanSQL-QwenCoder-7B | 🤗[HuggingFace](https://huggingface.co/XGenerationLab/XiYanSQL-QwenCoder-7B-2504) 🤖[Modelscope](https://www.modelscope.cn/models/XGenerationLab/XiYanSQL-QwenCoder-7B-2504) |
|XiYanSQL-QwenCoder-32B | 🤗[HuggingFace](https://huggingface.co/XGenerationLab/XiYanSQL-QwenCoder-32B-2504) 🤖[Modelscope](https://www.modelscope.cn/models/XGenerationLab/XiYanSQL-QwenCoder-32B-2504) |
## Performance
The XiYanSQL-QwenCoder models, as multi-dialect SQL base models, demonstrating robust SQL generation capabilities. The following presents the evaluation results at the time of release. We conducted a comprehensive evaluation of the model's performance under two schema formats, M-Schema, and original DDL, using the BIRD and Spider as SQLite benchmarks in the Text-to-SQL domain, as well as DW benchmarks for PostgreSQL and MySQL dialects.
| Model name | Size | BIRD Dev@M-Schema | BIRD Dev@DDL | Spider Test@M-Schema | Spider Test@DDL | DW PostgreSQL@M-Schema | DW MySQL@M-Schema |
|------------------------------|:------:|:-----------------:|:------------:|:--------------------:|:---------------:|:----------------------:|:-----------------:|
| GPT-4o-0806 | UNK | 58.47% | 54.82% | 82.89% | 78.45% | 46.79% | 57.77% |
| GPT-4.1-0414 | UNK | 59.39% | 54.11% | 84.45% | 79.86% | 54.29% | 63.18% |
| Claude3.5-sonnet-1022 | UNK | 53.32% | 50.46% | 76.27% | 73.04% | 55.22% | 52.84% |
| Claude3.7-sonnet | UNK | 54.82% | 49.22% | 78.04% | 74.66% | 53.23% | 54.61% |
| Gemini-1.5-Pro | UNK | 61.34% | 57.89% | 85.11% | 84.00% | 52.78% | 62.78% |
| Gemini-2.5-Pro | UNK | 67.21% | 63.43% | 88.29% | 86.27% | 63.16% | 65.37% |
| DeepSeek-V2.5-1210 | 236B | 55.74% | 55.61% | 82.08% | 80.57% | 45.74% | 52.18% |
| DeepSeek-V3 | 685B | 59.58% | 56.71% | 81.52% | 79.91% | 52.56% | 55.95% |
| DeepSeek-R1 | 685B | 58.15% | 55.61% | 80.72% | 78.85% | 60.56% | xx% |
| DeepSeek-R1-Distill-Qwen-32B | 32B | 50.65% | 48.31% | 78.65% | 77.33% | 37.22% | 44.72% |
| Deepseek-Coder-33B-Instruct | 33B | 47.52% | 44.72% | 72.39% | xx% | 31.48% | 36.17% |
| OmniSQL-32B | 32B | 60.37% | 55.87% | 85.16% | 83.19% | 38.19% | 42.34% |
| XiYanSQL-QwenCoder-32B-2412 | 32B | 67.07% | 63.04% | 88.39% | 85.46% | 45.07% | 52.84% |
| XiYanSQL-QwenCoder-32B-2504 | 32B | 67.14% | 62.26% | 89.20% | 86.17% | 53.52% | 57.74% |
## Requirements
transformers >= 4.37.0
vllm >= 0.7.2
## Quickstart with Transformers and vLLM
Here is a simple code snippet for quickly using **XiYanSQL-QwenCoder** model. We provide a Chinese version of the prompt, and you just need to replace the placeholders for "question," "db_schema," and "evidence" to get started. We recommend using our [M-Schema](https://github.com/XGenerationLab/M-Schema) format for the schema; other formats such as DDL are also acceptable, but they may affect performance.
Currently, we mainly support mainstream dialects like SQLite, PostgreSQL, and MySQL.
### Prompt Template
```python
nl2sqlite_template_cn = """你是一名{dialect}专家,现在需要阅读并理解下面的【数据库schema】描述,以及可能用到的【参考信息】,并运用{dialect}知识生成sql语句回答【用户问题】。
【用户问题】
{question}
【数据库schema】
{db_schema}
【参考信息】
{evidence}
【用户问题】
{question}
```sql"""
```
### Inference with Transformers
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "XGenerationLab/XiYanSQL-QwenCoder-32B-2502"
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
## dialects -> ['SQLite', 'PostgreSQL', 'MySQL']
prompt = nl2sqlite_template_cn.format(dialect="", db_schema="", question="", evidence="")
message = [{'role': 'user', 'content': prompt}]
text = tokenizer.apply_chat_template(
message,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
**model_inputs,
pad_token_id=tokenizer.pad_token_id,
eos_token_id=tokenizer.eos_token_id,
max_new_tokens=1024,
temperature=0.1,
top_p=0.8,
do_sample=True,
)
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
### Inference with vLLM
```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer
model_path = "XGenerationLab/XiYanSQL-QwenCoder-32B-2502"
llm = LLM(model=model_path, tensor_parallel_size=8)
tokenizer = AutoTokenizer.from_pretrained(model_path)
sampling_params = SamplingParams(
n=1,
temperature=0.1,
max_tokens=2048
)
## dialects -> ['SQLite', 'PostgreSQL', 'MySQL']
prompt = nl2sqlite_template_cn.format(dialect="", db_schema="", question="", evidence="")
message = [{'role': 'user', 'content': prompt}]
text = tokenizer.apply_chat_template(
message,
tokenize=False,
add_generation_prompt=True
)
outputs = llm.generate([text], sampling_params=sampling_params)
response = outputs[0].outputs[0].text
```
## Acknowledgments
If you find our work useful, please give us a citation or a like, so we can make a greater contribution to the open-source community! |