Update README.md

a8875b9 verified 3 months ago

8 kB

	---
	license: apache-2.0
	---
	### Important Links

	📖[Github](https://github.com/XGenerationLab/XiYanSQL-QwenCoder) \|
	🤖[ModelScope](https://modelscope.cn/collections/XiYanSQL-Models-4483337b614241) \|
	🌐[XiYan-SQL](https://github.com/XGenerationLab/XiYan-SQL) \|
	🌕[析言GBI](https://bailian.console.aliyun.com/xiyan) \|
	💻[ModelScope Space](https://www.modelscope.cn/studios/XGenerationLab/XiYanSQL-QwenCoder-32B)


	## Introduction
	We are excited to update our new XiYanSQL-QwenCoder series model, demonstrating improvements over its predecessor in some key features.
	- The new XiYanSQL-QwenCoder model applies the merits of GRPO training strategy without thinking process, maintaining high efficiency and accuracy in SQL generation.
	- The new XiYanSQL-QwenCoder model keeps its great performance in various benchmarks, including BIRD, Spider and DW benchmarks which will be released in the future.
	- The new XiYanSQL-QwenCoder model demonstrates better generalization than its predecessor, especially in different dialects and out-of-domain datasets.


	## Model Downloads


	\| Model \| Download Latest \|
	\|-----------\|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\|XiYanSQL-QwenCoder-7B \| 🤗[HuggingFace](https://huggingface.co/XGenerationLab/XiYanSQL-QwenCoder-7B-2504) 🤖[Modelscope](https://www.modelscope.cn/models/XGenerationLab/XiYanSQL-QwenCoder-7B-2504) \|
	\|XiYanSQL-QwenCoder-32B \| 🤗[HuggingFace](https://huggingface.co/XGenerationLab/XiYanSQL-QwenCoder-32B-2504) 🤖[Modelscope](https://www.modelscope.cn/models/XGenerationLab/XiYanSQL-QwenCoder-32B-2504) \|



	## Performance
	The XiYanSQL-QwenCoder models, as multi-dialect SQL base models, demonstrating robust SQL generation capabilities. The following presents the evaluation results at the time of release. We conducted a comprehensive evaluation of the model's performance under two schema formats, M-Schema, and original DDL, using the BIRD and Spider as SQLite benchmarks in the Text-to-SQL domain, as well as DW benchmarks for PostgreSQL and MySQL dialects.

	\| Model name \| Size \| BIRD Dev@M-Schema \| BIRD Dev@DDL \| Spider Test@M-Schema \| Spider Test@DDL \| DW PostgreSQL@M-Schema \| DW MySQL@M-Schema \|
	\|------------------------------\|:------:\|:-----------------:\|:------------:\|:--------------------:\|:---------------:\|:----------------------:\|:-----------------:\|
	\| GPT-4o-0806 \| UNK \| 58.47% \| 54.82% \| 82.89% \| 78.45% \| 46.79% \| 57.77% \|
	\| GPT-4.1-0414 \| UNK \| 59.39% \| 54.11% \| 84.45% \| 79.86% \| 54.29% \| 63.18% \|
	\| Claude3.5-sonnet-1022 \| UNK \| 53.32% \| 50.46% \| 76.27% \| 73.04% \| 55.22% \| 52.84% \|
	\| Claude3.7-sonnet \| UNK \| 54.82% \| 49.22% \| 78.04% \| 74.66% \| 53.23% \| 54.61% \|
	\| Gemini-1.5-Pro \| UNK \| 61.34% \| 57.89% \| 85.11% \| 84.00% \| 52.78% \| 62.78% \|
	\| Gemini-2.5-Pro \| UNK \| 67.21% \| 63.43% \| 88.29% \| 86.27% \| 63.16% \| 65.37% \|
	\| DeepSeek-V2.5-1210 \| 236B \| 55.74% \| 55.61% \| 82.08% \| 80.57% \| 45.74% \| 52.18% \|
	\| DeepSeek-V3 \| 685B \| 59.58% \| 56.71% \| 81.52% \| 79.91% \| 52.56% \| 55.95% \|
	\| DeepSeek-R1 \| 685B \| 58.15% \| 55.61% \| 80.72% \| 78.85% \| 60.56% \| xx% \|
	\| DeepSeek-R1-Distill-Qwen-32B \| 32B \| 50.65% \| 48.31% \| 78.65% \| 77.33% \| 37.22% \| 44.72% \|
	\| Deepseek-Coder-33B-Instruct \| 33B \| 47.52% \| 44.72% \| 72.39% \| xx% \| 31.48% \| 36.17% \|
	\| OmniSQL-32B \| 32B \| 60.37% \| 55.87% \| 85.16% \| 83.19% \| 38.19% \| 42.34% \|
	\| XiYanSQL-QwenCoder-32B-2412 \| 32B \| 67.07% \| 63.04% \| 88.39% \| 85.46% \| 45.07% \| 52.84% \|
	\| XiYanSQL-QwenCoder-32B-2504 \| 32B \| 67.14% \| 62.26% \| 89.20% \| 86.17% \| 53.52% \| 57.74% \|


	## Requirements

	transformers >= 4.37.0
	vllm >= 0.7.2

	## Quickstart with Transformers and vLLM

	Here is a simple code snippet for quickly using XiYanSQL-QwenCoder model. We provide a Chinese version of the prompt, and you just need to replace the placeholders for "question," "db_schema," and "evidence" to get started. We recommend using our [M-Schema](https://github.com/XGenerationLab/M-Schema) format for the schema; other formats such as DDL are also acceptable, but they may affect performance.
	Currently, we mainly support mainstream dialects like SQLite, PostgreSQL, and MySQL.

	### Prompt Template
	```python
	nl2sqlite_template_cn = """你是一名{dialect}专家，现在需要阅读并理解下面的【数据库schema】描述，以及可能用到的【参考信息】，并运用{dialect}知识生成sql语句回答【用户问题】。
	【用户问题】
	{question}

	【数据库schema】
	{db_schema}

	【参考信息】
	{evidence}

	【用户问题】
	{question}

	```sql"""
	```


	### Inference with Transformers
	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "XGenerationLab/XiYanSQL-QwenCoder-32B-2502"
	model = AutoModelForCausalLM.from_pretrained(
	model_name,
	torch_dtype=torch.bfloat16,
	device_map="auto"
	)

	tokenizer = AutoTokenizer.from_pretrained(model_name)

	## dialects -> ['SQLite', 'PostgreSQL', 'MySQL']
	prompt = nl2sqlite_template_cn.format(dialect="", db_schema="", question="", evidence="")
	message = [{'role': 'user', 'content': prompt}]

	text = tokenizer.apply_chat_template(
	message,
	tokenize=False,
	add_generation_prompt=True
	)
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	**model_inputs,
	pad_token_id=tokenizer.pad_token_id,
	eos_token_id=tokenizer.eos_token_id,
	max_new_tokens=1024,
	temperature=0.1,
	top_p=0.8,
	do_sample=True,
	)
	generated_ids = [
	output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]
	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	```

	### Inference with vLLM
	```python
	from vllm import LLM, SamplingParams
	from transformers import AutoTokenizer
	model_path = "XGenerationLab/XiYanSQL-QwenCoder-32B-2502"
	llm = LLM(model=model_path, tensor_parallel_size=8)
	tokenizer = AutoTokenizer.from_pretrained(model_path)
	sampling_params = SamplingParams(
	n=1,
	temperature=0.1,
	max_tokens=2048
	)

	## dialects -> ['SQLite', 'PostgreSQL', 'MySQL']
	prompt = nl2sqlite_template_cn.format(dialect="", db_schema="", question="", evidence="")
	message = [{'role': 'user', 'content': prompt}]
	text = tokenizer.apply_chat_template(
	message,
	tokenize=False,
	add_generation_prompt=True
	)
	outputs = llm.generate([text], sampling_params=sampling_params)
	response = outputs[0].outputs[0].text
	```


	## Acknowledgments
	If you find our work useful, please give us a citation or a like, so we can make a greater contribution to the open-source community!