Add model card for SLM-SQL

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +46 -3
README.md CHANGED
@@ -1,3 +1,46 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
+
+ The `SLM-SQL` model, presented in the paper [SLM-SQL: An Exploration of Small Language Models for Text-to-SQL](https://huggingface.co/papers/2507.22478), explores the potential of Small Language Models (SLMs) for translating natural language questions into SQL queries (Text-to-SQL).
+
+ Unlike traditional Large Language Models (LLMs), which have demonstrated strong performance in Text-to-SQL, SLMs (ranging from 0.5B to 1.5B parameters) currently underperform but offer significant advantages in inference speed and suitability for edge deployment. This work addresses their limitations by leveraging recent advances in post-training techniques.
+
+ Specifically, the authors used the open-source SynSQL-2.5M dataset to construct two derived datasets: SynSQL-Think-916K for SQL generation and SynSQL-Merge-Think-310K for SQL merge revision. They applied supervised fine-tuning and reinforcement-learning-based post-training to the SLMs, followed by inference using a corrective self-consistency approach.
+
+ Experimental results validate the effectiveness and generalizability of the `SLM-SQL` method. On the BIRD development set, the evaluated models achieved an average improvement of 31.4 points. Notably, the 0.5B model reached 56.87% execution accuracy (EX), while the 1.5B model achieved 67.08% EX.
+
+ The authors plan to release their dataset, model, and code on GitHub.
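
The corrective self-consistency inference mentioned above can be illustrated with a generic execution-based voting sketch. This is only an illustration of the general idea, not the paper's exact procedure: the function `self_consistency_vote`, the SQLite backend, and the voting rule are all assumptions; the real candidates would come from sampling the fine-tuned model several times.

```python
import sqlite3
from collections import Counter

def self_consistency_vote(candidate_sqls, db_path):
    """Pick the candidate SQL whose execution result is most common.

    Generic execution-based self-consistency sketch (hypothetical helper,
    not the paper's exact corrective procedure): run every candidate,
    discard those that fail, and majority-vote on the result sets.
    """
    results = {}
    for sql in candidate_sqls:
        conn = sqlite3.connect(db_path)
        try:
            # frozenset makes the result order-insensitive and hashable for voting
            results[sql] = frozenset(conn.execute(sql).fetchall())
        except sqlite3.Error:
            continue  # discard candidates that fail to execute
        finally:
            conn.close()
    if not results:
        return None
    majority, _ = Counter(results.values()).most_common(1)[0]
    return next(sql for sql, rows in results.items() if rows == majority)
```

Candidates that execute to the same result set pool their votes, so a single syntactically valid but semantically off query is outvoted by agreeing candidates.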
+
+ ## Usage
+
+ This model is designed for Text-to-SQL generation. While the official code and specific usage instructions are planned for release on GitHub, you would typically load such a model using the `transformers` library. The exact prompt format and required inputs (e.g., database schema) will be detailed in the official repository.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Replace with the specific SLM-SQL checkpoint once it is available on the Hub.
+ model_name = "your_organization/slm-sql-model-name"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name)
+
+ # Example for a Text-to-SQL task (the prompt template may vary significantly and
+ # should normally include the database schema).
+ input_natural_language_query = "What are the names of all employees in the 'Engineering' department?"
+ prompt = f"Translate the following natural language query into SQL:\n{input_natural_language_query}\nSQL:"
+
+ encoded_input = tokenizer(prompt, return_tensors="pt")
+ output_ids = model.generate(**encoded_input, max_new_tokens=256)  # adjust max_new_tokens for longer SQL
+ decoded_output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+ print(decoded_output)
+
+ # Refer to the official repository (once released) for detailed usage instructions,
+ # including specific prompt formats and handling of database schemas.
+ ```