Enhance model card with paper and code links, and a usage example

#1
by nielsr - opened
Files changed (1)
  1. README.md +84 -14
README.md CHANGED
@@ -1,20 +1,21 @@
  ---
- pipeline_tag: text-generation
  library_name: transformers
  license: cc-by-nc-4.0
  tags:
  - text-to-sql
  - reinforcement-learning
  ---
 
- 
  # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
 
  ### Important Links
 
- 📖[Arxiv Paper](https://arxiv.org/abs/2507.22478) |
- 🤗[HuggingFace](https://huggingface.co/collections/cycloneboy/slm-sql-688b02f99f958d7a417658dc) |
- 🤖[ModelScope](https://modelscope.cn/collections/SLM-SQL-624bb6a60e9643) |
 
  ## News
 
@@ -36,24 +37,94 @@ tags:
  > and generalizability of our method, SLM-SQL. On the BIRD development set, the five evaluated models achieved an
  > average
  > improvement of 31.4 points. Notably, the 0.5B model reached 56.87\% execution accuracy (EX), while the 1.5B model
- > achieved 67.08\% EX. We will release our dataset, model, and code to github: https://github.com/CycloneBoy/slm_sql.
 
  ### Framework
 
- <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_framework.png" height="500" alt="slmsql_framework">
 
  ### Main Results
 
- <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_result.png" height="500" alt="slm_sql_result">
 
- <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_main.png" height="500" alt="slmsql_bird_main">
-
- <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_spider_main.png" height="500" alt="slmsql_spider_main">
 
  Performance Comparison of different Text-to-SQL methods on BIRD dev and test dataset.
 
- <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_ablation_study.png" height="300" alt="slmsql_ablation_study">
 
  ## Model
 
@@ -115,5 +186,4 @@ Performance Comparison of different Text-to-SQL methods on BIRD dev and test dataset.
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.13271},
- }
- ```
 
  ---
  library_name: transformers
  license: cc-by-nc-4.0
+ pipeline_tag: text-generation
  tags:
  - text-to-sql
  - reinforcement-learning
  ---
 
  # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
 
  ### Important Links
 
+ 📖[Hugging Face Paper](https://huggingface.co/papers/2507.22478) |
+ 📚[arXiv Paper](https://arxiv.org/abs/2507.22478) |
+ 💻[GitHub Repository](https://github.com/CycloneBoy/slm_sql) |
+ 🤗[Hugging Face Models Collection](https://huggingface.co/collections/cycloneboy/slm-sql-688b02f99f958d7a417658dc) |
+ 🤖[ModelScope Models Collection](https://modelscope.cn/collections/SLM-SQL-624bb6a60e9643) |
 
  ## News
 
  > and generalizability of our method, SLM-SQL. On the BIRD development set, the five evaluated models achieved an
  > average
  > improvement of 31.4 points. Notably, the 0.5B model reached 56.87\% execution accuracy (EX), while the 1.5B model
+ > achieved 67.08\% EX.
 
  ### Framework
 
+ <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_framework.png" height="500" alt="slmsql_framework">
 
  ### Main Results
 
+ <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_result.png" height="500" alt="slm_sql_result">
 
+ <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_main.png" height="500" alt="slmsql_bird_main">
 
+ <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_spider_main.png" height="500" alt="slmsql_spider_main">
 
  Performance Comparison of different Text-to-SQL methods on BIRD dev and test dataset.
 
+ <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_ablation_study.png" height="300" alt="slm_sql_ablation_study">
+ 
+ ## Usage
+ 
+ This model can be used with the Hugging Face `transformers` library for text-to-SQL generation.
+ 
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ 
+ # Load model and tokenizer
+ # Replace "cycloneboy/SLM-SQL-0.5B" with the specific model checkpoint you want to use.
+ model_id = "cycloneboy/SLM-SQL-0.5B"
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True)
+ 
+ # Set the model to evaluation mode
+ model.eval()
+ 
+ # Define the natural language question and database schema (replace with your data)
+ user_query = "What are the names of all employees who earn more than 50000?"
+ database_schema = """
+ CREATE TABLE employees (
+     employee_id INT PRIMARY KEY,
+     name VARCHAR(255),
+     salary DECIMAL(10, 2)
+ );
+ """
+ 
+ # Construct the prompt: the model needs both the schema and the question to
+ # generate the SQL query. The format below is a common way to combine them for Text-to-SQL.
+ full_prompt = f"""
+ You are a Text-to-SQL model.
+ Given the following database schema:
+ {database_schema}
+ Generate the SQL query for the question:
+ {user_query}
+ """
+ 
+ messages = [
+     {"role": "user", "content": full_prompt.strip()}
+ ]
+ 
+ # Apply the chat template and tokenize inputs
+ input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+ 
+ # Generate the SQL query
+ with torch.no_grad():
+     outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.6, top_p=0.9, do_sample=True,
+                              eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|im_end|>")])
+ 
+ # Decode the generated text and extract the assistant's response.
+ # The Qwen-style chat template wraps the assistant's response between
+ # "<|im_start|>assistant\n" and "<|im_end|>".
+ generated_text = tokenizer.decode(outputs[0], skip_special_tokens=False)
+ assistant_prefix = "<|im_start|>assistant\n"
+ if assistant_prefix in generated_text:
+     sql_query = generated_text.split(assistant_prefix, 1)[1].strip()
+     # Remove any trailing special tokens like <|im_end|>
+     sql_query = sql_query.split("<|im_end|>", 1)[0].strip()
+ else:
+     sql_query = generated_text  # Fallback in case the prompt format differs unexpectedly
+ 
+ print(f"User Query: {user_query}\nGenerated SQL: {sql_query}")
+ 
+ # Example of a potential output for the given query and schema:
+ # Generated SQL: SELECT name FROM employees WHERE salary > 50000;
+ ```
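The abstract above reports execution accuracy (EX), which compares the *results* of running a predicted query against a gold query, not the SQL strings themselves. As a follow-up to the usage example, here is a minimal sketch of such a check on SQLite; it is not part of the paper's evaluation harness, and the helper name `execution_match`, the database file `employees_demo.db`, and the sample rows are illustrative assumptions:

```python
import os
import sqlite3


def execution_match(predicted_sql: str, gold_sql: str, db_path: str) -> bool:
    """Return True if both queries execute and yield the same result set.

    Comparison is order-insensitive, and a predicted query that fails to
    execute at all counts as incorrect.
    """
    conn = sqlite3.connect(db_path)
    try:
        pred_rows = conn.execute(predicted_sql).fetchall()
        gold_rows = conn.execute(gold_sql).fetchall()
    except sqlite3.Error:
        return False
    finally:
        conn.close()
    return sorted(pred_rows) == sorted(gold_rows)


# Build a tiny demo database mirroring the schema from the usage example.
db_path = "employees_demo.db"
if os.path.exists(db_path):
    os.remove(db_path)
conn = sqlite3.connect(db_path)
conn.execute(
    "CREATE TABLE employees (employee_id INT PRIMARY KEY, name VARCHAR(255), salary DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", 60000), (2, "Bob", 45000), (3, "Cy", 52000)],
)
conn.commit()
conn.close()

gold_sql = "SELECT name FROM employees WHERE salary > 50000"
print(execution_match("SELECT name FROM employees WHERE salary > 50000", gold_sql, db_path))  # True
print(execution_match("SELECT name FROM employees", gold_sql, db_path))  # False
```

Sorting the rows before comparing makes the check robust to differing `ORDER BY` clauses; a stricter variant could compare results as multisets of rows or require matching column order.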
 
  ## Model
 
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.13271},
+ }