Add model card for SLM-SQL

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +46 -3
README.md CHANGED
@@ -1,3 +1,46 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
+
+ The `SLM-SQL` model, presented in the paper [SLM-SQL: An Exploration of Small Language Models for Text-to-SQL](https://huggingface.co/papers/2507.22478), explores the potential of Small Language Models (SLMs) for translating natural language questions into SQL queries (Text-to-SQL).
+
+ Unlike traditional Large Language Models (LLMs), which have demonstrated strong performance in Text-to-SQL, SLMs (ranging from 0.5B to 1.5B parameters) currently underperform but offer significant advantages in inference speed and suitability for edge deployment. This work addresses their limitations by leveraging recent advances in post-training techniques.
+
+ Specifically, the authors used the open-source SynSQL-2.5M dataset to construct two derived datasets: SynSQL-Think-916K for SQL generation and SynSQL-Merge-Think-310K for SQL merge revision. They applied supervised fine-tuning and reinforcement-learning-based post-training to the SLMs, followed by inference using a corrective self-consistency approach.
+
+ Experimental results validate the effectiveness and generalizability of the `SLM-SQL` method. On the BIRD development set, the evaluated models achieved an average improvement of 31.4 points. Notably, the 0.5B model reached 56.87% execution accuracy (EX), while the 1.5B model achieved 67.08% EX.
+
+ The authors plan to release their dataset, model, and code on GitHub.
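
The corrective self-consistency inference mentioned above can be illustrated with a generic execution-based voting sketch. This is only an illustration of the general idea, not the paper's exact procedure: the function `self_consistency_vote`, the SQLite backend, and the voting rule are all assumptions; the real candidates would come from sampling the fine-tuned model several times.

```python
import sqlite3
from collections import Counter

def self_consistency_vote(candidate_sqls, db_path):
    """Pick the candidate SQL whose execution result is most common.

    Generic execution-based self-consistency sketch (hypothetical helper,
    not the paper's exact corrective procedure): run every candidate,
    discard those that fail, and majority-vote on the result sets.
    """
    results = {}
    for sql in candidate_sqls:
        conn = sqlite3.connect(db_path)
        try:
            # frozenset makes the result order-insensitive and hashable for voting
            results[sql] = frozenset(conn.execute(sql).fetchall())
        except sqlite3.Error:
            continue  # discard candidates that fail to execute
        finally:
            conn.close()
    if not results:
        return None
    majority, _ = Counter(results.values()).most_common(1)[0]
    return next(sql for sql, rows in results.items() if rows == majority)
```

Candidates that execute to the same result set pool their votes, so a single syntactically valid but semantically off query is outvoted by agreeing candidates.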
+
+ ## Usage
+
+ This model is designed for Text-to-SQL generation. While the official code and specific usage instructions are planned for release on GitHub, you would typically load such a model using the `transformers` library. The exact prompt format and required inputs (e.g., database schema) will be detailed in the official repository.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Replace with the specific SLM-SQL checkpoint once it is available on the Hub.
+ model_name = "your_organization/slm-sql-model-name"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name)
+
+ # Example for a Text-to-SQL task (the prompt template may vary significantly and
+ # should normally include the database schema).
+ input_natural_language_query = "What are the names of all employees in the 'Engineering' department?"
+ prompt = f"Translate the following natural language query into SQL:\n{input_natural_language_query}\nSQL:"
+
+ encoded_input = tokenizer(prompt, return_tensors="pt")
+ output_ids = model.generate(**encoded_input, max_new_tokens=256)  # adjust max_new_tokens for longer SQL
+ decoded_output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+ print(decoded_output)
+
+ # Refer to the official repository (once released) for detailed usage instructions,
+ # including specific prompt formats and handling of database schemas.
+ ```