metadata

license: apache-2.0
language:
  - en
base_model:
  - Qwen/Qwen2.5-Coder-3B-Instruct
tags:
  - text-to-sql
  - fine-tuned
  - qwen
pipeline_tag: text-generation

This is a fine-tuned version of Qwen/Qwen2.5-Coder-3B-Instruct for generating SQL queries from natural language questions. The model was fine-tuned using LoRA (r=16) on a subset of the Spider dataset and merged into a standalone model, eliminating the need for the peft library during inference.

Usage

To use the model for SQL query generation:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    import torch

Load model and tokenizer

    model_name = "Piyush026/Qwen2.5-Coder-3B-sql-finetuned"  # Replace with your repo ID
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=True,
    )

Generate SQL query Example

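    prompt = """
    Database: university
    Schema:
    students: [student_id, first_name, last_name, department_code, gpa, major]
    departments: [department_code, department_name]
    courses: [course_number, course_title, professor_id]
    instructors: [professor_id, last_name]
    Question: List all students.
    """
    # Send inputs to whichever device device_map="auto" placed the model on,
    # rather than hard-coding "cuda".
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # max_new_tokens limits only the generated text; max_length would also count the prompt.
    outputs = model.generate(**inputs, max_new_tokens=200)
    # The decoded text contains the prompt followed by the generated SQL.
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))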
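The prompt above follows a fixed layout: database name, one line per table with its columns, then the question. A minimal sketch of a helper that builds that layout from a Python dict is shown below. The function name `build_prompt` and the exact whitespace are assumptions for illustration; the model card does not state the precise template used during fine-tuning, so adjust it if your results differ.

```python
def build_prompt(database, schema, question):
    """Format a database name, schema dict, and question into the prompt layout.

    `schema` maps each table name to a list of column names.
    This helper is hypothetical and only mirrors the example prompt's layout.
    """
    lines = ["", f"Database: {database}", "Schema:"]
    for table, columns in schema.items():
        lines.append(f"{table}: [{', '.join(columns)}]")
    lines.append(f"Question: {question}")
    lines.append("")  # trailing newline, as in the triple-quoted example
    return "\n".join(lines)


example_prompt = build_prompt(
    "university",
    {
        "students": ["student_id", "first_name", "last_name", "department_code", "gpa", "major"],
        "departments": ["department_code", "department_name"],
    },
    "List all students.",
)
print(example_prompt)
```

The returned string can be passed to the tokenizer in place of the hand-written prompt.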

Training Details

- Base Model: Qwen/Qwen2.5-Coder-3B-Instruct
- Fine-Tuning: LoRA (r=16, lora_alpha=32, lora_dropout=0.05) on a 1000-sample subset of the Spider dataset.
- Environment: Lightning AI Studio with Tesla T4 GPU.
- Merged Model: The LoRA adapters were merged into the base model using merge_and_unload for standalone inference.
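To illustrate what merging the adapters means numerically, here is a toy sketch in plain Python. It assumes the standard LoRA formulation, in which each adapted weight becomes W + (lora_alpha / r) * B A, so r=16 and lora_alpha=32 give a scaling factor of 2.0. The helper names and the tiny matrices are made up for illustration; this is not the code used to produce the model, and lora_dropout plays no role here because dropout applies only during training.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, r, lora_alpha):
    """Return W + (lora_alpha / r) * (B @ A), the merged weight matrix."""
    scaling = lora_alpha / r
    delta = matmul(lora_b, lora_a)  # out_features x in_features
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Scaling implied by the hyperparameters listed above.
model_scaling = 32 / 16  # lora_alpha / r

# Toy 2x2 layer with a rank-1 adapter, using the same scaling factor (2 / 1).
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]    # r x in_features
lora_b = [[0.5], [0.0]]  # out_features x r
merged = merge_lora(weight, lora_a, lora_b, r=1, lora_alpha=2)
print(model_scaling, merged)
```

After this folding step the adapter matrices are no longer needed at inference time, which is why the merged checkpoint loads with transformers alone.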