---
license: apache-2.0
tags:
- smollm
- python
- code-generation
- instruct
- qlora
- fine-tuned
- code
- nf4
datasets:
- flytech/python-codes-25k
model-index:
- name: HF-SmolLM-1.7B-0.5B-4bit-coder
  results: []
language:
- en
pipeline_tag: text-generation
---

# HF-SmolLM-1.7B-0.5B-4bit-coder

## Model Summary

**HF-SmolLM-1.7B-0.5B-4bit-coder** is a fine-tuned variant of [SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B), optimized for **instruction-following in Python code generation tasks**. It was trained on a **1,500-sample subset** of the [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k) dataset using **parameter-efficient fine-tuning (QLoRA, 4-bit)**.

The model is suitable for:

- Generating Python code snippets from natural language instructions
- Completing short code functions
- Educational prototyping of fine-tuned LMs

⚠️ This is **not a production-ready coding assistant**. Generated outputs must be manually reviewed before execution.

---

## Intended Uses & Limitations

### ✅ Intended
- Research on parameter-efficient fine-tuning
- Educational demos of instruction-tuning workflows
- Prototype code generation experiments

### ❌ Not Intended
- Deployment in production coding assistants
- Safety-critical applications
- Long-context multi-file programming tasks

---

## Training Details

### Base Model
- **Name:** [HuggingFaceTB/SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B)
- **Architecture:** Decoder-only causal LM
- **Total Parameters:** 1.72B
- **Fine-tuned Trainable Parameters:** ~9M (0.53%)

### Dataset
- **Source:** [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k)
- **Subset Used:** 1,500 randomly sampled examples
- **Content:** Instruction + optional input → Python code output
- **Formatting:** Converted into `chat` format with `user` / `assistant` roles

### Training Procedure
- **Framework:** Hugging Face Transformers + TRL (SFTTrainer)
- **Quantization:** 4-bit QLoRA (nf4) with bfloat16 compute when available
- **Effective Batch Size:** 6 (with gradient accumulation)
- **Optimizer:** AdamW
- **Scheduler:** Cosine decay with warmup ratio 0.05
- **Epochs:** 3
- **Learning Rate:** 2e-4
- **Max Seq Length:** 64 tokens (training)
- **Mixed Precision:** FP16
- **Gradient Checkpointing:** Enabled

---

## Evaluation

No formal benchmark evaluation has been conducted yet. Empirically, the model:

- Produces syntactically valid Python code for simple tasks
- Adheres to given instructions with reasonable accuracy
- Struggles with multi-step reasoning and long code outputs

---

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Write a Python function that checks if a number is prime."

inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
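
Since the fine-tune used 4-bit nf4 quantization, you can also load the weights in 4-bit at inference time to reduce memory. The snippet below is a minimal sketch assuming a CUDA GPU with `bitsandbytes` installed; the `BitsAndBytesConfig` values simply mirror the training settings listed above and were not taken from the original training script.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo = "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder"

# nf4 4-bit quantization, mirroring the training setup (assumes bitsandbytes is available)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Generation then proceeds exactly as in the example above.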
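
---

## Fine-Tuning Sketch

The snippets in this section are illustrative sketches of the data preparation and QLoRA setup described under *Training Details*; they are not the original training script. In particular, the dataset field names (`instruction`, `input`, `output`) and the random seed are assumptions.

A possible conversion of the 1,500-example subset into the `user` / `assistant` chat format:

```python
from datasets import load_dataset

def to_chat(example):
    # Merge the instruction and the optional input into a single user turn
    # (field names are assumptions about flytech/python-codes-25k)
    user_turn = example["instruction"]
    if example.get("input"):
        user_turn += "\n\n" + example["input"]
    return {
        "messages": [
            {"role": "user", "content": user_turn},
            {"role": "assistant", "content": example["output"]},
        ]
    }

dataset = load_dataset("flytech/python-codes-25k", split="train")
dataset = dataset.shuffle(seed=42).select(range(1500))  # 1,500-example random subset
dataset = dataset.map(to_chat, remove_columns=dataset.column_names)
```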
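
A corresponding QLoRA trainer configuration could look roughly like the one below. Exact argument names differ between TRL versions (for example, where the 64-token maximum sequence length is set), and the LoRA rank, alpha, dropout, and output directory are assumptions; the card only reports ~9M trainable parameters.

```python
import torch
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base = "HuggingFaceTB/SmolLM-1.7B"

# 4-bit nf4 base weights with bfloat16 compute when the GPU supports it
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb_config, device_map="auto")

# LoRA adapter; rank/alpha/dropout are assumptions, not documented in this card
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# Hyperparameters from the Training Procedure section; the 64-token max sequence
# length is set through the sequence-length argument of your TRL version
training_args = SFTConfig(
    output_dir="smollm-python-qlora",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=3,  # effective batch size of 6
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    fp16=True,
    gradient_checkpointing=True,
    logging_steps=10,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,  # chat-formatted subset from the previous snippet
    peft_config=peft_config,
)
trainer.train()
```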