---
license: apache-2.0
tags:
- smollm
- python
- code-generation
- instruct
- qlora
- fine-tuned
- code
- nf4
datasets:
- flytech/python-codes-25k
model-index:
- name: HF-SmolLM-1.7B-0.5B-4bit-coder
  results: []
language:
- en
pipeline_tag: text-generation
---

# HF-SmolLM-1.7B-0.5B-4bit-coder

## Model Summary

**HF-SmolLM-1.7B-0.5B-4bit-coder** is a fine-tuned variant of [SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B), optimized for **instruction-following in Python code generation tasks**.

It was trained on a **1,500-sample subset** of the [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k) dataset using **parameter-efficient fine-tuning (4-bit QLoRA)**.

The model is suitable for:

- Generating Python code snippets from natural language instructions
- Completing short code functions
- Educational prototyping of fine-tuned LMs

⚠️ This is **not a production-ready coding assistant**. Generated outputs must be manually reviewed before execution.

---

## Intended Uses & Limitations

### ✅ Intended

- Research on parameter-efficient fine-tuning
- Educational demos of instruction-tuning workflows
- Prototype code generation experiments

### ❌ Not Intended

- Deployment in production coding assistants
- Safety-critical applications
- Long-context, multi-file programming tasks

---

## Training Details

### Base Model

- **Name:** [HuggingFaceTB/SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B)
- **Architecture:** Decoder-only causal LM
- **Total Parameters:** 1.72B
- **Trainable Parameters (fine-tuning):** ~9M (≈0.53% of total)

### Dataset

- **Source:** [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k)
- **Subset Used:** 1,500 randomly sampled examples
- **Content:** Instruction + optional input → Python code output
- **Formatting:** Converted into a chat format with `user` / `assistant` roles (see the sketch below)
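
The exact preprocessing script is not included here; the following is a minimal sketch of how each example could be mapped into chat messages, assuming the dataset exposes `instruction`, `input`, and `output` columns (the column names and the sampling seed are assumptions, not confirmed details):

```python
from datasets import load_dataset

# Assumption: column names and shuffle seed are illustrative, not the exact ones used.
dataset = load_dataset("flytech/python-codes-25k", split="train")
subset = dataset.shuffle(seed=42).select(range(1500))  # 1,500-sample subset

def to_chat(example):
    # Fold the optional `input` field into the user turn.
    user_content = example["instruction"]
    if example.get("input"):
        user_content += "\n\n" + example["input"]
    return {
        "messages": [
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": example["output"]},
        ]
    }

chat_dataset = subset.map(to_chat, remove_columns=subset.column_names)
```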

### Training Procedure

- **Framework:** Hugging Face Transformers + TRL (`SFTTrainer`)
- **Quantization:** 4-bit QLoRA (NF4) with bfloat16 compute when available
- **Effective Batch Size:** 6 (with gradient accumulation)
- **Optimizer:** AdamW
- **Scheduler:** Cosine decay with warmup ratio 0.05
- **Epochs:** 3
- **Learning Rate:** 2e-4
- **Max Sequence Length:** 64 tokens (training)
- **Mixed Precision:** FP16
- **Gradient Checkpointing:** Enabled
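
A configuration sketch that reflects the settings above, assuming recent `transformers` / `peft` / `trl` APIs. The LoRA rank, alpha, target modules, and per-device batch split are assumptions (this card does not state them), and argument names can differ slightly between `trl` releases:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

base = "HuggingFaceTB/SmolLM-1.7B"

# 4-bit NF4 quantization, with bfloat16 compute when the GPU supports it.
compute_dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
)

model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base)
# Assumes a chat template is available on the tokenizer for the `messages` format.

# LoRA adapter; rank, alpha, and target modules are illustrative assumptions.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Hyperparameters from the list above; 2 x 3 accumulation gives the effective batch size of 6.
training_args = SFTConfig(
    output_dir="smollm-1.7b-4bit-coder",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    max_seq_length=64,
    optim="adamw_torch",
    fp16=True,                   # card lists FP16 mixed precision
    gradient_checkpointing=True,
    logging_steps=10,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=chat_dataset,   # chat-formatted subset from the dataset sketch above
    peft_config=peft_config,
    processing_class=tokenizer,   # older trl releases call this argument `tokenizer`
)
trainer.train()
```

Because only the LoRA adapter weights are updated, the trainable-parameter count stays at roughly 9M, matching the ~0.53% figure reported above.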

---

## Evaluation

No formal benchmark evaluation has been conducted yet.
Empirically, the model:

- Produces syntactically valid Python code for simple tasks
- Adheres to given instructions with reasonable accuracy
- Struggles with multi-step reasoning and long code outputs

---

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Write a Python function that checks if a number is prime."

# Build the chat-formatted prompt and move it to the model's device.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
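
Since the adapters were trained on top of an NF4-quantized base, the model can also be loaded in 4-bit at inference time to reduce memory use. A minimal sketch, assuming `bitsandbytes` is installed and a CUDA GPU is available:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo = "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder"

# Mirror the training-time quantization settings (4-bit NF4, bfloat16 compute).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    quantization_config=bnb_config,
    device_map="auto",
)
```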