HF-SmolLM-1.7B-0.5B-4bit-coder

Model Summary

HF-SmolLM-1.7B-0.5B-4bit-coder is a fine-tuned variant of SmolLM-1.7B, optimized for instruction-following in Python code generation tasks.
It was trained on a 1,500-sample subset of the flytech/python-codes-25k dataset using parameter-efficient fine-tuning (QLoRA 4-bit).

The model is suitable for:

  • Generating Python code snippets from natural language instructions
  • Completing short code functions
  • Educational prototyping of fine-tuned LMs

⚠️ This is not a production-ready coding assistant. Generated outputs must be manually reviewed before execution.


Intended Uses & Limitations

✅ Intended

  • Research on parameter-efficient fine-tuning
  • Educational demos of instruction-tuning workflows
  • Prototype code generation experiments

❌ Not Intended

  • Deployment in production coding assistants
  • Safety-critical applications
  • Long-context multi-file programming tasks

Training Details

Base Model

  • Name: HuggingFaceTB/SmolLM-1.7B
  • Architecture: Decoder-only causal LM
  • Total Parameters: 1.72B
  • Fine-tuned Trainable Parameters: ~9M (0.53% of total; a quick way to verify this fraction is sketched below)
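
The trainable fraction can be reproduced by attaching LoRA adapters to the base model and asking PEFT to count parameters. A minimal sketch, assuming peft is installed; the rank and target modules below are illustrative guesses, since the exact adapter configuration is not published:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM-1.7B")

# Illustrative LoRA settings; the actual rank/targets used in training may differ.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(base, lora)
peft_model.print_trainable_parameters()
# Prints something like: trainable params: ... || all params: 1.72B || trainable%: ~0.5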

Dataset

  • Source: flytech/python-codes-25k
  • Subset Used: 1,500 randomly sampled examples
  • Content: Instruction + optional input → Python code output
  • Formatting: Converted into chat format with user / assistant roles (a conversion sketch follows this list)
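
A minimal sketch of that conversion, assuming each raw record carries instruction, input, and output fields (field names follow the dataset's columns; the original preprocessing script is not published):

def to_chat(example):
    # Merge the instruction and the optional input into a single user turn.
    user_msg = example["instruction"]
    if example.get("input"):
        user_msg += "\n\n" + example["input"]
    return {
        "messages": [
            {"role": "user", "content": user_msg},
            {"role": "assistant", "content": example["output"]},
        ]
    }

Applied with datasets' map (dataset.map(to_chat)), this yields a messages column in the conversational format that recent versions of TRL's SFTTrainer accept directly.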

Training Procedure

  • Framework: Hugging Face Transformers + TRL (SFTTrainer); a configuration sketch follows this list
  • Quantization: 4-bit QLoRA (nf4) with bfloat16 compute when available
  • Effective Batch Size: 6 (via gradient accumulation)
  • Optimizer: AdamW
  • Scheduler: Cosine decay with warmup ratio 0.05
  • Epochs: 3
  • Learning Rate: 2e-4
  • Max Seq Length: 64 tokens (training)
  • Mixed Precision: FP16
  • Gradient Checkpointing: Enabled
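
Put together, the hyperparameters above correspond roughly to the setup below. This is a reconstruction from the listed values, not the original training script: the LoRA rank/alpha, the batch-size split, and the shuffle seed are assumptions, parameter names follow recent TRL versions, and the to_chat helper is the one sketched in the Dataset section.

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTTrainer, SFTConfig

# 4-bit NF4 quantization with bfloat16 compute (fp16 on GPUs without bf16 support).
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM-1.7B", quantization_config=bnb, device_map="auto"
)

# 1,500-example subset, chat-formatted with to_chat from the Dataset section.
train_dataset = (
    load_dataset("flytech/python-codes-25k", split="train")
    .shuffle(seed=42)  # seed is illustrative
    .select(range(1500))
    .map(to_chat)
)

args = SFTConfig(
    output_dir="smollm-coder",
    num_train_epochs=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    per_device_train_batch_size=2,  # 2 x 3 accumulation steps = effective batch size 6
    gradient_accumulation_steps=3,
    max_seq_length=64,
    fp16=True,
    gradient_checkpointing=True,
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
)
trainer.train()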

Evaluation

No formal benchmark evaluation has been conducted yet.
Empirically, the model:

  • Produces syntactically valid Python code for simple tasks (a quick syntax spot-check is sketched below)
  • Adheres to given instructions with reasonable accuracy
  • Struggles with multi-step reasoning and long code outputs
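
One cheap way to spot-check the first point is to run generated snippets through Python's own parser. A minimal sketch; note that ast.parse verifies syntax only, not correctness:

import ast

def is_valid_python(snippet: str) -> bool:
    """Return True if the snippet parses as Python source."""
    try:
        ast.parse(snippet)
        return True
    except SyntaxError:
        return False

print(is_valid_python("def double(x):\n    return x * 2"))  # True
print(is_valid_python("def double(x) return x * 2"))        # False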

Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Write a Python function that checks if a number is prime."

# Wrap the prompt in the chat template and tokenize in one step;
# add_generation_prompt appends the assistant header so the model starts answering.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
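
Because the model was fine-tuned under 4-bit quantization, it can also be loaded quantized at inference time to cut memory use. A sketch assuming bitsandbytes is installed (full-precision loading, as above, works too):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder",
    quantization_config=bnb,
    device_map="auto",
)

Either way, review generated code manually before executing it.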