Qwen3-0.6B-IF-Expert

This project applies full fine-tuning to the Qwen3-0.6B language model to enhance its instruction-following and reasoning capabilities. Training was conducted on the patrickfleith/instruction-freak-reasoning dataset in bfloat16 (bf16) precision for memory- and compute-efficient training.

Training Procedure

  1. Dataset Preparation

    • The patrickfleith/instruction-freak-reasoning dataset was used.
    • Each example contains a complex instruction paired with an in-depth reasoning-based response.
    • Prompts were structured to encourage chain-of-thought-style outputs where applicable.
  2. Model Loading and Configuration

    • The Qwen3-0.6B base model weights were loaded via the unsloth library in bf16 precision.
    • All model layers were updated (full_finetuning=True), rather than adapter layers only, to adapt the model to instruction understanding and stepwise response generation.
  3. Supervised Fine-Tuning

    • Fine-tuning was conducted with the Hugging Face TRL library using the supervised fine-tuning (SFT) approach.
    • The model was trained to follow detailed instructions, reason logically, and generate structured responses; a hedged end-to-end sketch of this pipeline appears after the list.
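
The steps above compress into a short script. The following is a minimal sketch of the pipeline under stated assumptions, not the exact training code: the hyperparameters are illustrative, the dataset column names ("instruction" and "final_answer") are guesses at the dataset schema that should be adjusted to the real fields, and exact TRL argument names vary by version (recent releases use processing_class where older ones used tokenizer).

```python
# Minimal full fine-tuning sketch: unsloth loading + TRL SFT.
# Hyperparameters and dataset column names are illustrative assumptions.
import torch
from datasets import load_dataset
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig

# Load the base model in bf16 with every layer trainable (no adapters).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B",
    max_seq_length=4096,          # assumed context length
    dtype=torch.bfloat16,
    full_finetuning=True,         # update all weights, not LoRA adapters
)

# Render each example as a chat transcript via the tokenizer's chat template.
dataset = load_dataset("patrickfleith/instruction-freak-reasoning", split="train")

def to_text(example):
    messages = [
        {"role": "user", "content": example["instruction"]},       # assumed field name
        {"role": "assistant", "content": example["final_answer"]},  # assumed field name
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text)

# Supervised fine-tuning with TRL on the rendered "text" column.
trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="qwen3-0.6b-if-expert",
        dataset_text_field="text",
        per_device_train_batch_size=2,   # illustrative values
        gradient_accumulation_steps=8,
        learning_rate=2e-5,
        num_train_epochs=1,
        bf16=True,
        logging_steps=10,
    ),
)
trainer.train()
```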

Purpose and Outcome

  • The fine-tuned model follows complex instructions more reliably and explains its reasoning process more clearly than the base model.
  • It generates both coherent reasoning steps and a conclusive answer, improving transparency and usability for instruction-based tasks.
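
The checkpoint can be used like any other Qwen3 model. Below is a minimal inference sketch with Hugging Face transformers; the prompt and generation settings are illustrative, not tuned recommendations.

```python
# Minimal inference sketch for the fine-tuned checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "suayptalha/Qwen3-0.6B-IF-Expert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat-formatted prompt and generate a stepwise answer.
messages = [
    {"role": "user",
     "content": "Explain, step by step, why the sum of two odd numbers is even."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```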

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Support

Buy Me A Coffee
