Qwen3-0.6B-IF-Expert
This project performs full fine-tuning on the Qwen3-0.6B language model to enhance its instruction-following and reasoning capabilities. Training was conducted on the `patrickfleith/instruction-freak-reasoning` dataset using bfloat16 (bf16) precision for efficient optimization.
Training Procedure
Dataset Preparation
- The `patrickfleith/instruction-freak-reasoning` dataset was used (see the loading sketch after this list).
- Each example contains a complex instruction paired with an in-depth reasoning-based response.
- Prompts were structured to encourage chain-of-thought style outputs when applicable.
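A minimal sketch of how the dataset might be loaded and converted into chat-style examples. The column names `instruction` and `response` are assumptions about the dataset schema, not confirmed fields.

```python
from datasets import load_dataset

# Load the instruction/reasoning dataset from the Hugging Face Hub.
dataset = load_dataset("patrickfleith/instruction-freak-reasoning", split="train")

# Convert one record into a chat-style message list.
# The column names "instruction" and "response" are assumed here;
# adjust them to the dataset's actual fields.
def to_messages(example):
    return {
        "messages": [
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["response"]},
        ]
    }

dataset = dataset.map(to_messages)
```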
Model Loading and Configuration
- Qwen3 base model weights were loaded via the `unsloth` library in bf16 precision (a loading sketch follows this list).
- All model layers were fully updated (`full_finetuning=True`) to effectively adapt the model to instruction understanding and stepwise response generation.
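A hedged sketch of the loading step, assuming a recent `unsloth` release that exposes a `full_finetuning` flag on `FastLanguageModel.from_pretrained`. The model name and sequence length are illustrative placeholders.

```python
import torch
from unsloth import FastLanguageModel

# Load Qwen3-0.6B with every layer trainable (no LoRA adapters), in bf16.
# The model name and max_seq_length are illustrative; check your unsloth
# version for the exact from_pretrained signature.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B",
    max_seq_length=4096,
    dtype=torch.bfloat16,
    full_finetuning=True,  # full fine-tuning rather than adapter training
)
```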
Supervised Fine-Tuning
- Fine-tuning was conducted with the Hugging Face TRL library using its Supervised Fine-Tuning (SFT) trainer (a sketch follows this list).
- The model was trained to follow detailed instructions, reason logically, and generate structured responses.
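A minimal SFT sketch, assuming a recent TRL release with `SFTConfig` and a `processing_class` argument. All hyperparameters below are illustrative placeholders, not the values actually used for this checkpoint.

```python
from trl import SFTConfig, SFTTrainer

# Supervised fine-tuning on the chat-formatted dataset prepared above.
# Every hyperparameter here is a placeholder, not the real training recipe.
trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="qwen3-0.6b-if-expert",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
)
trainer.train()
```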
Purpose and Outcome
- The model’s ability to follow complex instructions and explain its reasoning process has been significantly enhanced.
- It generates both coherent reasoning steps and conclusive answers, improving transparency and usability for instruction-based tasks.
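For reference, a hedged usage sketch with the standard `transformers` chat-template flow. The repository name and prompt are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "qwen3-0.6b-if-expert" is a placeholder for the fine-tuned checkpoint path.
repo = "qwen3-0.6b-if-expert"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "Explain, step by step, how you would plan a three-course dinner."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```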
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.