💥 QwenGLM3.5-0.8B
📄 Overview
| Field | Value |
|---|---|
| Model Name | QwenGLM3.5-0.8B |
| Base Model | Qwen3.5-0.8B-Base |
| Dataset | Jackrong/GLM-5.1-Reasoning-1M-Cleaned (5,000 samples) |
| Training Type | Supervised Fine-Tuning (SFT) |
| Parameters | 0.9B |
| Framework | Unsloth + LoRA |
| Hardware | NVIDIA T4 (16 GB) |
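The snippet below is a minimal sketch of the training recipe summarised in the table: Unsloth + LoRA SFT on a single T4. The base-model Hub ID, LoRA rank, sequence length, batch size, and the dataset text column are illustrative assumptions, not the exact values used for this model; adjust them (and the trl/Unsloth versions) to your setup.

```python
# Minimal SFT sketch: Unsloth + LoRA on a 16 GB T4.
# Hyperparameters and column names are illustrative assumptions, not the exact recipe.
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Load the base model in 4-bit so it fits comfortably in 16 GB of VRAM.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3.5-0.8B-Base",  # assumed Hub ID for the base model in the table
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank/alpha are common defaults, not necessarily those used here).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# 5,000-sample subset of the reasoning dataset; the text column name is an assumption.
dataset = load_dataset("Jackrong/GLM-5.1-Reasoning-1M-Cleaned", split="train[:5000]")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # adjust to the dataset's actual column
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=10,
        output_dir="qwenglm-sft",
    ),
)
trainer.train()
```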
🎯 Intended Use
This model is designed for step‑by‑step reasoning tasks, where a problem is logically decomposed before the final answer is produced. It is optimized for:
- Educational applications — explaining "why" and "how" questions
- On‑device assistants — runs on mobile, Raspberry Pi, or CPU‑only environments
- Fast prototyping — small footprint (0.9B parameters), low latency
- Reasoning distillation research — studying how small models learn from large ones (GLM → Qwen)
Not recommended for: multimodal tasks, non‑reasoning chat (e.g., creative writing), or production systems requiring 100% factual accuracy.
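A quick usage sketch with transformers is shown below. The repo ID comes from this model card; the prompt and generation settings are purely illustrative, and the chat-template call assumes the tokenizer ships a chat template (fall back to plain prompting otherwise).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "constructai/QwenGLM3.5-0.8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# A "why/how" style question that benefits from step-by-step reasoning.
messages = [{
    "role": "user",
    "content": "A train covers 60 km in 45 minutes. What is its average speed in km/h? Explain step by step.",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```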
⚠️ Limitations & Intended Use
Intended Use:
- Educational & reasoning tasks — explaining step‑by‑step logic (math, science, common sense)
- On‑device assistants — runs on CPU, Raspberry Pi, mobile (small footprint, fast inference)
- Research baseline — studying SFT‑only reasoning without RLHF/DPO
- Distillation experiments — testing how well small models learn from large ones (GLM → Qwen)
Limitations:
- Size matters — at 0.9B parameters, complex or multi‑hop reasoning may still fail
- No multimodal support — text only; images, video, and audio are not supported
- Factual accuracy — may hallucinate or give incorrect answers; always verify critical outputs
- Domain restricted — trained on 5,000 reasoning examples; general chat or creative writing may be suboptimal
- Training data bias — inherits biases from the GLM-5.1-Reasoning dataset; not safety‑filtered for harmful content
- Hardware specific — optimized for T4/consumer GPUs; very slow on CPU without quantization (see the quantized‑loading sketch below)
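For GPUs with limited VRAM, a 4-bit load via bitsandbytes is one way to keep memory down. The sketch below is an illustrative example under assumed settings, not a configuration this model was validated with; for pure CPU or edge deployment, converting to GGUF and running with llama.cpp is the more common route.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "constructai/QwenGLM3.5-0.8B"

# 4-bit NF4 quantization (bitsandbytes requires a CUDA GPU).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```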
🙏 Acknowledgements
This project would not have been possible without the open‑source community and the following resources:
- Jackrong — for creating the GLM-5.1-Reasoning-1M-Cleaned dataset and sharing detailed fine‑tuning insights. His work inspired the QwenGLM concept.
- Qwen Team (Alibaba Cloud) — for releasing the Qwen3.5-0.8B-Base model under Apache 2.0, a strong balance of size and capability.
- Unsloth AI — for making fine‑tuning on consumer hardware fast and memory‑efficient.
- Hugging Face — for the ecosystem (transformers, datasets, PEFT, Hub) that democratizes LLM training.
- Kaggle — for providing free T4 GPU runtime to run this experiment.
📖 Citation
@misc{QwenGLM3.5-0.8B,
  author       = {constructai},
  title        = {QwenGLM3.5-0.8B: Small Reasoning Model via SFT on GLM Traces},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {https://huggingface.co/constructai/QwenGLM3.5-0.8B},
}