💥 Qwenite3.5-0.8B
📄 Overview
| Field | Value |
|---|---|
| Model Name | constructai/Qwenite3.5-0.8B |
| Base Model | Qwen3.5-0.8B-Base |
| Dataset | constructai/Granite-v4.1-Distilled-15K |
| Training Type | Supervised Fine-Tuning (SFT) |
| Parameters | 0.9B |
| Framework | Unsloth + LoRA |
| Hardware | NVIDIA T4 16GB |
🎯 Intended Use
This model is designed for step‑by‑step reasoning tasks where the answer requires logical decomposition before the final response. It is optimized for:
- Educational applications — explaining "why" and "how" questions
- On‑device assistants — runs on mobile, Raspberry Pi, or CPU‑only environments
- Fast prototyping — small footprint (0.9B parameters), low latency
- Reasoning distillation research — studying how small models learn from large ones (Granite → Qwen)
Not recommended for: multimodal tasks, non‑reasoning chat (e.g., creative writing), or production systems requiring 100% factual accuracy.
⚠️ Limitations & Intended Use
Intended Use:
- Educational & reasoning tasks — explaining step‑by‑step logic (math, science, common sense)
- On‑device assistants — runs on CPU, Raspberry Pi, mobile (small footprint, fast inference)
- Research baseline — for studying SFT‑only reasoning without RLHF/DPO
- Distillation experiments — testing how well small models learn from large ones (Granite → Qwen)

Limitations:
- Size matters — at 0.9B parameters, complex or multi‑hop reasoning may still fail
- No multimodal input — text only; images, video, and audio are not supported
- Factual accuracy — may hallucinate or give incorrect answers; always verify critical outputs
- Domain restricted — trained on 15,000 reasoning examples (2.5 epochs); general chat and creative writing may be suboptimal
- Training data bias — inherits biases from the constructai/Granite-v4.1-Distilled-15K dataset; not safety‑filtered for harmful content
- Hardware specific — optimised for T4/consumer GPUs; very slow on CPU without quantisation (see the sketch below)
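On the last point, the snippet below is a minimal, illustrative sketch of loading the model in 4‑bit via bitsandbytes to reduce memory and latency on a small GPU. The quantisation settings shown are assumptions for illustration, not a recommendation from this card; for truly CPU‑only targets such as a Raspberry Pi, converting the weights to GGUF and running them with llama.cpp is the more typical route.

```python
# Sketch: load Qwenite3.5-0.8B in 4-bit with bitsandbytes (assumes a CUDA GPU is available).
# The quantisation settings below are illustrative defaults, not the author's configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "constructai/Qwenite3.5-0.8B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantise weights to 4-bit
    bnb_4bit_quant_type="nf4",             # NF4 quantisation (assumed choice)
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 on the GPU
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```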
🛠️ Training Details
This experiment went surprisingly well: the small Qwen3.5-0.8B-Base model did an excellent job and showed decent results. Thanks to well-chosen LoRA hyperparameters (r=32, alpha=64) and the high-quality synthetic dataset Granite-v4.1-Distilled-15K, the training loss dropped below 0.8, and the model consistently gives correct answers on validation examples (such as the monkeys-on-branches task below). You can try out Qwenite3.5-0.8B with the following code; a sketch of the training setup follows the inference example.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "constructai/Qwenite3.5-0.8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def ask(question):
    # Build a ChatML-style prompt and generate with low temperature for stable reasoning
    prompt = f"<|im_start|>user\n{question}\nAnswer concisely:<|im_end|>\n<|im_start|>assistant\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.1, do_sample=True)
    # Decode only the newly generated tokens, skipping the prompt
    answer = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
    return answer

test_questions = [
    "On one branch there are 2 monkeys. On two such branches there are 4 monkeys. Now answer: How many on 3 branches?",
]

for q in test_questions:
    print(f"Q: {q}")
    print(f"A: {ask(q)}\n{'-'*50}")
```
🙏 Acknowledgements
This project would not have been possible without the open‑source community and the following resources:
- Qwen Team (Alibaba Cloud) — for releasing the Qwen3.5-0.8B-Base model under Apache 2.0, a perfect balance of size and intelligence.
- Unsloth AI — for making fine‑tuning on consumer hardware fast and memory‑efficient.
- Hugging Face — for the ecosystem (transformers, datasets, PEFT, Hub) that democratises LLM training.
- Kaggle — for providing free T4 GPU runtime to run this experiment.
📖 Citation
```bibtex
@misc{Qwenite3.5-0.8B,
  author       = {constructai},
  title        = {Qwenite3.5-0.8B: Small Reasoning Model via SFT on Granite Traces},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {https://huggingface.co/constructai/Qwenite3.5-0.8B},
}
```