📦 Model Card: Jackrong/gpt-oss-120b-Distill-Qwen3-4B-Thinking
| Key Property | Value |
|---|---|
| Model ID | Jackrong/gpt-oss-120b-Distill-Qwen3-4B-Thinking |
| License | apache-2.0 |
| Author(s) | Jackrong, gpt‑oss team, Qwen authors |
| Base Model | gpt-oss-120b-high (teacher model; source of the distilled complex-reasoning dataset) |
| Target Size | ~ 4B parameters (Qwen3‑4B distilled version) |
🔍 Overview
A deeply distilled and fine-tuned variant of the large language model gpt-oss-120b-high, optimized for human‑friendly, high‑fidelity reasoning. The model preserves the original’s multi‑step thinking patterns while compressing them into a lightweight 4B‑parameter backbone (the “Distill‑Qwen3” architecture). Its signature feature is an explicit point‑by‑point thought chain that makes intricate logic transparent and easy to follow, ideal for education, technical support, and analytical tasks.
💡 Think of it as the “thinking mode” you’d expect from a massive model, compressed into a 4B‑parameter package.
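A minimal quick-start sketch, assuming the checkpoint loads through the standard 🤗 Transformers causal-LM API; the prompt and generation settings below are illustrative assumptions, not official recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Jackrong/gpt-oss-120b-Distill-Qwen3-4B-Thinking"

# Load tokenizer and model; device_map="auto" places weights on available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a chat prompt with the model's own chat template.
messages = [
    {"role": "user", "content": "Why does ice float on water? Think step by step."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling parameters here are assumptions, not tuned values.
outputs = model.generate(inputs, max_new_tokens=1024, temperature=0.6, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```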
🛠️ Technical Details
| Aspect | Specification |
|---|---|
| Source Model | gpt-oss‑120b‑high (teacher; source of the complex-reasoning dataset) |
| Distillation Target | Qwen3‑4B architecture |
| Supervised Fine‑Tuning (SFT) | ~ 30,000 examples drawn from the source’s high‑fidelity reasoning corpus |
| Training Hardware | Single NVIDIA H100‑80GB GPU |
| Max Context Length | 32,768 tokens, enabling multi‑paragraph, long‑form reasoning without truncation (see the streaming sketch after this table) |
| Reasoning Style | Default: Bullet‑point “thought chain” output (e.g., • Step 1 → …\n• Step 2 → …) |
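Because the 32,768‑token window permits very long thought chains, streaming output is often more practical than waiting for the full completion. A sketch using Transformers' `TextStreamer`; the prompt and token cap are assumptions for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "Jackrong/gpt-oss-120b-Distill-Qwen3-4B-Thinking"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Stream tokens to stdout as they arrive, so long thought chains are visible early.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

messages = [{"role": "user", "content": "Derive the quadratic formula, step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# 4096 new tokens is an arbitrary cap, leaving ample headroom in the 32,768-token window.
model.generate(inputs, streamer=streamer, max_new_tokens=4096)
```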
🎯 Recommended Use Cases
| Use case | Why it fits |
|---|---|
| Technical tutorials | Leverage bullet‑point logic for stepwise code walkthroughs |
| Complex queries (e.g., math, engineering) | The model’s deep reasoning helps avoid oversimplified answers |
| User education | Clear, scannable outputs aid learning and reduce confusion |
| Moderation/analysis | The structured format makes it easier to parse responses programmatically |
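For the moderation/analysis row, the default bullet thought chain lends itself to lightweight parsing. A hypothetical sketch: the `extract_steps` helper and the `• Step N → …` pattern are assumptions based on the format documented above, and real outputs may deviate:

```python
import re

def extract_steps(response: str) -> list[str]:
    """Split a bullet-point thought chain into individual reasoning steps.

    Assumes lines shaped like '• Step 1 → ...'; the pattern is a guess
    based on the format documented in this card, not a guaranteed contract.
    """
    pattern = re.compile(r"^\s*•\s*Step\s*\d+\s*→\s*(.+)$", re.MULTILINE)
    return [m.group(1).strip() for m in pattern.finditer(response)]

sample = "• Step 1 → Restate the problem\n• Step 2 → List knowns\n• Step 3 → Solve"
print(extract_steps(sample))
# ['Restate the problem', 'List knowns', 'Solve']
```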
📚 Credits & Contributors
- gpt‑oss team: Provided the high‑fidelity complex‑reasoning dataset.
- Qwen3 authors: Provided the open‑source Qwen3 architecture used as the distillation target.
- Jackrong: Implemented the final SFT and packaging for Hugging Face Hub.