YOLO-Coder

Website  |  GitHub  |  Twitter  |  Dataset  |  YOLO-Coder-1.5B

License: MIT  |  Author: @erdemwrites

YOLO-Coder-8B

Fix broken CLI commands. One command output. Runs 100% locally.
Fine-tuned Qwen2.5-Coder-7B · MLX LoRA on Apple Silicon · No API key needed

🎯 Task: CLI error → single bare bash fix command
🏆 Accuracy: 77.1% pipeline×3 · 59.2% raw LLM (beats GPT-4o)
💾 Size: ~4.4GB Q4_K_M GGUF · ~6GB RAM
Speed: 1–3s on Apple Silicon
🔒 Privacy: 100% local · no API key · no telemetry

Quickstart

ollama run hf.co/erdemozkan/YOLO-Coder-8B "ModuleNotFoundError: No module named 'flask'"
# → pip install flask

That's it. No account. No cloud. No cost per call.
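The same call can be made from a script. The sketch below assumes an Ollama server is already running on its default local port and that the model has been pulled; the helper names `build_request` and `suggest_fix` are illustrative and not part of YOLO-Coder.

```python
import json
import urllib.request

MODEL = "hf.co/erdemozkan/YOLO-Coder-8B"
# Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(error_text: str, model: str = MODEL) -> dict:
    """Build the JSON body for one non-streaming generation request."""
    return {"model": model, "prompt": error_text, "stream": False}


def suggest_fix(error_text: str, url: str = OLLAMA_URL) -> str:
    """Send the error text to the local Ollama server and return the reply."""
    body = json.dumps(build_request(error_text)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["response"].strip()
```

Calling `suggest_fix("ModuleNotFoundError: No module named 'flask'")` should return a single command string, provided the server is reachable.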

Benchmark — YOLO-Bench

218 verified CLI errors · structural match scoring (flag-order-independent)

YOLO-Coder-8B    pipeline×3  ████████████████████  77.1%  ★ best overall
YOLO-Coder-1.5B  pipeline×3  ██████████████████    71.1%
Claude Sonnet    raw         ████████████████      60.1%
YOLO-Coder-8B    raw         ███████████████       59.2%  ★ best offline raw score
GPT-4o           raw         ████████████          48.6%
YOLO-Coder-1.5B  raw         ██████████            42.2%
Mode (YOLO-Coder-8B)                                   Structural match
Raw LLM (no pipeline)                                  59.2%
Pipeline × 1 (interceptors + LLM)                      72.0%
Pipeline × 3 (interceptors + memory + 3 LLM attempts)  77.1%
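To put the mode percentages in terms of benchmark cases, the short calculation below converts each score into the nearest whole number of the 218 cases. The counts are derived from the published percentages, not reported by the author, so treat them as approximate.

```python
TOTAL_CASES = 218  # YOLO-Bench size as stated in this card

# Structural-match scores for YOLO-Coder-8B, copied from the mode table.
MODES = {
    "raw": 59.2,
    "pipeline×1": 72.0,
    "pipeline×3": 77.1,
}


def implied_passes(percent: float, total: int = TOTAL_CASES) -> int:
    """Nearest whole number of passing cases for a reported percentage."""
    return round(percent / 100 * total)


def point_gain(before: str, after: str) -> float:
    """Percentage-point difference between two modes."""
    return round(MODES[after] - MODES[before], 1)


if __name__ == "__main__":
    for mode, pct in MODES.items():
        print(f"{mode:<11} {pct:>5.1f}%  ~{implied_passes(pct)}/{TOTAL_CASES}")
    print("interceptors add", point_gain("raw", "pipeline×1"), "points")
    print("memory + retries add", point_gain("pipeline×1", "pipeline×3"), "points")
```

On these numbers, most of the gain over the raw model comes from the first pipeline pass; the memory and retry stages add a smaller increment on top.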

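YOLO-Coder-8B with pipeline×3 posts the highest score of any configuration tested while running entirely offline. The GPT-4o and Claude Sonnet figures are raw LLM scores without the pipeline; in that raw comparison YOLO-Coder-8B (59.2%) beats GPT-4o (48.6%) and trails Claude Sonnet (60.1%) by under a point.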

Scoring code and dataset: github.com/erdemozkan/YOLO-CODER/tree/main/benchmark
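For intuition about what "flag-order-independent" structural matching could mean, here is a minimal sketch. It is an assumption about the general idea, not the benchmark's actual scorer (that lives in the repository linked above and may differ); the function names are hypothetical, and the sketch deliberately ignores flags that take a separate value.

```python
import shlex
from collections import Counter


def structure(command: str):
    """Split a command into (program, flags as a multiset, ordered positionals)."""
    tokens = shlex.split(command)
    if not tokens:
        return ("", Counter(), ())
    program, rest = tokens[0], tokens[1:]
    # Flags are compared as a multiset, so their order does not matter.
    flags = Counter(t for t in rest if t.startswith("-"))
    # Positional arguments keep their order (e.g. "install" before "flask").
    positionals = tuple(t for t in rest if not t.startswith("-"))
    return (program, flags, positionals)


def structural_match(predicted: str, expected: str) -> bool:
    """True when both commands share program, flags, and positionals."""
    return structure(predicted) == structure(expected)
```

Under this simplification, `pip install --user --upgrade flask` matches `pip install --upgrade --user flask`, while `pip install requests` does not match `pip install flask`.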

How the pipeline works

Your error → [91 interceptors <1ms] → [fix memory <5ms] → [LLM 1-3s] → Fix
                ↑ ~50% of fixes stop here

Half of all fixes never reach the LLM. The model is the safety net, not the first guess.
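The three stages can be sketched as a simple fall-through. Everything named here is hypothetical: the two regex rules stand in for the 91 real interceptors, the memory is a plain dictionary, and `llm` is any callable that returns a command string. The real YOLO-CODER implementation may be organised quite differently.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical interceptor rules: (pattern, fix template).
INTERCEPTORS = [
    (re.compile(r"ModuleNotFoundError: No module named '([\w\-]+)'"), "pip install {0}"),
    (re.compile(r"Cannot find module '([\w@/\-]+)'"), "npm install {0}"),
]


def repair(
    error_text: str,
    memory: Dict[str, str],
    llm: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (fix command, stage that produced it)."""
    # Stage 1: cheap pattern rules, tried in order.
    for pattern, template in INTERCEPTORS:
        match = pattern.search(error_text)
        if match:
            return template.format(*match.groups()), "interceptor"
    # Stage 2: fixes remembered from earlier runs, keyed by the error text.
    key = error_text.strip()
    if key in memory:
        return memory[key], "memory"
    # Stage 3: fall back to the model only when nothing cheaper applies.
    return llm(error_text).strip(), "llm"
```

Ordering the stages from cheapest to most expensive is what lets the model act as the fallback: it is only invoked when neither a rule nor a remembered fix applies.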

Usage with YOLO-CODER

pip install yolo-coder

yoco python3 myapp.py        # 8B is the default
yoco npm run dev
yoco --model hf.co/erdemozkan/YOLO-Coder-8B python3 myapp.py

Prompt format (ChatML)

<|im_start|>system
You are a CLI repair tool. Output ONLY a single bare bash command to fix the error. No explanation. No markdown. No backticks.<|im_end|>
<|im_start|>user
[Linux] $ python3 myapp.py
Error:
ModuleNotFoundError: No module named 'requests'
FIX:<|im_end|>
<|im_start|>assistant
pip install requests<|im_end|>
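If you call the model without a chat template, the prompt above can be assembled by hand. This helper is a sketch that mirrors the layout shown; `build_prompt` and its parameter names are illustrative, and the string ends at the assistant tag so the model generates the fix itself.

```python
SYSTEM_PROMPT = (
    "You are a CLI repair tool. Output ONLY a single bare bash command "
    "to fix the error. No explanation. No markdown. No backticks."
)


def build_prompt(os_name: str, command: str, error: str) -> str:
    """Assemble a ChatML prompt in the layout shown in this card."""
    user = f"[{os_name}] $ {command}\nError:\n{error}\nFIX:"
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_prompt(
        "Linux",
        "python3 myapp.py",
        "ModuleNotFoundError: No module named 'requests'",
    ))
```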

Training

"Trained on a MacBook Air. No rented A100s."

Property             Value
Base model           Qwen/Qwen2.5-Coder-7B-Instruct
Fine-tune method     LoRA via MLX on Apple Silicon
LoRA rank / scale    8 / 20.0
Layers trained       28
Training iterations  500
Learning rate        1e-5
Training examples    6,719 error/fix pairs across 15 categories
Export               Merged weights → Q4_K_M GGUF for Ollama

Files

File                       Description
YOLO-Coder-8B-Q4_K_M.gguf  Q4_K_M quantized GGUF (~4.4GB); use this with Ollama
safetensors/               fp16 safetensors for further fine-tuning

1.5B vs 8B

                     YOLO-Coder-1.5B          YOLO-Coder-8B
Size                 ~941MB                   ~4.4GB
RAM needed           ~2GB                     ~6GB
Speed                <1s on Apple Silicon     1–3s on Apple Silicon
Raw accuracy         42.2%                    59.2%
Pipeline×3 accuracy  71.1%                    77.1%
Best for             Speed, low-RAM machines  Hard errors, best accuracy

Limitations

  • Single-command output only — not designed for multi-step fixes without a wrapper
  • Complex or highly novel errors may produce suboptimal output
  • Not a general-purpose coding assistant

License

MIT

