Verus
This repository contains model weights and configuration files for Verus-4B in the Hugging Face Transformers format.
Compatible with Hugging Face Transformers, vLLM, SGLang, llama.cpp (GGUF export), and other major inference frameworks.
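As one example of serving outside Transformers, the model can be hosted with vLLM's OpenAI-compatible server. This is a deployment sketch, not an official recipe: it assumes vLLM is installed and that the checkpoint loads under vLLM's standard model support; flag names are the stock vLLM CLI options.

```shell
# Launch an OpenAI-compatible endpoint (requires: pip install vllm).
vllm serve 8F-ai/Verus-4B --dtype bfloat16 --max-model-len 262144

# Query it with any OpenAI-compatible client, e.g. curl:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "8F-ai/Verus-4B", "messages": [{"role": "user", "content": "Write a binary search in Go."}]}'
```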
Primary intended use cases are code generation, code review, debugging, and general coding assistance.
| Property | Value |
|---|---|
| Parameters | ~4B |
| Context Length | 262,144 tokens |
| Architecture | Qwen3.5 |
| Chat Format | ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |
| Dtype | bfloat16 |
| License | Apache 2.0 |
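For reference, this is roughly what a rendered ChatML prompt looks like. The helper below is purely illustrative (it is not part of the model or Transformers); in practice, always use `tokenizer.apply_chat_template`, which applies the checkpoint's own template.

```python
# Illustrative sketch of the ChatML wire format -- do not hand-build
# prompts in production; use tokenizer.apply_chat_template instead.
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {role, content} dicts into a ChatML string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Cue the model to answer as the assistant.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

print(render_chatml([
    {"role": "system", "content": "You are Verus, a coding assistant."},
    {"role": "user", "content": "Reverse a string in Python."},
]))
```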
Install the required packages:

```shell
pip install "transformers>=4.52.0" accelerate torch
# Optional, only needed for the 4-bit quantization example below:
pip install bitsandbytes
```
Basic text-only usage:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "8F-ai/Verus-4B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

messages = [
    {
        "role": "system",
        "content": "You are Verus, a coding assistant made by 8F-ai. You help with coding tasks and keep responses focused and clean."
    },
    {
        "role": "user",
        "content": "Write a Python async context manager that manages a PostgreSQL connection pool using asyncpg."
    }
]

# Render the conversation with the model's ChatML template, then tokenize.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.inference_mode():
    # do_sample=True is required for temperature/top_p to take effect.
    generated_ids = model.generate(
        **inputs, max_new_tokens=2048, do_sample=True, temperature=0.1, top_p=0.95
    )

# Decode only the newly generated tokens, skipping the echoed prompt.
output = tokenizer.decode(generated_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(output)
```
For multimodal input (e.g. UI screenshots), load the checkpoint through its processor rather than the plain tokenizer, since images need preprocessing that `AutoTokenizer` cannot perform:

```python
from transformers import AutoProcessor, AutoModelForImageTextToText
import torch

MODEL_ID = "8F-ai/Verus-4B"

# The processor bundles the tokenizer with the image preprocessor.
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/screenshot.png"},
            {"type": "text", "text": "Convert this UI screenshot into a React component using Tailwind CSS."}
        ]
    }
]

# apply_chat_template handles both the ChatML text and the image tensors.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    generated_ids = model.generate(
        **inputs, max_new_tokens=2048, do_sample=True, temperature=0.1, top_p=0.95
    )

# Decode only the newly generated tokens, skipping the echoed prompt.
output = processor.decode(generated_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(output)
```
To fit the model on smaller GPUs, load it in 4-bit with bitsandbytes (requires `pip install bitsandbytes`):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# NF4 double quantization with bfloat16 compute.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained("8F-ai/Verus-4B")
model = AutoModelForCausalLM.from_pretrained(
    "8F-ai/Verus-4B",
    quantization_config=quantization_config,
    device_map="auto",
)
```
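A back-of-envelope estimate of why 4-bit loading helps, assuming the ~4B parameter count from the table above (weights only; activations and the KV cache add to this):

```python
# Approximate parameter count; "~4B" per the model card.
PARAMS = 4e9

def weight_memory_gb(bytes_per_param: float) -> float:
    """Memory for the weights alone, in GiB."""
    return PARAMS * bytes_per_param / 1024**3

print(f"bf16: ~{weight_memory_gb(2.0):.1f} GiB")   # 2 bytes per parameter
print(f"nf4:  ~{weight_memory_gb(0.55):.1f} GiB")  # ~0.5 bytes + quantization overhead
```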
| Use Case | Example |
|---|---|
| Code Generation | Write functions, classes, scripts in any language |
| Debugging | Identify and fix bugs from error messages or code |
| Code Review | Suggest improvements, catch issues, explain code |
| UI to Code | Convert screenshots or diagrams into working code |
| Long Context Codebase | Reason over entire repos within the 262,144-token (256K) context window |
| General Q&A | Answer programming questions clearly and concisely |
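For the long-context use case, it helps to estimate a repo's token count before sending it. The sketch below uses a rough chars-per-token heuristic for code (our assumption, not a property of the model); for exact counts, run the text through the tokenizer.

```python
from pathlib import Path

# Heuristic: source code averages roughly 3-4 characters per token.
# This is an approximation only -- use the tokenizer for exact counts.
CHARS_PER_TOKEN = 3.5

def estimate_repo_tokens(root: str, suffixes=(".py", ".ts", ".go", ".rs")) -> int:
    """Rough token estimate for all matching source files under `root`."""
    chars = sum(
        len(p.read_text(errors="ignore"))
        for p in Path(root).rglob("*")
        if p.suffix in suffixes and p.is_file()
    )
    return int(chars / CHARS_PER_TOKEN)
```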
```bibtex
@misc{verus4b2026,
  title        = {Verus-4B: A Coding-Focused Multimodal Language Model with 262K Context},
  author       = {8F-ai},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/8F-ai/Verus-4B}},
  note         = {Apache 2.0 License}
}
```
Verus-4B is released under the Apache License 2.0. See LICENSE for full terms.
Derived from Qwen/Qwen3.5-4B (Apache 2.0).