# Static GGUF Quants of allura-org/Q3-8B-Kintsugi
GGUF quants of allura-org/Q3-8B-Kintsugi, quantized with llama.cpp.
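For reference, quants like these are typically produced with llama.cpp's conversion and quantization tools. The sketch below is a minimal, hypothetical reproduction assuming a local llama.cpp checkout; the file paths and output names are assumptions, not the exact commands used for this repo.

```python
import subprocess

# Convert the original HF checkpoint to a full-precision GGUF.
# convert_hf_to_gguf.py ships with llama.cpp; paths here are assumptions.
subprocess.run(
    [
        "python", "convert_hf_to_gguf.py", "Q3-8B-Kintsugi",
        "--outfile", "Q3-8B-Kintsugi-FP16.gguf",
        "--outtype", "f16",
    ],
    check=True,
)

# Quantize the FP16 GGUF down to one of the listed types (Q4_K_M shown).
subprocess.run(
    [
        "./llama-quantize",
        "Q3-8B-Kintsugi-FP16.gguf",
        "Q3-8B-Kintsugi-Q4_K_M.gguf",
        "Q4_K_M",
    ],
    check=True,
)
```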
## Quants
| Quant | Link | Notes |
|---|---|---|
| FP16 | Download | Lossless. Not recommended unless you have VRAM to spare. |
| Q8_0 | Download | Basically lossless, half the size of FP16. |
| Q6_K | Download | Near-lossless, slightly smaller than Q8_0. |
| Q5_K_M | Download | Good quality/size balance; smaller than Q6_K with some loss. |
| Q5_K_S | Download | Slightly smaller than Q5_K_M with marginally more quality loss; still usable. |
| Q4_K_M | Download | Usable for some tasks; significantly smaller than the Q5 variants. |
| Q4_K_S | Download | More compact than Q4_K_M; suitable for memory-constrained devices. |
| Q3_K_M | Download | Very small; noticeable quality loss. |
| Q3_K_S | Download | More compact than Q3_K_M, trading further quality for size. |
| Q2_K | Download | Very small and needs minimal resources, but near-incoherent for most use cases. Not recommended unless you are on a Samsung Galaxy S5. |
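To run one of these quants locally, here is a minimal sketch using the llama-cpp-python bindings. The filename is an assumption; substitute whichever quant you downloaded from the table above.

```python
from llama_cpp import Llama

# Load a downloaded quant. The filename is an assumption;
# point model_path at the GGUF file you actually fetched.
llm = Llama(
    model_path="Q3-8B-Kintsugi-Q4_K_M.gguf",
    n_ctx=4096,        # context window; raise if you have the memory
    n_gpu_layers=-1,   # offload all layers to GPU when a backend is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a haiku about kintsugi."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```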