
This is basically a test to see whether conversion and inference in llama.cpp work correctly. It seems to work, though I won't add more quant sizes for now.

Since this is merely a quantization of the original model, the license of the original model still applies!
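As a minimal sketch of the inference test described above: download the GGUF file and chat with it via llama.cpp's `llama-cli`. This assumes llama.cpp is already built and on your `PATH`; the actual `.gguf` filename must be taken from the repo's file listing.

```shell
# Download the GGUF file(s) from the repo (filename is a placeholder;
# check the repo's "Files" tab for the real name).
huggingface-cli download QuantStack/InternVL3_5-1B-gguf \
  --include "*.gguf" --local-dir ./InternVL3_5-1B-gguf

# Run an interactive (conversational) chat session with the 16-bit file.
llama-cli -m ./InternVL3_5-1B-gguf/<model-file>.gguf -cnv
```

`-cnv` starts llama-cli in conversation mode, which matches the "conversational" use this card was tested for.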

Format: GGUF
Model size: 752M params
Architecture: qwen3
Quantization: 16-bit


Model tree for QuantStack/InternVL3_5-1B-gguf