This is an MXFP4_MOE quantization (noctrex/Qwen3-VL-235B-A22B-Thinking-1M-MXFP4_MOE-GGUF) of the model Qwen3-VL-235B-A22B-Thinking-1M.

Original model: https://huggingface.co/unsloth/Qwen3-VL-235B-A22B-Thinking-1M

This is the unsloth version, which expands the context size from 256K to 1M tokens.

Download the latest llama.cpp to use these files.
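As a sketch of how the quantized files might be used (assuming a recent llama.cpp build that supports the `-hf` download flag; the port and context size below are illustrative, and multimodal use may additionally require the matching mmproj file):

```shell
# Serve the model over llama.cpp's HTTP API;
# -hf pulls the GGUF shards from the Hugging Face repo on first run.
llama-server -hf noctrex/Qwen3-VL-235B-A22B-Thinking-1M-MXFP4_MOE-GGUF \
    -c 32768 --port 8080
```

Note that a 235B-parameter model, even at 4-bit, requires substantial memory, so adjust the context size (`-c`) to what your hardware can hold.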

Format: GGUF
Model size: 235B params
Architecture: qwen3vlmoe
Quantization: 4-bit (MXFP4_MOE)

