https://huggingface.co/collections/qingy2024/qwen-3-vlto
Would love some GGUFs of the 32B Thinking/Instruct. They're basically Qwen3 VL 32B Instruct/Thinking with the vision part removed, so essentially an upgraded Qwen3 32B that already works with llama.cpp.
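For what it's worth, the conversion boils down to dropping the vision-tower tensors and keeping the language-model stack. Here's a minimal sketch of that idea, not the actual conversion script; the shard name and key prefixes below are assumptions and can differ per checkpoint and transformers version:

```python
# Sketch: strip the vision tower from one Qwen3-VL safetensors shard.
# Assumptions (may differ between checkpoints / transformers versions):
#   - vision-tower tensors sit under a "visual." style prefix
#   - text-stack tensors may carry a "model.language_model." prefix that a
#     plain Qwen3 checkpoint would just call "model."
# config.json would also need rewriting for the text-only Qwen3ForCausalLM
# architecture; that part is omitted here.
from safetensors.torch import load_file, save_file

state = load_file("model-00001-of-00014.safetensors")  # hypothetical shard name

text_only = {}
for name, tensor in state.items():
    if name.startswith("visual.") or ".visual." in name:  # drop the vision tower
        continue
    # strip the assumed multimodal wrapper prefix so keys match plain Qwen3
    text_only[name.replace("model.language_model.", "model.")] = tensor

save_file(text_only, "text-only-00001-of-00014.safetensors")
```

Repeat per shard and regenerate the weight index; the config and processor files need the equivalent text-only treatment.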
They are all queued! :D
Thanks a lot for creating them. It's such a shame that Qwen3 VL isn't supported by llama.cpp yet. I wasn't aware that you can simply remove the vision part from them. That's so cool. Great work!
You can follow progress at http://hf.tst.eu/status.html or regularly check the model summary pages for quants to appear under:
- https://hf.tst.eu/model#Qwen3-VLTO-32B-Thinking-GGUF
- https://hf.tst.eu/model#Qwen3-VLTO-32B-Instruct-GGUF
- https://hf.tst.eu/model#Qwen3-VLTO-8B-Thinking-GGUF
- https://hf.tst.eu/model#Qwen3-VLTO-8B-Instruct-GGUF
- https://hf.tst.eu/model#Qwen3-VLTO-4B-Instruct-GGUF
- https://hf.tst.eu/model#Qwen3-VLTO-1.7B-Instruct-GGUF
Awesome, thanks for queueing them!!
I just checked and apparently the Qwen3 VL PR was approved just 2 hours ago, so it looks like support is coming soon anyway. Still, the new 32Bs and 8Bs have some nice text-only performance improvements, so these are great for people who want to fine-tune or run them without dealing with the more finicky aspects of VL models :)
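Once the quants land they should load like any plain Qwen3 GGUF, with no mmproj or vision handling involved. A quick sketch with llama-cpp-python, where the file name is just a guess at the usual quant naming:

```python
# Minimal example via llama-cpp-python; since the VLTO checkpoints are plain
# Qwen3 text models, nothing VL-specific is needed.
from llama_cpp import Llama

# Hypothetical local file name for one of the quants once they appear.
llm = Llama(model_path="Qwen3-VLTO-8B-Instruct.Q4_K_M.gguf", n_ctx=8192)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me a one-line summary of GGUF."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```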