---
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-Next-80B-A3B-Thinking
---

This is an MXFP4 quant of Qwen3-Next-80B-A3B-Thinking.

Welcome to the bleeding edge. I must point out that this is an *experimental* release. Say it after me: *EXPERIMENTAL*.

This has been made possible by the excellent work done by [pwilkin](https://github.com/pwilkin/llama.cpp) and others. He maintains a development branch of llama.cpp with Qwen3-Next support. It has not yet been merged into mainline, and things are moving quite fast. As of 2025-10-24, I took the source code from his fork and compiled it in order to generate the GGUFs, from here: https://github.com/pwilkin/llama.cpp/tree/qwen3_next

This GGUF will run only with that build (see the build and run sketches at the end of this card). If you cannot compile it yourself, I have made a Windows build with Vulkan support, which you can find here: [llama-qwen3-next-5edfe78-bin-win-vulkan-x64.zip](https://gofile.io/d/qhHL6n)

I should state that this binary may trigger false positives from your antivirus. It has NO virus; I compiled it on my Windows 11 PC, which I check regularly for viruses. If you don't trust strangers giving out binaries, you can compile it yourself to be sure. VirusTotal scan: https://www.virustotal.com/gui/file/35a134a8977488ff6b82ce3f2b5df20da742ec212859a5e0c30813c55519f4f0

When Qwen3-Next support officially lands in mainline llama.cpp, I will check whether these files need a new, updated quantization, and update them if needed.
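
If you want to build the fork yourself, here is a minimal sketch of a standard llama.cpp CMake build against the `qwen3_next` branch. The `GGML_VULKAN` option mirrors the Vulkan build above; leave it off (or swap in your backend of choice) if you don't need Vulkan, and note that exact options may shift as the branch evolves.

```bash
# Clone the development branch with Qwen3-Next support
git clone --branch qwen3_next https://github.com/pwilkin/llama.cpp.git
cd llama.cpp

# Configure; -DGGML_VULKAN=ON enables the Vulkan backend (requires the Vulkan SDK)
cmake -B build -DGGML_VULKAN=ON

# Build the binaries (llama-cli, llama-server, etc.) in Release mode
cmake --build build --config Release -j
```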
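
If you grabbed the Windows zip instead, you can sanity-check your download: VirusTotal file pages are keyed by the scanned file's SHA-256, so the digest of what you downloaded should match the hex string in the link above. On Windows, `certutil -hashfile <file> SHA256` prints the same digest.

```bash
# The digest printed here should match the SHA-256 in the VirusTotal link above
sha256sum llama-qwen3-next-5edfe78-bin-win-vulkan-x64.zip
```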
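
Either way, the quant then runs like any other GGUF, just with this fork's binaries. A minimal sketch follows; the GGUF file name is a placeholder for whichever file you download from this repo, tune `-ngl` (GPU layer offload) and `-c` (context size) to your hardware, and on Windows the binary typically lands under `build\bin\Release\` instead.

```bash
# The model file name is a placeholder; use the GGUF you downloaded from this repo
./build/bin/llama-cli \
  -m Qwen3-Next-80B-A3B-Thinking-MXFP4.gguf \
  -ngl 99 \
  -c 8192 \
  -p "Say it after me: EXPERIMENTAL."
```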