THUDM/GLM-4-32B-0414

#844
by x0wllaar - opened

New model (https://huggingface.co/THUDM/GLM-4-32B-0414); they also have a bunch of other thinking and non-thinking models: https://huggingface.co/collections/THUDM/glm-4-0414-67f3cbcb34dd9d252707cb2e

Should already be supported by llama.cpp.

I haven't checked them all, but unfortunately I don't think any of them are supported yet :(

mradermacher changed discussion status to closed

llama.cpp can run them; there are already other quants on here. Here's the merged PR: https://github.com/ggml-org/llama.cpp/pull/12867.
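
For reference, running one of those existing GGUF quants only needs the stock CLI. A minimal sketch (assuming a built llama.cpp and a downloaded quant; the filename is just a placeholder):

```sh
# Load a downloaded GGUF quant and generate a short completion
./llama-cli -m GLM-4-32B-0414-Q4_K_M.gguf -p "Hello, world" -n 128
```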

The problem is that quantizing currently produces silent corruption; see https://github.com/ggml-org/llama.cpp/pull/12957 for the fix.
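
For context, the corruption shows up in the usual GGUF conversion/quantization workflow, roughly like this (a sketch assuming a local Hugging Face checkout of the model and a built llama.cpp; paths and quant type are illustrative):

```sh
# Convert the Hugging Face checkpoint to a full-precision GGUF file
python convert_hf_to_gguf.py ./GLM-4-32B-0414 --outtype f16 --outfile glm-4-32b-0414-f16.gguf

# Quantize the GGUF -- this is the step affected by the silent corruption
# until the fix from PR 12957 is in your build
./llama-quantize glm-4-32b-0414-f16.gguf glm-4-32b-0414-Q4_K_M.gguf Q4_K_M
```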

Just drop us a note once llama.cpp has support for these models, and we will quantize them.
