Missing MTP?
It seems the mtp.safetensors file (and index references) were not included in the FP8 version, and MTP doesn't work for this reason (e.g. in vLLM/SGLang).
GLM-4.6 and the GLM-4.5 BF16 model both lack MTP; only GLM-4.5 FP8 has MTP.
FYI, this isn't true: the FP8 weights for GLM-4.5-FP8 include layer 92 and the various MTP weights like eh_proj, enorm, hnorm, etc.: https://huggingface.co/zai-org/GLM-4.5-FP8/blob/main/model-00093-of-00093.safetensors
Intriguingly, eh_proj is in FP16 rather than FP8... Perhaps due to transformers ignoring layer 92, resulting in downstream tools like compressors also ignoring it?
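For anyone who wants to check a shard themselves, here is a rough sketch that reads only the JSON header of a .safetensors file (an 8-byte little-endian length followed by that many bytes of JSON) and lists the dtypes of the tensors under one layer. The `model.layers.92.` name prefix is my assumption about how the MTP tensors are named, and both function names are made up for this example.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header size as an unsigned little-endian u64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def mtp_tensor_dtypes(header, layer_index=92):
    """Map tensor name -> dtype string for tensors under one layer.

    Assumption: MTP tensors are stored as model.layers.<layer_index>.*
    """
    prefix = f"model.layers.{layer_index}."
    return {
        name: info["dtype"]
        for name, info in header.items()
        # "__metadata__" is an optional non-tensor entry in the header.
        if name != "__metadata__" and name.startswith(prefix)
    }
```

After downloading the last shard, something like `mtp_tensor_dtypes(read_safetensors_header("model-00093-of-00093.safetensors"))` should show whether eh_proj and the other MTP tensors are present and which dtype each one uses. An empty dict would mean no tensors with that prefix are in the shard.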
I've revised the response, GLM-4.6-FP8 doesn't have an MTP layer, only GLM-4.5-FP8 does.
Could you add MTP to GLM-4.6-FP8? Or is there a constraint that prevents MTP from being used?