Missing MTP?

#1
by jondurbin - opened

It seems the mtp.safetensors file (and index references) were not included in the FP8 version, and MTP doesn't work for this reason (e.g. in vLLM/SGLang).

Neither the GLM-4.6 nor the GLM-4.5 BF16 models include MTP; only GLM-4.5 FP8 has MTP.

FYI, this isn't true: the FP8 weights for GLM-4.5-FP8 include layer 92, and the various MTP weights like eh_proj, enorm, hnorm etc: https://huggingface.co/zai-org/GLM-4.5-FP8/blob/main/model-00093-of-00093.safetensors
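For anyone who wants to check a checkpoint themselves, here is a minimal sketch that scans a sharded checkpoint's `weight_map` (the mapping stored in `model.safetensors.index.json`) for MTP-looking tensor names. Treating the `eh_proj`, `enorm`, and `hnorm` fragments mentioned above as MTP markers is an assumption, and the helper names `find_mtp_tensors` and `load_weight_map` are hypothetical.

```python
import json
from collections import defaultdict

# Name fragments taken from the comment above; using them as MTP markers is an assumption.
MTP_MARKERS = ("eh_proj", "enorm", "hnorm")


def load_weight_map(index_path):
    """Read the tensor-name -> shard-file mapping from a safetensors index file."""
    with open(index_path, "r", encoding="utf-8") as f:
        return json.load(f)["weight_map"]


def find_mtp_tensors(weight_map, markers=MTP_MARKERS):
    """Group tensor names containing any marker by the shard file that stores them."""
    by_shard = defaultdict(list)
    for name, shard in weight_map.items():
        if any(marker in name for marker in markers):
            by_shard[shard].append(name)
    # Sort names so the result is stable regardless of the index ordering.
    return {shard: sorted(names) for shard, names in by_shard.items()}
```

An empty result for a locally downloaded index would suggest the MTP tensors are not referenced by that checkpoint, though it only proves the chosen name fragments are absent.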

Intriguingly, eh_proj is in FP16 rather than FP8... Perhaps due to transformers ignoring layer 92, resulting in downstream tools like compressors also ignoring it?
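The dtype of a tensor can be read without loading any weights, since a `.safetensors` file starts with an 8-byte little-endian header length followed by a JSON header that lists each tensor's `dtype`, `shape`, and `data_offsets`. The sketch below assumes the shard has already been downloaded locally; the helper names `read_safetensors_header` and `dtypes_matching` are hypothetical.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def dtypes_matching(header, fragment):
    """Map tensor names containing `fragment` to their stored dtype string."""
    return {
        name: meta["dtype"]
        for name, meta in header.items()
        if name != "__metadata__" and fragment in name  # skip the optional metadata entry
    }
```

Calling `dtypes_matching(header, "eh_proj")` on the last shard would show whether that projection is stored as a 16-bit float or in an FP8 format.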

I've revised the response: GLM-4.6-FP8 doesn't have an MTP layer; only GLM-4.5-FP8 does.

Could you add MTP to GLM-4.6-FP8? Or is there a constraint that prevents MTP from being used?