tokenizer_config.json is not correct

#1 opened by depasquale
MLX Community org

tokenizer_config.json does not match the file from the base model here: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B/blob/main/tokenizer_config.json

@warshanks, please fix this. I'm wondering how this could have happened, since normally the config files are copied over from the base model. It's causing the model not to behave properly in applications, because the chat template is missing. This probably also needs to be fixed for any other variants you uploaded.

I'll also ping @awni , since I think he has the ability to edit repos in this community.

depasquale changed discussion title from `tokenizer_config` not correct to tokenizer_config.json is not correct
MLX Community org

It is odd. I'm re-uploading the full thing.

MLX Community org

Should be correct now.

awni changed discussion status to closed
MLX Community org

Weird, thanks awni. I'm going to try updating the others just in case.

MLX Community org

Actually, on second thought, I don't think that's necessary, as it was correct before. When we convert the model from Hugging Face to MLX, the tokenizer gets saved by Hugging Face's libraries, and they must apply a transformation that changes it. If you look in this repo, I just reconverted it and the tokenizer config is unchanged.
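
For reference, the round-trip is roughly this (a minimal sketch; the output path is illustrative):

```python
# Sketch of the load/save round-trip that happens during conversion.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")
# Depending on the transformers version, this may write chat_template.jinja
# instead of embedding "chat_template" in tokenizer_config.json.
tok.save_pretrained("./DeepSeek-R1-0528-Qwen3-8B-mlx")
```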

MLX Community org

See e.g. https://github.com/ml-explore/mlx-lm/blob/main/mlx_lm/utils.py#L511

@depasquale, did you open this issue just based on seeing a difference, or did you notice issues in behavior? Because the model works fine for me with mlx-lm.
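
If anyone wants to reproduce the check, something like this should work (a sketch; the repo id is illustrative):

```python
# Quick sanity check with mlx-lm: apply the chat template and generate.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit")
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(generate(model, tokenizer, prompt=prompt))
```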

MLX Community org

I'm not sure if this issue is related, but I figured I'd mention it.

https://huggingface.co/mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit/discussions/1

MLX Community org

Maybe things work differently on the Swift side. The chat template is taken from tokenizer_config.json, so it needs to be present in that file. Otherwise you get the behavior I was seeing in my app: no chat template is used, the raw prompt is submitted to the model, and Swift Transformers logs a warning to the console. The model then predicts a continuation of the raw prompt, since no chat template was applied.
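
An easy way to see whether a repo still embeds the template is to inspect the config directly (a sketch; the repo id is just the one linked above):

```python
# Check whether tokenizer_config.json carries an embedded chat template,
# which is what Swift Transformers reads.
import json

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="mlx-community/DeepSeek-R1-0528-Qwen3-8B-8bit",
    filename="tokenizer_config.json",
)
with open(path) as f:
    config = json.load(f)
print("chat_template present:", "chat_template" in config)
```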

MLX Community org

I think it's possible this is a recent change in tokenizers/transformers. Previously the chat_template was specified in tokenizer_config.json, and it looks like it has moved to a separate .jinja file. Maybe an update to transformers will fix the issue.

MLX Community org

And @depasquale, if this is how transformers does things moving forward, then the right fix would be to update swift-transformers to read the Jinja file. Let's dig in a little so we know what's going on here.
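
The fallback a loader would need is roughly this (a Python sketch of the idea; the function name and layout are illustrative, not swift-transformers' actual API):

```python
# Prefer the standalone Jinja file; fall back to the legacy key in
# tokenizer_config.json if the file is absent.
import json
import os

def load_chat_template(model_dir):
    jinja_path = os.path.join(model_dir, "chat_template.jinja")
    if os.path.exists(jinja_path):
        with open(jinja_path) as f:
            return f.read()
    with open(os.path.join(model_dir, "tokenizer_config.json")) as f:
        return json.load(f).get("chat_template")
```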

MLX Community org

Relevant PR https://github.com/huggingface/transformers/pull/33957 from a few months back. It looks like they are moving towards chat_template.jinja.

@Rocketknight1 is it correct that we should be updating downstream code to expect a chat_template.jinja for all cases? Presumably we need to make this change on the Swift Transformers side as well. CC @pcuenq @JohnMai

MLX Community org

Thank you! I opened an issue in swift-transformers to track this: https://github.com/huggingface/swift-transformers/issues/204

There's a recent related discussion as well. And yes, transformers/tokenizers now save the chat_template into its own Jinja file by default (with power-user support for multiple chat templates, each saved to its own file).