vocab_size missing in IndicTransConfig, breaking generation in latest Transformers
Hi AI4Bharat Team,
I'm using the ai4bharat/indictrans2-en-indic-1B model for a thesis project. Until recently, everything worked perfectly in both local and Colab environments. However, after a recent update, the model throws the following error during generate():
AttributeError: 'IndicTransConfig' object has no attribute 'vocab_size'
This appears to be because vocab_size is missing from the current config.json, and Hugging Face's generate() relies on it during beam search. This breaks all downstream use of the model: even simple batch translations fail.
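For context, a minimal sketch of the call pattern that triggers the error looks roughly like this. The language-tag prefix in the input is a simplified stand-in for the proper IndicTrans2 preprocessing, so treat the exact input format as an assumption; the point is only that generate() is where the AttributeError surfaces:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "ai4bharat/indictrans2-en-indic-1B"

# IndicTrans2 ships custom modeling/tokenization code, so trust_remote_code is needed.
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME, trust_remote_code=True)

# Simplified, illustrative input; my real pipeline applies the usual preprocessing.
batch = tokenizer(
    ["eng_Latn hin_Deva This is a test sentence."],
    return_tensors="pt",
    padding=True,
)

with torch.no_grad():
    # This is the call that now raises:
    # AttributeError: 'IndicTransConfig' object has no attribute 'vocab_size'
    output_ids = model.generate(**batch, num_beams=5, max_length=256)

print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```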
Steps I've tried:
- Patching config.json manually (adding a vocab_size key); a code-level variant of this workaround is sketched after this list.
- Downgrading transformers and huggingface_hub.
- Trying old revisions (which now 404).
- Using both CPU and GPU environments: same issue.
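For completeness, here is a minimal sketch of the in-code patch I attempted (the code-level variant of the first bullet above). The fallback values are assumptions on my part: I read the output-embedding size via get_output_embeddings() and fall back to len(tokenizer), since I couldn't confirm which value the config is actually expected to carry:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "ai4bharat/indictrans2-en-indic-1B"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME, trust_remote_code=True)

# Attempted workaround: populate config.vocab_size before calling generate(),
# so beam search has a vocabulary size to read.
if getattr(model.config, "vocab_size", None) is None:
    lm_head = model.get_output_embeddings()
    if lm_head is not None:
        # Assumption: the output (decoder) vocabulary size is the value generate() needs.
        model.config.vocab_size = lm_head.weight.shape[0]
    else:
        # Fallback assumption: the tokenizer length approximates the vocabulary size.
        model.config.vocab_size = len(tokenizer)

print("patched vocab_size:", model.config.vocab_size)
```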
This issue is critical for me, as my final presentation is next week (I'm a final-year student at Trinity College Dublin). Please advise whether:
- An older working snapshot can be restored.
- The vocab_size key can be reintroduced in the config or handled in code.
Thanks again for the incredible work on IndicTrans2; this model is genuinely important for bridging language barriers.
Best regards,
Aditya  
Update: this has been resolved.

