Can we do this with fewer active tokens?
#3 opened by ccocks-deca
https://huggingface.co/inclusionAI/Ling-1T/blob/main/config.json#L22
Can we set this to 4? How bad would performance be? Kimi K2 only activates 32B parameters per token.
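For anyone who wants to try this locally, here is a minimal sketch of the override. It assumes the linked config line is the per-token expert count and that its key is `num_experts_per_tok`; I have not confirmed either, and the helper name `with_fewer_experts` and the toy values are made up for illustration.

```python
import copy
import json

# Assumption: the config line in question is the number of experts routed
# per token, stored under this key. Check the real config.json first.
EXPERT_KEY = "num_experts_per_tok"


def with_fewer_experts(config: dict, k: int, key: str = EXPERT_KEY) -> dict:
    """Return a copy of `config` with the per-token expert count set to `k`.

    Hypothetical helper: it only edits the dict and does not load the model.
    """
    if key not in config:
        raise KeyError(f"{key!r} not found; the config may use a different name")
    if not 1 <= k <= config[key]:
        raise ValueError(f"k must be between 1 and {config[key]}, got {k}")
    patched = copy.deepcopy(config)  # leave the original config untouched
    patched[key] = k
    return patched


# Toy stand-in for config.json; these numbers are illustrative, not Ling-1T's.
toy_config = {"num_experts_per_tok": 8, "num_experts": 64}
patched = with_fewer_experts(toy_config, 4)
print(json.dumps(patched))
```

The router was trained with the original count, so output quality at a lower value would need to be measured rather than assumed.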