How to increase context length to 256k?

#21
by JC1DA - opened

It seems the default context length is only 32k. How can I properly increase it to 256k in vLLM?
Thanks

Tencent org

Thanks for your support.
We've updated the HF README: a section was added on how to serve the model with a 256k context in vLLM.
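For reference, here is a minimal sketch of extending the context window with vLLM's Python API. The model id and the 262144-token limit are illustrative placeholders, not the exact values from the README; check the model card for the precise settings (including any RoPE-scaling configuration) the model requires.

```python
# Minimal sketch: raise the serving context window above the 32k default.
# Assumptions: model id and 262144 limit are placeholders; consult the
# model card for the values and RoPE-scaling config it actually requires.
from vllm import LLM, SamplingParams

llm = LLM(
    model="tencent/Hunyuan-A13B-Instruct",  # placeholder model id
    max_model_len=262144,                   # 256k tokens instead of the 32k default
    # Long-context checkpoints often rely on RoPE scaling; if it is not
    # already enabled in the checkpoint's config, follow the README rather
    # than overriding it blindly.
)

params = SamplingParams(max_tokens=256)
outputs = llm.generate(["Summarize this document: ..."], params)
print(outputs[0].outputs[0].text)
```

The equivalent flag when launching an OpenAI-compatible server is `--max-model-len`, passed to `vllm serve`.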

Thanks @asherszhang, appreciate it

JC1DA changed discussion status to closed
