How to increase context length to 256k?
#21 · opened by JC1DA
It seems the default is only 32k. How can I properly increase it to 256k in vLLM?
Thanks!
Thanks for your support.
We've updated the README on the HF repo and added a section on how to serve the model with a 256k context in vLLM.
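For anyone landing here before checking the README, a setup along these lines can be done with the vLLM Python API. This is only a sketch, assuming a YaRN-style RoPE extension is what the README describes: the repo id, scaling factor, and original context length below are placeholders, so take the exact values from the README section mentioned above.

```python
from vllm import LLM, SamplingParams

# Minimal sketch of serving with an extended context window.
# All values below are assumptions -- consult the model's README:
#   - "org/model-name" stands in for the actual repo id
#   - factor 8.0 assumes 32k * 8 = 256k
llm = LLM(
    model="org/model-name",   # placeholder repo id
    max_model_len=262144,     # 256k context window
    rope_scaling={
        "rope_type": "yarn",
        "factor": 8.0,        # assumed extension factor
        "original_max_position_embeddings": 32768,
    },
)

# Quick smoke test that the engine loads and generates.
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```

The same settings map to the CLI via `--max-model-len` and `--rope-scaling` when serving an OpenAI-compatible endpoint.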
JC1DA changed discussion status to closed