Update usage instructions and adjust model size reference

  • Updated usage examples for loading the model with Transformers
  • Updated vLLM usage, added add_special_tokens=True to ensure correct chat formatting (e.g., BOS token)
funmaker changed pull request status to closed

Sign up or log in to comment