Efficient-Large-Model / NVILA-AWQ


---
license: apache-2.0
---
This repository provides AWQ-quantized checkpoints of the most popular NVILA models. Use these files with TinyChat to run NVILA efficiently on your hardware.

One-command demo to chat with quantized NVILA models via [llm-awq](https://github.com/mit-han-lab/llm-awq/tree/main/tinychat#:~:text=TinyChat%20support%20NVILA) (NVILA-8B as an example):

```bash
cd llm-awq/tinychat
python nvila_demo.py --model-path PATH/TO/NVILA \
    --quant_path NVILA-8B-w4-g128-awq-v2.pt \
    --media PATH/TO/ANY/IMAGES/VIDEOS \
    --act_scale_path NVILA-8B-VT-smooth-scale.pt \
    --all --chunk --model_type nvila
```

This command downloads the quantized NVILA model and starts a chat demo. If you have already downloaded the files, point the corresponding path arguments at your local copies.
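To switch between downloaded and local files without editing the command by hand, a small wrapper can pick the local copy when one exists. The following is a minimal sketch, not part of llm-awq: `resolve_ckpt` and `CKPT_DIR` are hypothetical names, `$HOME/nvila-awq` is an assumed download location, and the fallback relies on the statement above that passing the bare file name lets the demo fetch the file. It prints the resulting command rather than running it.

```bash
#!/usr/bin/env bash
# Sketch only: resolve_ckpt and CKPT_DIR are illustrative names, not llm-awq features.

# Print "$dir/$name" if that file exists locally; otherwise print the bare
# file name, which (per the note above) leaves the download to the demo.
resolve_ckpt() {
    local dir="$1" name="$2"
    if [ -f "$dir/$name" ]; then
        printf '%s\n' "$dir/$name"
    else
        printf '%s\n' "$name"
    fi
}

CKPT_DIR="${CKPT_DIR:-$HOME/nvila-awq}"   # assumed location of local copies
QUANT_PATH="$(resolve_ckpt "$CKPT_DIR" NVILA-8B-w4-g128-awq-v2.pt)"
ACT_SCALE_PATH="$(resolve_ckpt "$CKPT_DIR" NVILA-8B-VT-smooth-scale.pt)"

# Print the command for inspection; drop the leading "echo" to run it
# from inside llm-awq/tinychat.
echo python nvila_demo.py --model-path PATH/TO/NVILA \
    --quant_path "$QUANT_PATH" \
    --media PATH/TO/ANY/IMAGES/VIDEOS \
    --act_scale_path "$ACT_SCALE_PATH" \
    --all --chunk --model_type nvila
```

Set `CKPT_DIR` to the directory that holds your `.pt` files before running the script; if a file is missing there, the corresponding argument keeps the bare name used in the original command.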