---
base_model: xtuner/llava-llama-3-8b-v1_1-transformers
library_name: gguf
quantized_by: city96
tags:
- image-text-to-text
---

This is an imatrix GGUF conversion of [xtuner/llava-llama-3-8b-v1_1-transformers](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers).

It is mainly intended to be used as the text encoder for Hunyuan Video, but it can also be used for vision tasks together with the [mmproj](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-gguf/blob/main/llava-llama-3-8b-v1_1-mmproj-f16.gguf) file from the xtuner GGUF repository.
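
For vision use, something like the following llama-cpp-python sketch should work. This is an illustrative assumption, not part of this repo: the file names are placeholders, and `Llava15ChatHandler` is used as an example (the best-matching chat handler/prompt template for a Llama 3 based LLaVA model may differ).

```python
# Hedged sketch: vision inference via llama-cpp-python with the xtuner mmproj file.
# All file paths are placeholders for locally downloaded GGUF files.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The mmproj file supplies the vision projector.
chat_handler = Llava15ChatHandler(clip_model_path="llava-llama-3-8b-v1_1-mmproj-f16.gguf")

llm = Llama(
    model_path="llava-llama-3-8b-v1_1-Q4_K_M.gguf",  # any quant from this repo
    chat_handler=chat_handler,
    n_ctx=4096,  # leave room for the image embeddings
)

response = llm.create_chat_completion(messages=[
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "file:///path/to/image.png"}},
        {"type": "text", "text": "Describe this image."},
    ]},
])
print(response["choices"][0]["message"]["content"])
```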

The imatrix dataset used was [`calibration_datav3.txt`](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8) by [Bartowski](https://huggingface.co/bartowski); the importance matrix was applied to all quants below Q6_K. It was tested against a wikitext-based imatrix and against no imatrix at all, and it outperformed both.
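
For reference, the standard llama.cpp workflow for producing imatrix quants looks roughly like the sketch below. This is illustrative only: the binary locations, file names, and the Q4_K_M target are assumptions, not the exact commands used for this repo.

```python
# Hedged sketch of the usual llama.cpp imatrix quantization workflow.
# Binary and file names are placeholders.
import subprocess

# 1. Measure activation importance on the calibration text.
subprocess.run([
    "./llama-imatrix",
    "-m", "llava-llama-3-8b-v1_1-f16.gguf",  # full-precision GGUF conversion
    "-f", "calibration_datav3.txt",          # Bartowski's calibration dataset
    "-o", "imatrix.dat",
], check=True)

# 2. Quantize with the importance matrix (applied here to quants below Q6_K).
subprocess.run([
    "./llama-quantize",
    "--imatrix", "imatrix.dat",
    "llava-llama-3-8b-v1_1-f16.gguf",
    "llava-llama-3-8b-v1_1-Q4_K_M.gguf",
    "Q4_K_M",
], check=True)
```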

Note that the `vocab_size` differs between the [transformers](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers) repository (128320) and the [hf](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-hf) repository (128256). This conversion uses the former, as that is what the official Hunyuan Video code uses.
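
If you need to confirm which vocabulary a given file was converted with, the vocab dimension can be read off the token embedding tensor using the `gguf` Python package (a sketch; the file name is a placeholder):

```python
# Sketch: inspect the vocab dimension of a GGUF file.
from gguf import GGUFReader

reader = GGUFReader("llava-llama-3-8b-v1_1-Q4_K_M.gguf")
emb = next(t for t in reader.tensors if t.name == "token_embd.weight")
# For this conversion the shape should contain 128320, not 128256.
print(emb.shape)
```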

*IQ quants will be slow in ComfyUI due to using a numpy fallback.*