---
license: mit
metrics:
- bleu
- rouge
- meteor
- bertscore
base_model:
- liuhaotian/llava-v1.5-7b
pipeline_tag: visual-question-answering
---
|
|
|
# visual-qa-tem Model Card |
|
|
|
## Model details |
|
|
|
**Base model:**

We fine-tune LLaVA-v1.5-7b on our custom data.

See: [liuhaotian/llava-v1.5-7b](https://huggingface.co/liuhaotian/llava-v1.5-7b)
|
|
|
**Paper or resources for more information:** |
|
|
|
Our source code is published at: https://github.com/SmartLab-Roy/visual-qa-tem.git
|
|
|
### Download Model |
|
```python
from huggingface_hub import snapshot_download

# Download the model to a local directory
model_path = snapshot_download(
    repo_id="LabSmart/visual-qa-tem",
    cache_dir="./models",  # Local cache directory
    resume_download=True,
)

print(f"Model downloaded to: {model_path}")
```
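
As an optional sanity check (not part of the original instructions), you can list the contents of the returned directory to confirm the snapshot downloaded completely:

```python
import os

# List the downloaded files; a complete LLaVA-style snapshot typically
# contains config.json, tokenizer files, and weight shards
# (*.safetensors or *.bin).
for name in sorted(os.listdir(model_path)):
    print(name)
```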
|
### Quick Start |
|
|
|
Refer to [LLaVA](https://github.com/haotian-liu/LLaVA.git) for environment setup, then run CLI inference:
|
|
|
```bash
python -m llava.serve.cli \
    --model-path "model_path from the download output" \
    --image-file "path/to/your/tem_image.jpg" \
    --load-4bit
```
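
If you prefer to run inference from Python instead of the CLI, the upstream LLaVA repository also exposes an `eval_model` helper. Below is a minimal sketch following the usage shown in the LLaVA README; it assumes the `llava` package from the repository above is installed, `model_path` is the directory printed by the download step, and the prompt and image path are placeholders:

```python
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

# model_path: local directory returned by snapshot_download above
prompt = "Describe the microstructure shown in this TEM image."  # placeholder question
image_file = "path/to/your/tem_image.jpg"  # placeholder image path

# eval_model expects an argparse-like namespace; this mirrors the
# example in the upstream LLaVA README.
args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": image_file,
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)
```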