---
base_model:
  - Qwen/Qwen2.5-VL-3B-Instruct
license: mit
pipeline_tag: video-text-to-text
library_name: transformers
---

This repository contains the model described in the paper "Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence".

Project page: https://diankun-wu.github.io/Spatial-MLLM/

Code: https://github.com/diankun-wu/Spatial-MLLM