Diankun/Spatial-MLLM-subset-sft

metadata

base_model:
  - Qwen/Qwen2.5-VL-3B-Instruct
license: mit
pipeline_tag: video-text-to-text
library_name: transformers

This repository contains the model described in the paper "Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence".

Project page: https://diankun-wu.github.io/Spatial-MLLM/

Code: https://github.com/diankun-wu/Spatial-MLLM
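
The checkpoint files can be fetched locally before running the scripts from the code repository. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the repository id is `Diankun/Spatial-MLLM-subset-sft` as shown on this page. The helper name `download_checkpoint` is hypothetical. This card does not state how the model is loaded for inference, so refer to the code repository for that step.

```python
# Repository id as it appears on this page (assumption: unchanged on the Hub).
REPO_ID = "Diankun/Spatial-MLLM-subset-sft"


def download_checkpoint(local_dir=None):
    """Download every file in the model repo and return the local directory path.

    Hypothetical helper for illustration; it only wraps
    huggingface_hub.snapshot_download.
    """
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    # With local_dir=None the files are stored in the default Hugging Face cache.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_checkpoint()` returns the directory holding the downloaded files, which can then be passed to whatever loading or inference script the code repository provides.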