Apply for community grant: Academic project (gpu)
#1, opened by pinoo
SkyCaptioner-V1 is a structural video captioning model that generates high-quality structural descriptions of video data. It combines specialized sub-expert models and multimodal large language models (MLLMs) with human annotations to capture the professional film-related details that general-purpose captioners miss.