SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction
[GitHub] [Benchmark] [Homepage] [Paper]
Highlights
- We introduce Segment Concept (SeC), a concept-driven video object segmentation framework that uses Large Vision-Language Models (LVLMs) to build robust, object-centric representations.
- SeC dynamically balances semantic reasoning with feature matching, adapting its computational effort to scene complexity.
- We propose the Semantic Complex Scenarios Video Object Segmentation (SeCVOS) benchmark, designed to evaluate segmentation in challenging scenarios.
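To make the second highlight concrete, here is a toy sketch of complexity-gated inference. It is not the SeC implementation: the change measure (mean absolute pixel difference), the `threshold` value, and the names `mean_abs_diff` and `choose_paths` are all assumptions made for illustration. The only idea taken from the highlights is that the expensive semantic path runs when the scene changes substantially, and cheap feature matching runs otherwise.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def choose_paths(frames: List[Sequence[float]], threshold: float = 0.25) -> List[str]:
    """Pick a processing path for every frame after the first.

    Hypothetical rule: a large change from the previous frame selects the
    costly 'semantic' path; a small change selects the cheap 'matching' path.
    """
    decisions = []
    for prev, cur in zip(frames, frames[1:]):
        change = mean_abs_diff(prev, cur)
        decisions.append("semantic" if change > threshold else "matching")
    return decisions


if __name__ == "__main__":
    # Three tiny two-pixel "frames": a small change, then an abrupt one.
    video = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]]
    print(choose_paths(video))
```

In this toy setup only the abrupt transition triggers the semantic path, so the costly step is spent where simple matching is most likely to fail.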
SeC Performance
Model | SA-V val | SA-V test | LVOS v2 val | MOSE val | DAVIS 2017 val | YTVOS 2019 val | SeCVOS |
---|---|---|---|---|---|---|---|
SAM 2.1 | 78.6 | 79.6 | 84.1 | 74.5 | 90.6 | 88.7 | 58.2 |
SAMURAI | 79.8 | 80.0 | 84.2 | 72.6 | 89.9 | 88.3 | 62.2 |
SAM2.1Long | 81.1 | 81.2 | 85.9 | 75.2 | 91.4 | 88.7 | 62.3 |
SeC (Ours) | 82.7 | 81.7 | 86.5 | 75.3 | 91.3 | 88.6 | 70.0 |
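The margins in the table can be checked with a short script. The sketch below copies the scores from the table above and computes, per benchmark, the difference between SeC and the strongest of the three baselines; the function name `margin_over_best_baseline` is a hypothetical helper, not part of any released code.

```python
from typing import Dict

# Scores copied verbatim from the performance table.
SCORES: Dict[str, Dict[str, float]] = {
    "SAM 2.1": {"SA-V val": 78.6, "SA-V test": 79.6, "LVOS v2 val": 84.1,
                "MOSE val": 74.5, "DAVIS 2017 val": 90.6,
                "YTVOS 2019 val": 88.7, "SeCVOS": 58.2},
    "SAMURAI": {"SA-V val": 79.8, "SA-V test": 80.0, "LVOS v2 val": 84.2,
                "MOSE val": 72.6, "DAVIS 2017 val": 89.9,
                "YTVOS 2019 val": 88.3, "SeCVOS": 62.2},
    "SAM2.1Long": {"SA-V val": 81.1, "SA-V test": 81.2, "LVOS v2 val": 85.9,
                   "MOSE val": 75.2, "DAVIS 2017 val": 91.4,
                   "YTVOS 2019 val": 88.7, "SeCVOS": 62.3},
    "SeC (Ours)": {"SA-V val": 82.7, "SA-V test": 81.7, "LVOS v2 val": 86.5,
                   "MOSE val": 75.3, "DAVIS 2017 val": 91.3,
                   "YTVOS 2019 val": 88.6, "SeCVOS": 70.0},
}


def margin_over_best_baseline(scores: Dict[str, Dict[str, float]],
                              ours: str = "SeC (Ours)") -> Dict[str, float]:
    """Return our score minus the best baseline score for each benchmark."""
    margins = {}
    for bench, our_score in scores[ours].items():
        best = max(row[bench] for name, row in scores.items() if name != ours)
        margins[bench] = round(our_score - best, 1)
    return margins


if __name__ == "__main__":
    for bench, delta in margin_over_best_baseline(SCORES).items():
        print(f"{bench}: {delta:+.1f}")
```

Run on these numbers, the largest margin is on SeCVOS (+7.7 over SAM2.1Long), while on DAVIS 2017 val and YTVOS 2019 val SeC trails the best baseline by 0.1.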
Citation
If you find this project useful in your research, please consider citing:
@article{zhang2025sec,
title = {SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction},
author = {Zhixiong Zhang and Shuangrui Ding and Xiaoyi Dong and Songxin He and Jianfan Lin and Junsong Tang and Yuhang Zang and Yuhang Cao and Dahua Lin and Jiaqi Wang},
journal = {arXiv preprint arXiv:2507.15852},
year = {2025}
}
Model tree for OpenIXCLab/SeC-4B
Base model
facebook/sam2.1-hiera-large