---
base_model:
- microsoft/Phi-3-vision-128k-instruct
license: mit
pipeline_tag: video-text-to-text
---


# Model Card for VideoChat-Online

This model card describes VideoChat-Online, the model introduced in "Online Video Understanding: OVBench and VideoChat-Online".

## Model Details

### 🛠Usage
For usage instructions, see the [demo](https://github.com/MCG-NJU/VideoChat-Online#-demo) in the GitHub repository.

### 📃Model Sources

- **Repository:** [VideoChat-Online](https://github.com/MCG-NJU/VideoChat-Online)
- **Paper:** [2501.00584](https://arxiv.org/abs/2501.00584v1)

## ✏️Citation

If you find this work useful for your research, please consider citing VideoChat-Online. Your acknowledgement helps us continue contributing resources to the research community.

```bibtex
@article{huang2024online,
  title={Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method},
  author={Huang, Zhenpeng and Li, Xinhao and Li, Jiaqi and Wang, Jing and Zeng, Xiangyu and Liang, Cheng and Wu, Tao and Chen, Xi and Li, Liang and Wang, Limin},
  journal={arXiv preprint arXiv:2501.00584},
  year={2024}
}
```