OpenGVLab/InternVL3_5-30B-A3B-HF
Image-Text-to-Text • 31B • Updated • 6.58k • 6
Computer Vision
RIVER: A Real-Time Interaction Benchmark for Video LLMs
InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision