ST-Think
openinterx/ST-R1-mcq • 8B • Updated Mar 17 • 68
openinterx/Ego-ST-video • Viewer • Updated Mar 15 • 803 • 19 • 1
openinterx/Ego-ST-bench • Viewer • Updated Mar 29 • 93 • 24 • 1
ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos • Paper • 2503.12542 • Published Mar 16 • 1
UGC-VideoCap
openinterx/UGC-VideoCap • Updated Aug 20 • 98
openinterx/UGC-VideoCaptioner • Video-Text-to-Text • 6B • Updated Jul 19 • 142 • 1
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks • Paper • 2507.11336 • Published Jul 15 • 5