Vision LLM - a SamoXXX Collection

Vision LLM

updated May 5, 2025

A collection of the best vision LLMs, gathered to study and learn from.

rhymes-ai/Aria

Image-Text-to-Text • 25B • Updated Apr 23, 2025 • 34.2k • 637
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 355 • 1.7k
jadechoghari/Ferret-UI-Gemma2b

Image-Text-to-Text • 3B • Updated Oct 18, 2024 • 207 • 50
jadechoghari/Ferret-UI-Llama8b

Image-Text-to-Text • 8B • Updated Jan 8, 2025 • 217 • 68
gpt-omni/mini-omni2

Any-to-Any • Updated Oct 24, 2024 • 69 • 279
mPLUG/DocOwl2

Image-Text-to-Text • 9B • Updated Sep 27, 2024 • 222 • 113
google/siglip-so400m-patch16-256-i18n

Zero-Shot Image Classification • 1B • Updated Nov 18, 2024 • 581 • 30
openvla/openvla-7b

Image-Text-to-Text • 8B • Updated Sep 16, 2024 • 1.42M • 161
NexaAI/OmniVLM-968M

0.5B • Updated Aug 20, 2025 • 1.76k • 529
Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 2.29M • 1.42k
ByteDance-Seed/UI-TARS-7B-SFT

Image-Text-to-Text • 8B • Updated Jan 25, 2025 • 829 • 177
moonshotai/Kimi-VL-A3B-Instruct

Image-Text-to-Text • 16B • Updated Jul 30, 2025 • 101k • 246
reducto/RolmOCR

Image-to-Text • 8B • Updated Apr 2, 2025 • 3.42k • 570