Hugging Face
Models
11,105
Sort: Trending
Active filters: image-text-to-text
openbmb/MiniCPM-V-2_6 • Image-Text-to-Text • Updated Jan 15 • 63.6k • 976
Qwen/Qwen2-VL-2B-Instruct • Image-Text-to-Text • Updated Jan 12 • 536k • 424
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 684 • 1.67k
google/cxr-foundation • Image Classification • Updated Feb 20 • 80 • 76
osunlp/UGround-V1-7B • Image-Text-to-Text • Updated Apr 16 • 2.14k • 16
ByteDance-Seed/UI-TARS-72B-DPO • Image-Text-to-Text • Updated Jan 25 • 6.07k • 132
remyxai/SpaceQwen2.5-VL-3B-Instruct • Image-Text-to-Text • Updated 3 days ago • 89.7k • 10
ibm-granite/granite-vision-3.2-2b • Image-Text-to-Text • Updated Apr 14 • 5.6k • 95
google/gemma-3-4b-pt • Image-Text-to-Text • Updated Mar 21 • 56.4k • 84
google/gemma-3-12b-it-qat-q4_0-gguf • Image-Text-to-Text • Updated Apr 11 • 101k • 139
Qwen/Qwen2.5-VL-32B-Instruct • Image-Text-to-Text • Updated Apr 14 • 507k • 381
Tesslate/Synthia-S1-27b • Image-Text-to-Text • Updated Apr 9 • 664 • 77
moonshotai/Kimi-VL-A3B-Thinking • Image-Text-to-Text • Updated Apr 20 • 53.4k • 410
Skywork/SkyCaptioner-V1 • Video-Text-to-Text • Updated Apr 25 • 827 • 41
soob3123/amoral-gemma3-12B-v2-qat • Text Generation • Updated Apr 20 • 461 • 17
meta-llama/Llama-Guard-4-12B • Image-Text-to-Text • Updated Apr 29 • 62.8k • 41
moondream/moondream-2b-2025-04-14-4bit • Image-Text-to-Text • Updated 18 days ago • 10.6k • 48
mlabonne/gemma-3-12b-it-qat-abliterated • Image-Text-to-Text • Updated 10 days ago • 105 • 7
microsoft/GUI-Actor-3B-Qwen2.5-VL • Image-Text-to-Text • Updated about 7 hours ago • 3
prithivMLmods/docscopeOCR-7B-050425-exp-GGUF • Image-Text-to-Text • Updated 5 days ago • 197 • 3
microsoft/git-base • Image-to-Text • Updated Apr 24, 2023 • 377k • 98
liuhaotian/llava-v1.5-7b • Image-Text-to-Text • Updated May 8, 2024 • 474k • 468
llava-hf/LLaVA-NeXT-Video-7B-hf • Video-Text-to-Text • Updated Jan 27 • 157k • 101
microsoft/Florence-2-large • Image-Text-to-Text • Updated Dec 8, 2024 • 758k • 1.56k
OpenGVLab/InternVL2-1B • Image-Text-to-Text • Updated Mar 25 • 38.1k • 73
5CD-AI/Vintern-1B-v2 • Image-Text-to-Text • Updated Jan 17 • 2.9k • 72
Qwen/Qwen2-VL-72B-Instruct • Image-Text-to-Text • Updated Feb 6 • 27.3k • 303
meta-llama/Llama-3.2-11B-Vision • Image-Text-to-Text • Updated Sep 27, 2024 • 29.9k • 519
meta-llama/Llama-3.2-90B-Vision-Instruct • Image-Text-to-Text • Updated Mar 4 • 13.4k • 342
unsloth/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated Dec 10, 2024 • 28.2k • 81