Hugging Face
Models
4,857
Sort: Trending
Active filters:
image-text-to-text
unsloth/QVQ-72B-Preview • Image-Text-to-Text • 73B • Updated Dec 25, 2024 • 12 • 3
osunlp/UGround-V1-7B • Image-Text-to-Text • 8B • Updated Apr 16 • 2.82k • 19
OS-Copilot/OS-Genesis-4B-WA • Image-Text-to-Text • 4B • Updated May 5 • 6 • 1
TianHuiLab/Falcon-Single-Instruction-Large • Image-Text-to-Text • Updated Mar 21 • 8
ByteDance/Sa2VA-4B • Image-Text-to-Text • 4B • Updated Mar 19 • 3.28k • 79
ByteDance/Sa2VA-1B • Image-Text-to-Text • 1B • Updated Mar 19 • 800 • 25
5CD-AI/Vintern-1B-v3_5 • Image-Text-to-Text • 0.9B • Updated 29 days ago • 381k • 80
MiniMaxAI/MiniMax-VL-01 • Image-Text-to-Text • 456B • Updated 24 days ago • 28.1k • 276
TucanoBR/ViTucano-2b8-v1 • Image-Text-to-Text • 3B • Updated 3 days ago • 85 • 7
ByteDance-Seed/UI-TARS-7B-SFT • Image-Text-to-Text • 8B • Updated Jan 25 • 14.4k • 176
OpenVINO/Phi-3.5-vision-instruct-int8-ov • Image-Text-to-Text • Updated Mar 18 • 1.75k • 1
ByteDance-Seed/UI-TARS-7B-DPO • Image-Text-to-Text • 8B • Updated Jan 25 • 3.67k • 219
ByteDance-Seed/UI-TARS-72B-DPO • Image-Text-to-Text • 73B • Updated Jan 25 • 1.77k • 134
lmstudio-community/UI-TARS-72B-DPO-GGUF • Image-Text-to-Text • 73B • Updated Jan 23 • 60 • 3
jarvisvasu/Qwen2.5-VL-3B-Instruct-4bit • Image-Text-to-Text • 2B • Updated Jan 29 • 865 • 4
AIDC-AI/Ovis2-16B • Image-Text-to-Text • 16B • Updated Feb 27 • 27.3k • 99
Fancy-MLLM/R1-Onevision-7B • Image-Text-to-Text • 8B • Updated Feb 25 • 1.28k • 41
Qwen/Qwen2.5-VL-3B-Instruct-AWQ • Image-Text-to-Text • 1B • Updated Apr 6 • 50.9k • 46
hustvl/mmMamba-linear • Image-Text-to-Text • 3B • Updated Feb 26 • 332 • 4
Qwen/Qwen2.5-VL-7B-Instruct-AWQ • Image-Text-to-Text • 3B • Updated Apr 6 • 255k • 81
huihui-ai/Qwen2.5-VL-7B-Instruct-abliterated • Image-Text-to-Text • 8B • Updated Apr 1 • 1.73k • 18
rp-yu/Qwen2-VL-2b-VPT-Seg • Image-Text-to-Text • 3B • Updated 13 days ago • 17 • 1
microsoft/Magma-8B • Image-Text-to-Text • 9B • Updated May 13 • 13k • 402
NAMAA-Space/Qari-OCR-0.1-VL-2B-Instruct • Image-Text-to-Text • Updated Jun 10 • 1.25k • 33
CohereLabs/aya-vision-8b • Image-Text-to-Text • 9B • Updated 30 days ago • 24.8k • 302
ggml-org/gemma-3-27b-it-GGUF • Image-Text-to-Text • 27B • Updated May 21 • 2.17k • 22
unsloth/gemma-3-12b-pt • Image-Text-to-Text • 12B • Updated Jun 3 • 29.8k • 5
bartowski/google_gemma-3-4b-it-GGUF • Image-Text-to-Text • 4B • Updated Mar 22 • 11.4k • 26
lmstudio-community/gemma-3-12b-it-GGUF • Image-Text-to-Text • 12B • Updated Mar 12 • 297k • 28
DevQuasar/google.gemma-3-4b-pt-GGUF • Image-Text-to-Text • 4B • Updated Mar 12 • 141 • 1