Susant-Achary's Collections
🎨➡️✍️ Image-to-Text
Updated 20 days ago
OCR, captioning, and visual QA models that turn pure images into descriptive or structured text.
Salesforce/blip-image-captioning-base • Image-to-Text • Updated Feb 3 • 1.99M downloads • 795 likes
Salesforce/blip-image-captioning-large • Image-to-Text • 0.5B params • Updated Feb 3 • 1.12M downloads • 1.42k likes
nlpconnect/vit-gpt2-image-captioning • Image-to-Text • Updated Feb 27, 2023 • 1.25M downloads • 916 likes
microsoft/trocr-base-handwritten • Image-to-Text • 0.3B params • Updated Feb 11 • 334k downloads • 450 likes
PaddlePaddle/PP-OCRv5_server_det • Image-to-Text • Updated Jul 22 • 196k downloads • 43 likes
breezedeus/pix2text-mfr • Image-to-Text • Updated May 5, 2024 • 140k downloads • 43 likes
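
As a rough illustration of how the captioning and OCR entries in this collection might be tried out, here is a minimal sketch using the `transformers` `pipeline` API. It is not taken from the collection itself. It assumes `transformers`, `torch`, and `Pillow` are installed and that the installed `transformers` version still provides the `"image-to-text"` task. The helper names (`TRANSFORMERS_MODELS`, `models_for`, `image_to_text`), the informal task labels, and the `photo.jpg` path are hypothetical. The two remaining entries (PaddlePaddle/PP-OCRv5_server_det and breezedeus/pix2text-mfr) are left out because they are normally run through their own toolkits rather than this pipeline.

```python
from typing import Dict, List

# Model IDs copied from the collection. The task labels are informal
# annotations added for this sketch, not metadata from the Hub.
TRANSFORMERS_MODELS: Dict[str, str] = {
    "Salesforce/blip-image-captioning-base": "captioning",
    "Salesforce/blip-image-captioning-large": "captioning",
    "nlpconnect/vit-gpt2-image-captioning": "captioning",
    "microsoft/trocr-base-handwritten": "handwriting OCR",
}


def models_for(task: str) -> List[str]:
    """Return the listed model IDs whose informal label equals `task`."""
    return [model for model, label in TRANSFORMERS_MODELS.items() if label == task]


def image_to_text(image_path: str, model_id: str) -> str:
    """Run one image through an image-to-text pipeline and return the text.

    Assumption: the installed transformers version supports the
    "image-to-text" task. Weights are downloaded from the Hub on first use.
    """
    # Imported lazily so the helpers above work without the heavy dependency.
    from transformers import pipeline

    pipe = pipeline("image-to-text", model=model_id)
    outputs = pipe(image_path)  # a list of dicts with a "generated_text" key
    return outputs[0]["generated_text"]


# Example call (needs network access and a real image file):
# print(image_to_text("photo.jpg", "Salesforce/blip-image-captioning-base"))
```

The first call for a given model downloads its weights, so the smaller `-base` checkpoint is probably the easier one to start with.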