moondream2
a tiny vision language model
Translate text between 200 languages
Extract and refine foreground from images
Generate images from text prompts
Generate animated videos from text prompts
Solo Piano Audio to MIDI Transcription
Transform empty rooms into designed spaces based on text prompts
Stable Diffusion Finetuned Version
Enhance and upscale images with clarity
Text to Audio (SFX) Generator
Analyze images to identify tags and ratings
Track, rank and evaluate open Arabic LLMs and chatbots
Chat with an AI that understands text and images
Vocal and background audio separator
Calculate VRAM requirements for LLM models
Scrape and summarize web content
olmocr / nanonets ocr / qwen2vl ocr / aya vision / rolmocr
Explore model performance with interactive leaderboards
Transcribe voice to text
Example of a Key Performance Indicator (KPI) dashboard
Analyze images to generate captions, detect objects, or perform OCR
Generate sound effects for silent videos
Upscale images by 4x
Enhance images with high-resolution quality and HDR effects