MM Grounding DINO Collection See: https://github.com/huggingface/transformers/pull/37925 • 8 items • Updated Jun 26 • 3
Article Mixture of Experts Explained By osanseviero and 5 others • Dec 11, 2023 • 809
Article Introducing Command A Vision: Multimodal AI built for Business By CohereLabs and 3 others • 8 days ago • 61
Article Bamba: Inference-Efficient Hybrid Mamba2 Model By rganti and 28 others • Dec 18, 2024 • 58
Meta CLIP Collection Scaling CLIP data with transparent training distribution from an end-to-end pipeline. • 7 items • Updated 18 days ago • 4
ARPO Collection The official datasets and model checkpoints of ARPO • 9 items • Updated 10 days ago • 3
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 64
Medical & Clinical NER Collection State-of-the-art medical, biomedical, and clinical Named Entity Recognition models • 389 items • Updated 21 days ago • 24
Article Introducing ColQwen-Omni: Retrieve in every modality By manu and 4 others • 22 days ago • 63
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published 22 days ago • 70
VisionThink Collection Efficient Reasoning Vision Language Model • 7 items • Updated 21 days ago • 5
Article OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models By nvidia and 3 others • 21 days ago • 47
Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models Paper • 2507.07104 • Published 30 days ago • 44
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 2 items • Updated 27 days ago • 115
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 7 items • Updated 28 days ago • 272
Article Upskill your LLMs with Gradio MCP Servers By freddyaboulton • about 1 month ago • 18
QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation Paper • 2506.02295 • Published Jun 2 • 5
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • about 1 month ago • 638