samsam55's Collections

Visual Multi Modal LLM

updated about 16 hours ago

  • NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

    Paper • 2510.08565 • Published 9 days ago • 19

  • Detect Anything via Next Point Prediction

    Paper • 2510.12798 • Published 4 days ago • 38

  • PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

    Paper • 2510.14528 • Published 2 days ago • 28