YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

Ministral-3-3B-Instruct-2512

Run Ministral-3-3B on Qualcomm NPU with NexaSDK.

Quickstart

Install nexaSDK and create a free account at sdk.nexa.ai

Activate your device with your access token:

nexa config set license '<access_token>'

Run the model locally in one line:

nexa infer NexaAI/Ministral-3-3B-npu

Model Description

Ministral-3-3B-Instruct-2512 is the instruction-tuned variant of Mistral AI’s smallest Ministral 3 model: a compact multimodal language model combining a ~3.4B-parameter language core with a 0.4B-parameter vision encoder.
It is post-trained in FP8 for instruction-following, making it well-suited for chat-style agents, tool use, and grounded reasoning on both text and images.
With a large 256k context window and efficient edge-oriented design, it targets real-time use on GPUs and other resource-constrained hardware.

Features

  • Multimodal (vision + text): Understands and reasons over images alongside text in a single conversation.
  • Instruction-tuned: Optimized for following natural-language instructions, chat, and assistant-style workflows.
  • Agentic capabilities: Native support for function calling and structured JSON-style outputs for tool and API orchestration.
  • Large context window: Up to 256k tokens for long documents, multi-step workflows, and complex sessions.
  • Edge-optimized FP8 weights: FP8 checkpoint designed for efficient deployment and serving, including on a single modern GPU.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
  • Part of the Ministral 3 family: Seamlessly aligned with 3B/8B/14B base, instruct, and reasoning variants for scalable deployments.

Use Cases

  • Vision + language assistants
    • Image captioning and explanation (UI screenshots, photos, diagrams)
    • Multimodal Q&A (e.g., “describe this chart and summarize its implications”)
  • Lightweight agents and tools
    • Function-calling workflows (retrieval, calculators, external APIs)
    • JSON-structured responses for downstream automation
  • Text understanding & generation
    • Classification, tagging, routing, and extraction from long documents
    • Short-form copywriting, drafting, and rewriting across multiple languages
  • Edge & low-resource deployments
    • On-device or near-edge assistants where latency, context length, and cost matter
    • Local/private workloads that benefit from a small yet capable multimodal model

Inputs and Outputs

Inputs

  • Text-only prompts
    • Single-turn or multi-turn chat-style conversations (system, user, assistant roles).
    • Long-context inputs up to the model’s context limit (e.g., documents, logs, transcripts).
  • Multimodal prompts
    • One or more images (e.g., URLs or image tensors) combined with text.
  • Structured tool schemas
    • Function / tool definitions for agentic workflows (JSON schemas describing functions and parameters).

Outputs

  • Generated text
    • Answers, explanations, step-by-step reasoning, summaries, and creative content.
  • Multimodal-aware responses
    • Text grounded in the provided images (descriptions, comparisons, localized details).
  • Structured tool calls
    • JSON-like tool call objects for function execution and programmatic integration.
  • Logits / probabilities (advanced)
    • For users accessing the raw model via low-level APIs, token-level scores for custom decoding or research.

License

This repo is licensed under the Creative Commons Attribution–NonCommercial 4.0 (CC BY-NC 4.0) license, which allows use, sharing, and modification only for non-commercial purposes with proper attribution. All NPU-related models, runtimes, and code in this project are protected under this non-commercial license and cannot be used in any commercial or revenue-generating applications. Commercial licensing or enterprise usage requires a separate agreement. For inquiries, please contact dev@nexa.ai

Downloads last month
9
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including NexaAI/Ministral-3-3B-npu