<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>Top Open-Source Small Language Models</title>
    <link rel="stylesheet" href="styles.css"/>
</head>
<body>

<h1>Top Open-Source Small Language Models for Generative AI Applications</h1>

<p>
    Small Language Models (SLMs) are language models that contain, at most, a few billion parameters, significantly
    fewer than Large Language Models (LLMs), which can have tens or hundreds of billions, or even trillions, of
    parameters. SLMs are well suited for resource-constrained environments, as well as on-device and real-time
    generative AI applications. Many of them can run locally on a laptop using tools like LM Studio or Ollama. These
    models are typically derived from larger models using techniques such as quantization and distillation. Below,
    several well-developed SLMs are introduced.
</p>
<p>
    Note: All the models mentioned here are open source. However, for details regarding experimental use, commercial
    use, redistribution, and other terms, please refer to each model's license documentation.
</p>
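<p>
    As a quick, hedged illustration of local use: once a model has been pulled into Ollama, it can be queried from
    Python through the official <code>ollama</code> client package. The model tag below is an assumption; substitute
    any SLM you have pulled locally.
</p>
<pre><code># Minimal sketch: chatting with a locally served SLM via Ollama.
# Assumes `pip install ollama`, a running Ollama instance, and a pulled model,
# e.g. `ollama pull gemma3:1b` (the tag is illustrative, not prescriptive).
import ollama

response = ollama.chat(
    model="gemma3:1b",  # assumed tag; use any model available locally
    messages=[{"role": "user", "content": "Summarize what an SLM is in one sentence."}],
)
print(response["message"]["content"])
</code></pre>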

<h2>Phi 4 Collection by Microsoft</h2>
<p>
    This collection features a range of small language models, including reasoning models, ONNX- and GGUF-compatible
    formats, and multimodal models. The base model in the collection has 14 billion parameters, while the smallest
    models have 3.84 billion. Strategic use of synthetic data during training has led to improved performance compared
    to its teacher model (primarily GPT-4). Currently, the collection includes three versions of reasoning-focused SLMs,
    making it one of the strongest open options for reasoning tasks.
</p>
<p>
    👉 License: <a href="https://choosealicense.com/licenses/mit/" target="_blank">MIT</a><br>
    👉 <a href="https://huggingface.co/collections/microsoft/phi-4-677e9380e514feb5577a40e4" target="_blank">Collection
    on Hugging Face</a><br>
    👉 <a href="https://arxiv.org/abs/2412.08905" target="_blank">Technical Report</a>
</p>
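<p>
    As a hedged sketch, one of the smaller instruct models in the collection can be loaded with the Hugging Face
    <code>transformers</code> text-generation pipeline. The model ID is taken from the collection; even at roughly
    3.8B parameters, the weights require several gigabytes of memory.
</p>
<pre><code># Sketch: text generation with a small Phi-4 model via transformers.
# Assumes `pip install transformers torch` and a recent transformers release.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/Phi-4-mini-instruct",  # model ID from the Phi-4 collection
)

messages = [{"role": "user", "content": "Explain quantization in two sentences."}]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
</code></pre>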

<h2>Gemma 3 Collection by Google</h2>
<p>
    This collection features multiple versions, including Image-to-Text, Text-to-Text, and Image-and-Text-to-Text
    models, available in both quantized and GGUF formats. The models vary in size, with 1, 4.3, 12.2, and 27.4 billion
    parameters. Two specialized variants have been developed for specific applications: TxGemma, optimized for
    therapeutic development, and ShieldGemma, designed for moderating text and image content.
</p>
<p>
    👉 License: <a href="https://ai.google.dev/gemma/terms" target="_blank">Gemma</a><br>
    👉 <a href="https://huggingface.co/collections/google/gemma-3-release-67c6c6f89c4f76621268bb6d" target="_blank">Collection
    on Hugging Face</a><br>
    👉 <a href="https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf" target="_blank">Technical
    Report</a><br>
    👉 <a href="https://huggingface.co/collections/google/shieldgemma-67d130ef8da6af884072a789" target="_blank">ShieldGemma
    on Hugging Face</a><br>
    👉 <a href="https://huggingface.co/collections/google/txgemma-release-67dd92e931c857d15e4d1e87" target="_blank">TxGemma
    on Hugging Face</a>
</p>
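<p>
    A hedged sketch of the multimodal variants: recent <code>transformers</code> releases expose an
    image-text-to-text pipeline that the instruct checkpoints plug into. The image URL below is a placeholder, and the
    Gemma weights are gated behind license acceptance on Hugging Face.
</p>
<pre><code># Sketch: describing an image with a multimodal Gemma 3 model.
# Assumes a recent transformers release and access to the gated checkpoint.
from transformers import pipeline

vlm = pipeline("image-text-to-text", model="google/gemma-3-4b-it")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder
        {"type": "text", "text": "Describe this image in one sentence."},
    ],
}]
result = vlm(text=messages, max_new_tokens=64)
print(result[0]["generated_text"][-1]["content"])
</code></pre>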

<h2>Mistral Models</h2>
<p>
    Mistral AI is a France-based AI startup and one of the pioneers in releasing open-source language models. Its
    current product lineup includes three compact models: Mistral Small 3.1, Pixtral 12B, and Mistral NeMo. All of them
    are released under the <a href="https://www.apache.org/licenses/LICENSE-2.0" target="_blank">Apache 2.0 license</a>.
</p>

<p>
    <b>Mistral Small 3.1</b> is a multimodal and multilingual SLM with 24 billion parameters and a 128k-token context
    window. Currently, there are two versions: Base and Instruct.<br>
    👉 <a href="https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503" target="_blank">Base Version on Hugging
    Face</a><br>
    👉 <a href="https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503" target="_blank">Instruct Version on
    Hugging Face</a><br>
    👉 <a href="https://mistral.ai/news/mistral-small-3-1" target="_blank">Technical Report</a>
</p>

<p>
    <b>Pixtral 12B</b> is a natively multimodal model trained on interleaved image and text data, delivering strong
    performance on multimodal tasks and instruction following while maintaining state-of-the-art results on text-only
    benchmarks. It features a newly developed 400M-parameter vision encoder and a 12B-parameter multimodal decoder based
    on Mistral NeMo. The model supports variable image sizes, aspect ratios, and multiple images within a long context
    window of up to 128k tokens.<br>
    👉 <a href="https://huggingface.co/mistralai/Pixtral-12B-Base-2409" target="_blank">Pixtral-12B-Base-2409 on Hugging
    Face</a><br>
    👉 <a href="https://huggingface.co/mistralai/Pixtral-12B-2409" target="_blank">Pixtral-12B-2409 on Hugging
    Face</a><br>
    👉 <a href="https://mistral.ai/news/pixtral-12b" target="_blank">Technical Report</a>
</p>

<p>
    <b>Mistral NeMo</b> is a 12B model developed in collaboration with NVIDIA, featuring a large 128k-token context
    window and state-of-the-art reasoning, knowledge, and coding accuracy for its size.<br>
    👉 <a href="https://huggingface.co/mistralai/Mistral-Nemo-Instruct-FP8-2407" target="_blank">Model on Hugging
    Face</a><br>
    👉 <a href="https://mistral.ai/news/mistral-nemo" target="_blank">Technical Report</a>
</p>

<h2>Llama Models by Meta</h2>
<p>
    Meta is one of the leading contributors to open-source AI. In recent years, it has released several versions of its
    Llama models. The latest series is Llama 4, although all models in that collection are currently quite large.
    Smaller models may be introduced in future sub-versions of Llama 4, but for now, that hasn't happened. The most
    recent collection that includes smaller models is Llama 3.2. It features models with 1.24 billion and 3.21 billion
    parameters and 128k-token context windows. Additionally, there is a 10.6 billion-parameter multimodal version
    designed for Image-and-Text-to-Text tasks.
    The collection also includes small variants of Llama Guard: fine-tuned language models designed for prompt and
    response classification. They can detect unsafe prompts and responses, making them useful for implementing safety
    measures in LLM-based applications (see the sketch after the links below).
</p>
<p>
    👉 License: <a href="https://www.llama.com/llama3_2/license/" target="_blank">LLAMA 3.2 COMMUNITY LICENSE
    AGREEMENT</a><br>
    👉 <a href="https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf" target="_blank">Collection
    on Hugging Face</a><br>
    👉 <a href="https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/" target="_blank">Technical
    Paper</a>
</p>
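<p>
    A hedged sketch of how a small guard model might be used: the conversation is run through the model's chat
    template, and the completion indicates whether the content is safe, plus a hazard category when it is not. The
    model ID comes from the Llama 3.2 collection and is gated behind the license.
</p>
<pre><code># Sketch: classifying a user prompt with a small Llama Guard model.
# Assumes `pip install transformers torch` and access to the gated checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-1B"  # guard model from the Llama 3.2 collection
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

conversation = [{"role": "user", "content": "How do I write a phishing email?"}]
input_ids = tokenizer.apply_chat_template(conversation, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=20)
# The completion starts with "safe" or "unsafe" plus a hazard category code.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
</code></pre>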

<h2>Qwen 3 Collection by Alibaba</h2>
<p>
    The Chinese tech giant Alibaba is another major player in open-source AI. It releases its language models under the
    Qwen name. The latest version is Qwen 3, which includes both small and large models. The smaller models range in
    size, with parameter counts of 14.8 billion, 8.19 billion, 4.02 billion, 2.03 billion, and even 752 million. The
    collection also includes quantized and GGUF builds (see the sketch after the links below).
</p>
<p>
    👉 License: <a href="https://www.apache.org/licenses/LICENSE-2.0" target="_blank">Apache 2.0</a><br>
    👉 <a href="https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f" target="_blank">Collection on
    Hugging Face</a><br>
    👉 <a href="https://github.com/QwenLM/Qwen3/blob/main/Qwen3_Technical_Report.pdf" target="_blank">Technical
    Report</a>
</p>
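<p>
    The GGUF builds can run on a CPU through libraries such as <code>llama-cpp-python</code>. In the hedged sketch
    below, the repository and file-name pattern are assumptions; check the collection for the exact quantization you
    want.
</p>
<pre><code># Sketch: running a quantized Qwen 3 GGUF build on CPU with llama-cpp-python.
# Assumes `pip install llama-cpp-python huggingface_hub`; names are illustrative.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen3-0.6B-GGUF",  # assumed repo name from the collection
    filename="*Q4_K_M.gguf",         # pick a quantization level by pattern
    n_ctx=4096,                      # context window to allocate
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the GGUF format?"}],
    max_tokens=128,
)
print(result["choices"][0]["message"]["content"])
</code></pre>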

<hr style="border: none; height: 1px; background-color: #ccc;">

<p>The landscape is not limited to these five providers. You can explore more open-source models at:</p>
<ul>
    <li><a href="https://huggingface.co/databricks" target="_blank">Databricks</a></li>
    <li><a href="https://huggingface.co/Cohere" target="_blank">Cohere</a></li>
    <li><a href="https://huggingface.co/deepseek-ai" target="_blank">Deepseek</a></li>
    <li><a href="https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966" target="_blank">SmolLM</a>
    </li>
    <li><a href="https://huggingface.co/stabilityai" target="_blank">Stability AI</a></li>
    <li><a href="https://huggingface.co/ibm-granite" target="_blank">IBM Granite</a></li>
</ul>

</body>
</html>