---
license: apache-2.0
language:
- it
- en
pipeline_tag: text-generation
datasets:
- DeepMount00/o1-ITA-REASONING
- DeepMount00/GPT-4o-ITA-INSTRUCT
- DeepMount00/Sonnet-3.5-ITA-INSTRUCT
- DeepMount00/open-perfectblend-ita
- HuggingFaceTB/cosmopedia
- DeepMount00/pretraining_multi
---

---

**💡 Found this resource helpful?** Creating and maintaining open source AI models and datasets requires significant computational resources. If this work has been valuable to you, consider [supporting my research](https://buymeacoffee.com/michele.montebovi) to help me continue building tools that benefit the entire AI community. Every contribution directly funds more open source innovation! ☕

---

<p align="center">
  <img src="alireo.webp" style="width: 500px; height:500px;"/>
</p>

<h2 style="font-size: 32px; text-align: center;">Alireo-400M 🤖 🇮🇹</h2>
<p style="font-size: 21px; text-align: center;">A Lightweight Italian Language Model</p>

<h3 style="font-size: 21px; color: #2980b9;">Model Description 📝</h3>

Alireo-400M is a lightweight Italian language model with 400M parameters, designed for efficient natural language processing with a smaller footprint than larger models.

<h3 style="font-size: 21px; color: #2980b9;">Key Features ✨</h3>

* **Architecture**: Transformer-based language model 🏗️
* **Parameters**: 400M 📊
* **Context Window**: 8K tokens 🪟
* **Training Data**: Curated Italian text corpus (books, articles, web content) 📚
* **Model Size**: ~800MB 💾
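
The parameter count and model size listed above are consistent with 16-bit weights. The following is a minimal sketch of that arithmetic, assuming 2 bytes per parameter (float16 or bfloat16) and decimal megabytes; the storage precision is an assumption, not something this card states, and `weight_footprint_mb` is a hypothetical helper written for illustration.

```python
def weight_footprint_mb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in decimal megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1_000_000


NUM_PARAMS = 400_000_000  # "400M" from the feature list above

# Bytes per parameter for common precisions (assumed, not taken from the card).
PRECISIONS = {"float32": 4, "float16/bfloat16": 2, "int8": 1}

for name, width in PRECISIONS.items():
    size_mb = weight_footprint_mb(NUM_PARAMS, width)
    print(f"{name}: ~{size_mb:.0f} MB")
```

Under that assumption the 16-bit row matches the ~800MB figure; tokenizer files and runtime buffers are not included in the estimate.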

<h3 style="font-size: 21px; color: #2980b9;">Performance 📈</h3>

Despite its compact size, Alireo-400M performs well for its parameter count:

* **Benchmark Results**: Outperforms Qwen 0.5B across multiple benchmarks 🏆
* **Language Understanding**: Maintains high accuracy in Italian language understanding tasks 🎯
* **Speed**: Efficient inference speed due to optimized architecture ⚡

<h3 style="font-size: 21px; color: #2980b9;">Limitations ⚠️</h3>

* Limited context window compared to larger models
* May struggle with highly specialized technical content
* Performance may vary on dialectal variations
* Not suitable for multilingual tasks

<h3 style="font-size: 21px; color: #2980b9;">Hardware Requirements 💻</h3>

* **Minimum RAM**: 2GB
* **Recommended RAM**: 4GB
* **GPU**: Optional, but recommended for faster inference
* **Disk Space**: ~1GB (including model and dependencies)
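
As a rough pre-flight check, the RAM thresholds above can be compared against the memory a machine reports. This is a minimal sketch using only the Python standard library; it assumes "GB" means GiB (1024³ bytes), and both `classify_ram` and `total_ram_bytes` are hypothetical helpers, not part of any released tooling for this model.

```python
import os

GIB = 1024 ** 3
MIN_RAM_BYTES = 2 * GIB          # "Minimum RAM: 2GB" (interpreted as GiB, an assumption)
RECOMMENDED_RAM_BYTES = 4 * GIB  # "Recommended RAM: 4GB"


def classify_ram(total_bytes: int) -> str:
    """Map a RAM size in bytes onto the tiers listed in this card."""
    if total_bytes < MIN_RAM_BYTES:
        return "below minimum"
    if total_bytes < RECOMMENDED_RAM_BYTES:
        return "minimum"
    return "recommended"


def total_ram_bytes():
    """Return total physical RAM in bytes, or None if the platform cannot report it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is unavailable on Windows; some Unix systems lack these names.
        return None


if __name__ == "__main__":
    ram = total_ram_bytes()
    if ram is None:
        print("Could not determine total RAM on this platform.")
    else:
        print(f"{ram / GIB:.1f} GiB detected: {classify_ram(ram)}")
```

The check looks at total rather than free memory, so a machine in the "minimum" tier may still struggle if other processes are using most of its RAM.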

<h3 style="font-size: 21px; color: #2980b9;">Citation 📄</h3>

```bibtex
@software{alireo2024,
  author = {Michele Montebovi},
  title = {Alireo-400M: A Lightweight Italian Language Model},
  year = {2024},
}
```