---
license: apache-2.0
language:
- it
- en
pipeline_tag: text-generation
datasets:
- DeepMount00/o1-ITA-REASONING
- DeepMount00/GPT-4o-ITA-INSTRUCT
- DeepMount00/Sonnet-3.5-ITA-INSTRUCT
- DeepMount00/open-perfectblend-ita
- HuggingFaceTB/cosmopedia
- DeepMount00/pretraining_multi
---
---
**💡 Found this resource helpful?** Creating and maintaining open source AI models and datasets requires significant computational resources. If this work has been valuable to you, consider [supporting my research](https://buymeacoffee.com/michele.montebovi) to help me continue building tools that benefit the entire AI community. Every contribution directly funds more open source innovation! ☕
---
<p align="center">
<img src="alireo.webp" style="width: 500px; height:500px;"/>
</p>
<h2 style="font-size: 32px; text-align: center;">Alireo-400M 🇮🇹</h2>
<p style="font-size: 21px; text-align: center;">A Lightweight Italian Language Model</p>
<h3 style="font-size: 21px; color: #2980b9;">Model Description</h3>
Alireo-400M is a lightweight yet powerful Italian language model with 400M parameters, designed to provide efficient natural language processing while keeping a smaller footprint than larger models.
<h3 style="font-size: 21px; color: #2980b9;">Key Features ✨</h3>
* **Architecture**: Transformer-based language model 🏗️
* **Parameters**: 400M
* **Context Window**: 8K tokens
* **Training Data**: Curated Italian text corpus (books, articles, web content)
* **Model Size**: ~800MB 💾
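
The parameter count and the model size listed above are consistent if the weights are stored at 16 bits per parameter. That storage format is an assumption, not something this card states. A minimal sketch of the arithmetic, with `weights_size_mb` as a hypothetical helper name:

```python
def weights_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1_000_000


PARAMS = 400_000_000  # "400M" from the feature list

# Assumed storage widths; the card does not say which one is used.
for name, width in (("float32", 4), ("float16/bfloat16", 2), ("int8", 1)):
    print(f"{name}: ~{weights_size_mb(PARAMS, width):.0f} MB")
```

Only the 16-bit row lands on roughly 800 MB, which is why that format is the likely one here. Runtime memory will be higher than the weights alone, since activations and the key/value cache also need space.
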
<h3 style="font-size: 21px; color: #2980b9;">Performance</h3>
Despite its compact size, Alireo-400M demonstrates impressive performance:
* **Benchmark Results**: Outperforms Qwen 0.5B across multiple benchmarks
* **Language Understanding**: Maintains high accuracy in Italian language understanding tasks 🎯
* **Speed**: Efficient inference speed due to optimized architecture ⚡
<h3 style="font-size: 21px; color: #2980b9;">Limitations ⚠️</h3>
* Limited context window compared to larger models
* May struggle with highly specialized technical content
* Performance may vary on regional dialects
* Not suitable for multilingual tasks
<h3 style="font-size: 21px; color: #2980b9;">Hardware Requirements 💻</h3>
* **Minimum RAM**: 2GB
* **Recommended RAM**: 4GB
* **GPU**: Optional, but recommended for faster inference
* **Disk Space**: ~1GB (including model and dependencies)
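
Since the GPU is optional, a loading script can fall back to CPU when no GPU is present. The sketch below assumes the model loads through the standard `transformers` causal-LM classes, which the `text-generation` pipeline tag suggests but this card does not confirm. `MODEL_ID` is an assumed repository id to replace with the real one, `max_new_tokens_for` is a hypothetical helper, and "8K tokens" is read as 8192.

```python
MODEL_ID = "DeepMount00/Alireo-400M"  # assumed repo id; replace with the actual one
CONTEXT_WINDOW = 8 * 1024  # "8K tokens" interpreted as 8192 (assumption)


def max_new_tokens_for(prompt_tokens: int, requested: int,
                       context_window: int = CONTEXT_WINDOW) -> int:
    """Clamp the generation budget so prompt plus output fits in the context window."""
    remaining = context_window - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, remaining)


def generate(prompt: str, requested_new_tokens: int = 256) -> str:
    # Imported lazily so the helper above works without these libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Use the GPU when one is available, otherwise stay on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    budget = max_new_tokens_for(inputs["input_ids"].shape[1], requested_new_tokens)
    output_ids = model.generate(**inputs, max_new_tokens=budget)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Qual è la capitale d'Italia?")` would download the weights on first use, so the disk space figure above applies.
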
<h3 style="font-size: 21px; color: #2980b9;">Citation</h3>
```bibtex
@software{alireo2024,
author = {Michele Montebovi},
title = {Alireo-400M: A Lightweight Italian Language Model},
year = {2024},
}
```