---
library_name: transformers
license: apache-2.0
language:
- it
tags:
- fill-mask
- masked-lm
- long-context
- modernbert
- italian
pipeline_tag: fill-mask
---
|
|
|
# Italian ModernBERT
|
|
|
---

**💡 Found this resource helpful?** Creating and maintaining open source AI models and datasets requires significant computational resources. If this work has been valuable to you, consider [supporting my research](https://buymeacoffee.com/michele.montebovi) to help me continue building tools that benefit the entire AI community. Every contribution directly funds more open source innovation! ☕

---
|
|
|
## Model Description
|
|
|
Italian ModernBERT (`DeepMount00/Italian-ModernBERT-base`) is an Italian-language version of ModernBERT, pre-trained on Italian text corpora and designed exclusively for Italian-language tasks.
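
Since the card is tagged `fill-mask`, the standard `transformers` pipeline should work directly. A minimal usage sketch (the example sentence and printed fields are illustrative):

```python
from transformers import pipeline

# Fill-mask pipeline backed by this model
fill_mask = pipeline("fill-mask", model="DeepMount00/Italian-ModernBERT-base")

# Use the tokenizer's own mask token rather than hard-coding it
text = f"La capitale d'Italia è {fill_mask.tokenizer.mask_token}."

# Each prediction is a dict with 'token_str', 'score', and 'sequence'
for pred in fill_mask(text):
    print(f"{pred['token_str']}: {pred['score']:.4f}")
```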
|
|
|
## Key Features
|
|
|
- **Architecture**: Based on ModernBERT-base (22 layers, 149M parameters; see the sanity check below)
- **Context Length**: 8,192 tokens
- **Language**: Italian-only
- **Tokenizer**: Custom tokenizer optimized for Italian
- **Training**: Pre-trained on an Italian text corpus
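
A quick sanity check of the numbers above, assuming the standard `transformers` Auto classes (ModernBERT support requires a recent `transformers` release):

```python
from transformers import AutoConfig, AutoModelForMaskedLM

model_id = "DeepMount00/Italian-ModernBERT-base"
config = AutoConfig.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

print(config.num_hidden_layers)                    # expected: 22
print(config.max_position_embeddings)              # expected: 8192
print(sum(p.numel() for p in model.parameters()))  # roughly 149M
```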
|
|
|
## Technical Details
|
|
|
- Uses Rotary Positional Embeddings (RoPE)
- Implements Local-Global Alternating Attention
- Supports Flash Attention 2 for faster inference (see the loading sketch below)
- No token type IDs required
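
A minimal loading sketch that opts into Flash Attention 2; it assumes a CUDA GPU with the `flash-attn` package installed, and the `torch.bfloat16` dtype is an illustrative choice:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "DeepMount00/Italian-ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Omit attn_implementation to fall back to the default attention kernel
model = AutoModelForMaskedLM.from_pretrained(
    model_id,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
).to("cuda")

# No token type IDs: the encoded inputs carry only input_ids and attention_mask
inputs = tokenizer("Un esempio in italiano.", return_tensors="pt").to("cuda")
print(list(inputs.keys()))
```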
|
|
|
|
|
## Limitations
|
|
|
- Optimized only for Italian language processing
- Not suitable for other languages
- May reflect biases present in the training data
|
|