Wav2Vec2-NL

A Dutch Wav2Vec2-base model, pre-trained on 960 hours of exclusively Dutch speech.

Pre-training data was extracted from a combination of Dutch speech sources.

More information, including the training manifest and configuration, is available in the Wav2Vec2-NL repository on Zenodo.

Analyses of Dutch phonetic and lexical features encoded in Wav2Vec2-NL hidden states are reported in the paper What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training (Interspeech 2025; see full citation below).

Note: This model does not have a tokenizer, as it was pre-trained on audio alone. To use it for speech recognition, a tokenizer must be created and the model fine-tuned on labeled speech data (audio paired with transcriptions). Check out this blog for an explanation of fine-tuning Wav2Vec2 models on Hugging Face.
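
As a rough illustration, the sketch below shows how such a fine-tuning setup could be initialized with the transformers library. The vocab.json character vocabulary and the choice to freeze the feature encoder are assumptions for this example, not part of this model's release; see the linked blog for a full recipe.

from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Processor,
    Wav2Vec2ForCTC,
)

# Character-level CTC tokenizer built from a custom vocabulary file (vocab.json is hypothetical)
tokenizer = Wav2Vec2CTCTokenizer('vocab.json', unk_token='[UNK]', pad_token='[PAD]', word_delimiter_token='|')
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained('amsterdamNLP/Wav2Vec2-NL')
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Load the pre-trained encoder with a freshly initialized CTC head sized to the new vocabulary
model = Wav2Vec2ForCTC.from_pretrained(
    'amsterdamNLP/Wav2Vec2-NL',
    ctc_loss_reduction='mean',
    pad_token_id=processor.tokenizer.pad_token_id,
    vocab_size=len(processor.tokenizer),
)
model.freeze_feature_encoder()  # commonly done when fine-tuning on limited labeled data
# ... then train with a CTC objective on labeled Dutch speech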

Usage

from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained('amsterdamNLP/Wav2Vec2-NL')
model = Wav2Vec2Model.from_pretrained('amsterdamNLP/Wav2Vec2-NL')
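
A minimal sketch of extracting hidden representations from raw audio, continuing from the snippet above (the one-second silent waveform is just a placeholder; any 16 kHz mono signal works):

import torch

waveform = [0.0] * 16000  # 1 second of 16 kHz audio; replace with real speech samples

inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors='pt')
with torch.no_grad():
    outputs = model(inputs.input_values, output_hidden_states=True)

print(outputs.last_hidden_state.shape)  # (batch, frames, 768) for the base architecture
print(len(outputs.hidden_states))       # 13: feature projection output + 12 transformer layers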

Citation

The Wav2Vec2-NL model was published as part of: de Heer Kloots, M., Mohebbi, H., Pouw, C., Shen, G., Zuidema, W., Bentum, M. (2025). What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training. Proc. INTERSPEECH 2025. https://doi.org/10.21437/Interspeech.2025-1526

BibTeX entry:

@inproceedings{deheerkloots25_interspeech,
  title     = {What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training},
  author    = {Marianne {de Heer Kloots} and Hosein Mohebbi and Charlotte Pouw and Gaofei Shen and Willem Zuidema and Martijn Bentum},
  year      = {2025},
  booktitle = {Interspeech 2025},
  doi       = {10.21437/Interspeech.2025-1526},
}