Whisper Base French LoRA

A LoRA (Low-Rank Adaptation) adapter for openai/whisper-base, fine-tuned for French speech recognition.

This adapter was specifically designed for use with WhisperLiveKit, providing ultra-low-latency French transcription.

Model Details

| Property | Value |
| --- | --- |
| Base Model | openai/whisper-base (74M params) |
| Adapter Type | LoRA (PEFT) |
| Trainable Parameters | 2.4M (3.2% of base) |
| Language | French (fr) |
| Task | Transcription |

LoRA Configuration

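from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                # rank of the low-rank update matrices
    lora_alpha=32,       # scaling numerator (update is scaled by lora_alpha / r)
    lora_dropout=0.05,
    bias="none",         # bias terms stay frozen
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj"],  # attention projections
)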
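
To give a feel for what r=16 means on one layer, the sketch below counts the parameters LoRA adds to a single attention projection and compares them with the frozen weight it adapts. It assumes the whisper-base hidden size of 512 and the standard LoRA factorization (two matrices of shapes d_in × r and r × d_out); the function name is illustrative and not part of PEFT.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (d_in x r), B is (r x d_out)."""
    return r * (d_in + d_out)


D_MODEL = 512      # hidden size of whisper-base (assumed here)
R = 16             # rank from the configuration above
LORA_ALPHA = 32    # alpha from the configuration above

full_weight = D_MODEL * D_MODEL                      # frozen projection weight
lora_weight = lora_param_count(D_MODEL, D_MODEL, R)  # trainable low-rank update
scaling = LORA_ALPHA / R                             # factor applied to the update

print(f"frozen weight params: {full_weight}")
print(f"LoRA params per projection: {lora_weight}")
print(f"ratio: {lora_weight / full_weight:.4f}, scaling: {scaling}")
```

This covers a single projection only; the total for the adapter depends on how many modules match target_modules.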

Performance

Comparison with Baseline

| Split | Model | WER ↓ | CER ↓ |
| --- | --- | --- | --- |
| Validation | Whisper Base (baseline) | 36.94% | 15.62% |
| Validation | + This LoRA | 28.06% | 10.06% |
| Test | Whisper Base (baseline) | 60.47% | 31.63% |
| Test | + This LoRA | 39.30% | 17.39% |
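
For readers unfamiliar with the metrics, WER and CER are edit distances (substitutions, insertions, deletions) divided by the reference length, counted over words and characters respectively. The sketch below is a minimal pure-Python version for illustration. The evaluation tooling and text normalization used for the numbers above are not stated in this card, so this is not necessarily how they were computed.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("le chat dort", "le chien dort"))
print(cer("chat", "chit"))
```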

Improvement Summary

| Split | WER Reduction | CER Reduction |
| --- | --- | --- |
| Validation | -8.88 pts (24% relative) | -5.56 pts (36% relative) |
| Test | -21.17 pts (35% relative) | -14.24 pts (45% relative) |
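
The figures in this summary follow directly from the comparison table: the absolute reduction is the difference in percentage points, and the relative reduction divides that difference by the baseline. A small sketch that reproduces them (the function name is illustrative):

```python
def reduction(baseline: float, adapted: float):
    """Return (absolute reduction in points, relative reduction in whole percent)."""
    absolute = baseline - adapted
    relative = absolute / baseline * 100
    return round(absolute, 2), round(relative)


# (baseline, with LoRA) pairs taken from the comparison table
results = {
    "validation WER": (36.94, 28.06),
    "validation CER": (15.62, 10.06),
    "test WER": (60.47, 39.30),
    "test CER": (31.63, 17.39),
}

for name, (baseline, adapted) in results.items():
    points, percent = reduction(baseline, adapted)
    print(f"{name}: -{points} pts ({percent}% relative)")
```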

Usage

With WhisperLiveKit (Recommended)

The easiest way to use this model is with WhisperLiveKit for real-time French transcription:

pip install whisperlivekit

# Start the server with French LoRA (auto-downloads from HuggingFace)
wlk --model base --language fr --lora-path qfuxa/whisper-base-french-lora

The adapter is automatically downloaded and cached from HuggingFace Hub on first use.

With Transformers + PEFT

from transformers import WhisperForConditionalGeneration, WhisperProcessor
from peft import PeftModel

# Load base model
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-base")
processor = WhisperProcessor.from_pretrained("openai/whisper-base", language="fr", task="transcribe")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "qfuxa/whisper-base-french-lora")
model = model.merge_and_unload()  # Optional: merge for faster inference

# Transcribe
# audio_array must be supplied by the caller: a 1-D mono float waveform sampled at 16 kHz
inputs = processor.feature_extractor(audio_array, sampling_rate=16000, return_tensors="pt")
generated_ids = model.generate(inputs.input_features, language="fr", task="transcribe")
transcription = processor.tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
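
The snippet above expects audio_array to already hold a mono waveform at 16 kHz. As one possible way to obtain it, the sketch below reads a 16-bit PCM WAV file with the Python standard library only and returns float samples in the range [-1, 1). The helper name is hypothetical, it does no resampling, and in practice a library such as librosa or soundfile is a more common choice.

```python
import array
import sys
import wave


def load_wav_mono_float(path):
    """Read a 16-bit PCM WAV file; return (mono float samples, sample rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        pcm = array.array("h", wav.readframes(wav.getnframes()))
    # WAV data is little-endian; array("h") uses the native byte order.
    if sys.byteorder == "big":
        pcm.byteswap()
    # Average the interleaved channels of each frame, then scale to [-1, 1).
    frames = len(pcm) // channels
    mono = [
        sum(pcm[i * channels:(i + 1) * channels]) / channels / 32768.0
        for i in range(frames)
    ]
    return mono, sample_rate
```

If the returned sample rate is not 16000, the audio needs to be resampled before it is passed to the feature extractor.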

With Native Whisper (WhisperLiveKit Backend)

from whisperlivekit.whisper import load_model

# Load Whisper base with French LoRA adapter
model = load_model(
    "base",
    lora_path="path/to/whisper-base-french-lora"
)

# Transcribe
result = model.transcribe(audio, language="fr")

Training Details

Dataset

  • Source: Mozilla Common Voice v23.0 French
  • Training samples: 100,000
  • Validation samples: 2,000
  • Test samples: 2,000

Training Configuration

| Parameter | Value |
| --- | --- |
| Epochs | 5 |
| Effective batch size | 128 (16 × 8 accumulation steps) |
| Learning rate | 3e-4 |
| Warmup steps | 100 |
| Weight decay | 0.01 |
| Optimizer | AdamW |
| Early stopping | Patience of 5 evaluations |
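
As a back-of-envelope check on these settings, the sketch below derives the effective batch size and an upper bound on the number of optimizer steps from the dataset size. It assumes the final partial batch of each epoch still triggers an update and that early stopping does not fire; the actual step count is not stated in this card and may be lower.

```python
import math

per_step_batch = 16       # samples per forward/backward pass
accumulation_steps = 8    # gradient accumulation steps
train_samples = 100_000   # training set size from the Dataset section
epochs = 5
warmup_steps = 100

effective_batch = per_step_batch * accumulation_steps
steps_per_epoch = math.ceil(train_samples / effective_batch)
max_steps = steps_per_epoch * epochs          # upper bound without early stopping
warmup_fraction = warmup_steps / max_steps

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps per epoch (approx.): {steps_per_epoch}")
print(f"maximum optimizer steps: {max_steps}")
print(f"warmup share of maximum steps: {warmup_fraction:.1%}")
```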

Hardware

  • Trained on Apple Silicon (MPS)

Limitations

  • Optimized specifically for French; may not generalize well to other languages
  • Based on whisper-base (74M params); consider larger models for higher accuracy
  • Performance may vary on domain-specific audio (medical, legal, technical)
  • Trained on crowd-sourced Common Voice data; may have biases toward certain accents

Citation

If you use this model, please cite:

@misc{whisper-base-french-lora,
  author = {Quentin Fuxa},
  title = {Whisper Base French LoRA},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/qfuxa/whisper-base-french-lora}
}

@misc{whisperlivekit,
  author = {Quentin Fuxa},
  title = {WhisperLiveKit: Ultra-low-latency speech-to-text},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/QuentinFuxa/WhisperLiveKit}
}

License

Apache 2.0, same as the base Whisper model.

Acknowledgments
