mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition Paper • 2502.01547 • Published Feb 3, 2025
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM? Paper • 2505.09439 • Published May 14, 2025 • 9 • 2
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation Paper • 2406.10082 • Published Jun 14, 2024 • 1
ibm-granite/granite-speech-3.3-8b Automatic Speech Recognition • 9B • Updated Aug 19, 2025 • 27k • 151
saurabhati/DASS_small_AudioSet_47.2 Audio Classification • 29.9M • Updated Mar 31, 2025 • 4 • 1
voidful/wav2vec2-xlsr-multilingual-56 Automatic Speech Recognition • 0.3B • Updated Mar 18, 2023 • 8.19k • 33
The 🤗 Speech Bench • Running • 97 • Find speech recognition models for a specific language and dataset