CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following Paper • 2506.12285 • Published Jun 14, 2025 • 54
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens Paper • 2503.01710 • Published Mar 3, 2025 • 6
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information Paper • 2503.05085 • Published Mar 7, 2025 • 47
pyannote/speaker-diarization-3.1 Automatic Speech Recognition • Updated May 10, 2024 • 13.1M • 1.48k