FAMA Collection The First Large-Scale Open-Science Speech Foundation Model for English and Italian • 5 items • Updated May 30 • 10
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14, 2025 • 64 • 3
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29, 2024 • 54
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29, 2024 • 52
Visualization: the missing factor in Simultaneous Speech Translation Paper • 2111.00514 • Published Oct 31, 2021 • 1
Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation Paper • 2206.05807 • Published Jun 12, 2022 • 1
Token-Level Serialized Output Training for Joint Streaming ASR and ST Leveraging Textual Alignments Paper • 2307.03354 • Published Jul 7, 2023 • 1
When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP Paper • 2303.16166 • Published Mar 28, 2023 • 1