Better Late Than Never: Evaluation of Latency Metrics for Simultaneous Speech-to-Text Translation Paper • 2509.17349 • Published Sep 22 • 2
Cross-Attention is Half Explanation in Speech-to-Text Models Paper • 2509.18010 • Published Sep 22 • 6
MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks Paper • 2507.19634 • Published Jul 25 • 9
FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian Paper • 2505.22759 • Published May 28 • 19
Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus Paper • 2310.05294 • Published Oct 8, 2023
Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection Paper • 2310.15752 • Published Oct 24, 2023
A Prompt Response to the Demand for Automatic Gender-Neutral Translation Paper • 2402.06041 • Published Feb 8, 2024
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing? Paper • 2402.12025 • Published Feb 19, 2024 • 2
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena Paper • 2402.13208 • Published Feb 20, 2024
Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models Paper • 2405.08477 • Published May 14, 2024
SBAAM! Eliminating Transcript Dependency in Automatic Subtitling Paper • 2405.10741 • Published May 17, 2024 • 1
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection Paper • 2406.06097 • Published Jun 10, 2024 • 2
Gender Neutralization for an Inclusive Machine Translation: from Theoretical Foundations to Open Challenges Paper • 2301.10075 • Published Jan 24, 2023
SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation Paper • 2406.14177 • Published Jun 20, 2024 • 1
How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not Paper • 2409.17044 • Published Sep 25, 2024 • 3
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages Paper • 2410.01036 • Published Oct 1, 2024 • 16
mGeNTE: A Multilingual Resource for Gender-Neutral Language and Translation Paper • 2501.09409 • Published Jan 16 • 1