
MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation


πŸ“˜ EMNLP 2025

Khai Le-Duc*, Tuyen Tran*, Bach Phan Tat, Nguyen Kim Hai Bui, Quan Dang, Hung-Phong Tran, Thanh-Thuy Nguyen, Ly Nguyen, Tuan-Minh Phan, Thi Thu Phuong Tran, Chris Ngo, Nguyen X. Khanh**, Thanh Nguyen-Tang**

*Equal contribution   |   **Equal supervision


⭐ If you find this work useful, please consider starring the repo and citing our paper!


🧠 Abstract

Multilingual speech translation (ST) in the medical domain enhances patient care by enabling effective communication across language barriers, alleviating workforce shortages, and improving diagnosis and treatment β€” especially in global health emergencies.

In this work, we introduce MultiMed-ST, the first large-scale multilingual medical speech translation dataset, spanning all translation directions across five languages:
πŸ‡»πŸ‡³ Vietnamese, πŸ‡¬πŸ‡§ English, πŸ‡©πŸ‡ͺ German, πŸ‡«πŸ‡· French, πŸ‡¨πŸ‡³ Traditional & Simplified Chinese.

With 290,000 samples, MultiMed-ST represents:

  • 🧩 the largest medical MT dataset to date
  • 🌐 the largest many-to-many multilingual ST dataset across all domains

We also conduct what is, to the best of our knowledge, the most comprehensive ST analysis in the field to date, covering:

  • βœ… Empirical baselines
  • πŸ”„ Bilingual vs. multilingual study
  • 🧩 End-to-end vs. cascaded models
  • 🎯 Task-specific vs. multi-task seq2seq approaches
  • πŸ—£οΈ Code-switching analysis
  • πŸ“Š Quantitative & qualitative error analysis

All code, data, and models are publicly available: πŸ‘‰ GitHub Repository



🧰 Repository Overview

This repository provides scripts for:

  • πŸŽ™οΈ Automatic Speech Recognition (ASR)
  • 🌍 Machine Translation (MT)
  • πŸ”„ Speech Translation (ST) β€” both cascaded and end-to-end seq2seq models
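
The cascaded vs. end-to-end distinction above can be sketched in a few lines. This is a minimal illustration only: the `asr`, `mt`, and `st_model` callables are stand-ins for the actual fine-tuned Whisper (speech β†’ source text), LLaMA/m2m100 (source text β†’ target text), and end-to-end seq2seq checkpoints provided by this repo.

```python
from typing import Callable

def cascaded_st(asr: Callable[[bytes], str],
                mt: Callable[[str], str],
                audio: bytes) -> str:
    """Cascaded ST: transcribe the audio first, then translate the transcript."""
    transcript = asr(audio)
    return mt(transcript)

def end_to_end_st(st_model: Callable[[bytes], str], audio: bytes) -> str:
    """End-to-end ST: a single seq2seq model maps speech directly to target text."""
    return st_model(audio)

if __name__ == "__main__":
    # Stub components for illustration; real models replace these lambdas.
    fake_asr = lambda audio: "der Patient hat Fieber"
    fake_mt = lambda text: "the patient has a fever"
    print(cascaded_st(fake_asr, fake_mt, b"<audio bytes>"))
```

The trade-off studied in the paper falls out of this structure: the cascade exposes an intermediate transcript (useful for error analysis) but can propagate ASR errors into MT, while the end-to-end model avoids that interface entirely.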

It includes:

  • βš™οΈ Model preparation & fine-tuning
  • πŸš€ Training & inference scripts
  • πŸ“Š Evaluation & benchmarking utilities

πŸ“¦ Dataset & Models

You can explore and download all fine-tuned models for MultiMed-ST directly from our Hugging Face repository:

πŸ”Ή Whisper ASR Fine-tuned Models

πŸ”Ή LLaMA-based MT Fine-tuned Models

| Source β†’ Target | Model Link |
| --- | --- |
| Chinese β†’ English | llama_Chinese_English |
| Chinese β†’ French | llama_Chinese_French |
| Chinese β†’ German | llama_Chinese_German |
| Chinese β†’ Vietnamese | llama_Chinese_Vietnamese |
| English β†’ Chinese | llama_English_Chinese |
| English β†’ French | llama_English_French |
| English β†’ German | llama_English_German |
| English β†’ Vietnamese | llama_English_Vietnamese |
| French β†’ Chinese | llama_French_Chinese |
| French β†’ English | llama_French_English |
| French β†’ German | llama_French_German |
| French β†’ Vietnamese | llama_French_Vietnamese |
| German β†’ Chinese | llama_German_Chinese |
| German β†’ English | llama_German_English |
| German β†’ French | llama_German_French |
| German β†’ Vietnamese | llama_German_Vietnamese |
| Vietnamese β†’ Chinese | llama_Vietnamese_Chinese |
| Vietnamese β†’ English | llama_Vietnamese_English |
| Vietnamese β†’ French | llama_Vietnamese_French |
| Vietnamese β†’ German | llama_Vietnamese_German |

πŸ”Ή m2m100_418M MT Fine-tuned Models
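
The 20 LLaMA MT checkpoints above follow a uniform `llama_{Source}_{Target}` naming scheme, one per ordered language pair among the five languages. A short sketch of that enumeration (the names below mirror the table; the full Hugging Face repo ids are not reproduced here):

```python
from itertools import permutations

LANGUAGES = ["Chinese", "English", "French", "German", "Vietnamese"]

def mt_model_names() -> list[str]:
    """Enumerate the llama_{Source}_{Target} checkpoint names, one per
    ordered (source, target) pair of distinct languages."""
    return [f"llama_{src}_{tgt}" for src, tgt in permutations(LANGUAGES, 2)]

names = mt_model_names()
print(len(names))  # 5 languages -> 5 * 4 = 20 directed pairs
```

Because MultiMed-ST is many-to-many, every direction has its own fine-tuned checkpoint rather than pivoting through English.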

πŸ‘¨β€πŸ’» Core Developers

  1. Khai Le-Duc
     University of Toronto, Canada
     πŸ“§ duckhai.le@mail.utoronto.ca
     πŸ”— https://github.com/leduckhai

  2. Tuyen Tran
     Hanoi University of Science and Technology, Vietnam
     πŸ“§ tuyencbt@gmail.com

  3. Nguyen Kim Hai Bui
     EΓΆtvΓΆs LorΓ‘nd University, Hungary
     πŸ“§ htlulem185@gmail.com

🧾 Citation

If you use our dataset or models, please cite:

πŸ“„ arXiv:2504.03546

@inproceedings{le2025multimedst,
  title={MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation},
  author={Le-Duc, Khai and Tran, Tuyen and Tat, Bach Phan and Bui, Nguyen Kim Hai and Anh, Quan Dang and Tran, Hung-Phong and Nguyen, Thanh Thuy and Nguyen, Ly and Phan, Tuan Minh and Tran, Thi Thu Phuong and others},
  booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  pages={11838--11963},
  year={2025}
}