title: README
emoji: π
colorFrom: gray
colorTo: yellow
sdk: static
pinned: false
license: other
ivrit.ai is an effort to provide high-quality Hebrew datasets under a permissive license.
It is our hope that such datasets will be used to enable first-class support for Hebrew in AI models.
More about us can be found at ivrit.ai.
We are proud to present our latest achievements:
1) A state-of-the-art Hebrew speech-to-text model: https://huggingface.co/ivrit-ai/whisper-large-v3-ct2
2) A turbo-based Hebrew speech-to-text model: https://huggingface.co/ivrit-ai/whisper-large-v3-turbo-ct2
3) Our newest comprehensive Hebrew language dataset: https://huggingface.co/datasets/ivrit-ai/crowd-transcribe-v5
Paper: https://www.isca-archive.org/interspeech_2025/marmor25_interspeech.pdf
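The two speech-to-text models listed above are published in CTranslate2 format (the `-ct2` suffix). As a minimal sketch, they can likely be loaded with the third-party `faster-whisper` package. This README does not name that package, so treat it as an assumption. The `transcribe` and `format_segment` helpers, the CPU/int8 settings, and the audio path are illustrative choices, not part of the project.

```python
# Sketch: transcribe a Hebrew audio file with one of the CTranslate2 models above.
# Assumption: `pip install faster-whisper` (not mentioned in this README).

MODEL_ID = "ivrit-ai/whisper-large-v3-ct2"  # or "ivrit-ai/whisper-large-v3-turbo-ct2"


def format_segment(start, end, text):
    """Render one transcribed segment as '[start -> end] text'."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe(audio_path, model_id=MODEL_ID, device="cpu", compute_type="int8"):
    """Return formatted transcript lines for the audio file at `audio_path`.

    `device` and `compute_type` are illustrative defaults; adjust for your hardware.
    """
    # Imported lazily so the helpers above work without the dependency installed.
    from faster_whisper import WhisperModel

    # The model is downloaded from the Hugging Face Hub on first use.
    model = WhisperModel(model_id, device=device, compute_type=compute_type)

    # `segments` is a lazy generator; decoding happens as it is iterated.
    segments, _info = model.transcribe(audio_path, language="he")
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe("recording.wav")` with a hypothetical local file returns one formatted line per segment. Passing the turbo model ID as `model_id` uses the second model listed above.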
If you use our datasets or models, please cite the following:
``` | |
@inproceedings{marmor2025building, | |
title={Building an Accurate Open-Source Hebrew ASR System through Crowdsourcing}, | |
author={Marmor, Yanir and Lifshitz, Yair and Snapir, Yoad and Misgav, Kinneret}, | |
booktitle={Proc. Interspeech 2025}, | |
pages={723--727}, | |
year={2025} | |
} | |
```