openvino>=2024.4.0
huggingface_hub
torch>=2.1
gradio>=4.19
librosa==0.9.2
opencv-contrib-python
opencv-python
IPython
tqdm
numba
numpy
openai-whisper
sounddevice
scipy
transformers>=4.35
torchvision>=0.18.1
onnx>=1.16.1
optimum-intel @ git+https://github.com/huggingface/optimum-intel.git
openvino
openvino-tokenizers
openvino-genai
datasets
soundfile>=0.12
python-ffmpeg<=1.0.16
nncf>=2.13.0
jiwer
gtts