---
license: apache-2.0
base_model:
- openai/whisper-base
pipeline_tag: automatic-speech-recognition
language:
- en
- ru
---

OpenAI Whisper base [model](https://huggingface.co/openai/whisper-base) converted to ONNX format for [onnx-asr](https://github.com/istupakov/onnx-asr).

Install onnx-asr:
```shell
pip install onnx-asr[cpu,hub]
```

Load the whisper-base model and recognize a wav file:
```py
import onnx_asr
model = onnx_asr.load_model("whisper-base")
print(model.recognize("test.wav"))
```
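
The loaded model can be reused across several recordings by calling `recognize` once per file, exactly as above. A minimal sketch (the file names are hypothetical):
```py
import onnx_asr

# Load the model once and reuse it for several recordings
model = onnx_asr.load_model("whisper-base")

# Hypothetical file names, replace with your own recordings
for path in ["meeting.wav", "interview.wav"]:
    print(path, "->", model.recognize(path))
```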

## Model export

Read the onnxruntime [instructions](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/tools/transformers/models/whisper/README.md) for converting Whisper to ONNX.

Download the model and export it with *Beam Search* and *Forced Decoder Input Ids*:
```shell
python3 -m onnxruntime.transformers.models.whisper.convert_to_onnx -m openai/whisper-base --output ./whisper-onnx --use_forced_decoder_ids --optimize_onnx --precision fp32
```
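
After the export finishes, it can be useful to sanity-check the resulting graph with onnxruntime. The file name below is an assumption, since the exporter derives it from the model name; adjust it to whatever appears in `./whisper-onnx`:
```py
import onnxruntime as ort

# "whisper-base_beamsearch.onnx" is an assumed file name, check the export output directory
session = ort.InferenceSession("whisper-onnx/whisper-base_beamsearch.onnx")

# List graph inputs and outputs to confirm the beam-search model loaded correctly
print([inp.name for inp in session.get_inputs()])
print([out.name for out in session.get_outputs()])
```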

Save the tokenizer config:
```py
from transformers import WhisperTokenizer

# Save the tokenizer files (vocab, merges, special tokens) next to the exported ONNX models
tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-base")
tokenizer.save_pretrained("whisper-onnx")
```
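
The exported directory can then be loaded with onnx-asr in the same way as the hub version. A sketch, assuming your onnx-asr version accepts a local path as the second argument to `load_model`:
```py
import onnx_asr

# Load the locally exported model; the local-path argument is an assumption,
# adjust to match your onnx-asr version.
model = onnx_asr.load_model("whisper-base", "whisper-onnx")
print(model.recognize("test.wav"))
```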