# Common Accent ASR Model
This model is [espnet/owsm_v3.1_ebf_base](https://huggingface.co/espnet/owsm_v3.1_ebf_base) fine-tuned for automatic speech recognition (ASR) on the [DTU54DL/common-accent](https://huggingface.co/datasets/DTU54DL/common-accent) dataset of accented English speech.

## Model details
- Base model: espnet/owsm_v3.1_ebf_base
- Language: English
- Task: Automatic Speech Recognition

## Usage
```python
import torch
import numpy as np
from espnet2.bin.s2t_inference import Speech2Text

# Load the model
model = Speech2Text.from_pretrained(
    "reecursion/accent-adaptive-owsm_v3.1_ebf_base",
    lang_sym="<eng>",
    beam_size=1,
    device="cuda" if torch.cuda.is_available() else "cpu"
)

# Example inference
waveform = ...  # Load your audio as a 1-D NumPy array (16 kHz mono is assumed here)
transcription = model(waveform)  # Returns a list of n-best hypotheses
print(transcription[0][0])  # Text of the best hypothesis
```
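
The `waveform` placeholder above is left for you to fill in. As a minimal sketch of one way to do that, the helper below reads a 16-bit PCM WAV file with the standard-library `wave` module, downmixes it to mono, and resamples it with plain linear interpolation. The function name `load_wav_mono_16k` and the 16 kHz target rate are assumptions made for illustration, not part of this model's documented interface, and linear interpolation is a stand-in for a proper resampler such as the ones in `librosa` or `torchaudio`.

```python
import wave

import numpy as np

TARGET_SR = 16000  # Assumption: the model expects 16 kHz mono input


def load_wav_mono_16k(path, target_sr=TARGET_SR):
    """Read a 16-bit PCM WAV file and return a float32 mono array at target_sr.

    Hypothetical helper for illustration; not part of ESPnet.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        sr = wf.getframerate()
        channels = wf.getnchannels()
        frames = wf.readframes(wf.getnframes())

    # Convert little-endian int16 samples to floats in [-1, 1)
    audio = np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0

    # Downmix interleaved channels to mono by averaging
    if channels > 1:
        audio = audio.reshape(-1, channels).mean(axis=1)

    # Resample by linear interpolation (crude, but dependency-free)
    if sr != target_sr and len(audio) > 0:
        n_out = int(round(len(audio) * target_sr / sr))
        old_t = np.arange(len(audio)) / sr
        new_t = np.arange(n_out) / target_sr
        audio = np.interp(new_t, old_t, audio).astype(np.float32)

    return audio
```

The returned array can be passed as `waveform` in the usage snippet, for example `waveform = load_wav_mono_16k("sample.wav")`, where `sample.wav` is a placeholder path.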