	# Common Accent ASR Model
	This is an ASR model fine-tuned from [espnet/owsm_v3.1_ebf_base](https://huggingface.co/espnet/owsm_v3.1_ebf_base) on the [DTU54DL/common-accent](https://huggingface.co/datasets/DTU54DL/common-accent) dataset.

	## Model details
	- Base model: espnet/owsm_v3.1_ebf_base
	- Language: English
	- Task: Automatic Speech Recognition

	## Usage
	```python
	import torch
	import numpy as np
	from espnet2.bin.s2t_inference import Speech2Text

	# Load the model
	model = Speech2Text.from_pretrained(
	    "reecursion/accent-adaptive-owsm_v3.1_ebf_base",
	    lang_sym="<eng>",
	    beam_size=1,
	    device="cuda" if torch.cuda.is_available() else "cpu",
	)

	# Example inference
	waveform = ...  # Load your audio as a 1-D NumPy array
	transcription = model(waveform)
	print(transcription[0][0]) # Print the transcription
	```
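
The usage example leaves the audio loading step open. As a minimal sketch, assuming the input is a 16-bit PCM WAV file, the helper below reads it with the standard-library `wave` module and returns a mono `float32` NumPy array scaled to [-1, 1]. The function name `load_wav_mono_float32` is hypothetical and not part of ESPnet. OWSM models are generally trained on 16 kHz audio, so check the returned sample rate and resample first if it differs.

```python
import wave

import numpy as np


def load_wav_mono_float32(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate).

    Hypothetical helper for illustration only; not part of ESPnet.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        sample_rate = wav.getframerate()
        n_channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())

    # Interpret the raw bytes as little-endian int16 and scale to [-1, 1).
    samples = np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0

    # Average interleaved channels down to mono if needed.
    if n_channels > 1:
        samples = samples.reshape(-1, n_channels).mean(axis=1)

    return samples, sample_rate
```

With a file on disk, `waveform, sample_rate = load_wav_mono_float32("speech.wav")` would produce an array that can be passed to `model(waveform)` in the example above. The path `speech.wav` is a placeholder. For other formats or for resampling, a dedicated audio library is a better fit than this sketch.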