
	---
	library_name: transformers
	pipeline_tag: automatic-speech-recognition
	tags:
	- whisper
	- hqq
	- quantized
	- 4bit
	license: apache-2.0
	---

	# HQQ 4-bit Quantized Whisper Model

	This repository contains a 4-bit HQQ-quantized version of `eolang/whisperturbo`.

	## Model Details
	- Base Model: eolang/whisperturbo
	- Quantization: HQQ 4-bit, group_size=64
	- Compression: ~4x reduction in size relative to the unquantized base model (approximate)
	- Library: HQQ (Half-Quadratic Quantization)
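
To make the `group_size=64` and "~4x" figures above more concrete, here is a minimal NumPy sketch of group-wise 4-bit affine quantization. It is an illustration only: it uses plain round-to-nearest rather than HQQ's half-quadratic optimization, and the function names (`quantize_groups`, `dequantize_groups`, `effective_bits`) are hypothetical, not part of the `hqq` library. The size estimate assumes a 16-bit baseline and 16-bit per-group scale and offset, which may not match how this checkpoint is actually stored.

```python
import numpy as np


def quantize_groups(weights, nbits=4, group_size=64):
    """Round-to-nearest affine quantization per group (NOT HQQ's optimizer)."""
    w = np.asarray(weights, dtype=np.float32).reshape(-1, group_size)
    qmax = 2 ** nbits - 1                      # 15 for 4-bit codes
    w_min = w.min(axis=1, keepdims=True)       # per-group offset
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / qmax             # per-group step size
    scale[scale == 0] = 1.0                    # guard against constant groups
    codes = np.clip(np.round((w - w_min) / scale), 0, qmax).astype(np.uint8)
    return codes, scale, w_min


def dequantize_groups(codes, scale, offset, shape):
    """Map integer codes back to approximate float weights."""
    return (codes.astype(np.float32) * scale + offset).reshape(shape)


def effective_bits(nbits=4, group_size=64, meta_bits=16):
    """Bits per weight, counting one scale and one offset per group (assumed 16-bit)."""
    return nbits + 2 * meta_bits / group_size


rng = np.random.default_rng(0)
w = rng.standard_normal((128, 256)).astype(np.float32)  # stand-in weight matrix

codes, scale, offset = quantize_groups(w)
w_hat = dequantize_groups(codes, scale, offset, w.shape)

bits = effective_bits()
print(f"max abs reconstruction error: {np.abs(w - w_hat).max():.4f}")
print(f"effective bits per weight:    {bits}")
print(f"ratio vs. 16-bit weights:     {16 / bits:.2f}x")
```

Under these assumptions the per-group metadata adds 0.5 bits per weight, so the ratio against 16-bit weights comes out somewhat below 4x. The real on-disk ratio also depends on which layers are quantized and how the metadata is stored.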

	## Usage

	```python
	import torch
	from transformers import WhisperProcessor
	from hqq.models.hf.base import AutoHQQHFModel
	import librosa

	# Load quantized model
	model = AutoHQQHFModel.from_quantized("eolang/whisper-turbo-hqq-quantized")
	processor = WhisperProcessor.from_pretrained("eolang/whisper-turbo-hqq-quantized")

	# Load and process audio
	audio, sr = librosa.load("audio.wav", sr=16000)
	inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

	# Generate transcription
	with torch.no_grad():
	    predicted_ids = model.generate(inputs["input_features"])
	transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
	print(transcription[0])
	```

	## Requirements
	- `pip install git+https://github.com/mobiusml/hqq.git`
	- `pip install transformers librosa soundfile`
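
As a quick sanity check before running the usage snippet, the following sketch reports which of the packages it imports cannot be found in the current environment. The helper name `missing_packages` is hypothetical. The list includes `torch`, which the usage code imports even though it is not listed above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by the usage snippet, plus the soundfile backend for librosa.
required = ["torch", "transformers", "hqq", "librosa", "soundfile"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```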