w2vbert-ctc-salt

This model is a fine-tuned version of facebook/w2v-bert-2.0; the training dataset is not specified. At the final evaluation (step 15000) it achieves the following results on the evaluation set:

Loss: 0.3905
Cer: 0.3020
Wer: 0.0840

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 0.1
training_steps: 15000
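
To make the hyperparameters above more concrete, here is a minimal sketch of how the effective batch size and the learning-rate curve could be derived from them. It is not the training code for this model. Two assumptions are made: the value `0.1` listed for `lr_scheduler_warmup_steps` is read as a fraction of the total steps (1500 of 15000), and the schedule is a linear warmup followed by a half-cosine decay to zero. The function names and the single-device default are illustrative.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
TRAINING_STEPS = 15_000
# ASSUMPTION: "lr_scheduler_warmup_steps: 0.1" is treated as a fraction of
# the total steps, because 0.1 is not a whole number of steps.
WARMUP_FRACTION = 0.1


def effective_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Examples consumed per optimizer step (ASSUMPTION: one device)."""
    return per_device * accumulation * devices


def cosine_lr(
    step: int,
    peak_lr: float = LEARNING_RATE,
    total_steps: int = TRAINING_STEPS,
    warmup_fraction: float = WARMUP_FRACTION,
) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 750, 1500, 8250, 15000):
        print(f"step {step:>5}: lr = {cosine_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 1e-05 at step 1500 and falls to half that value midway through the decay phase, at step 8250.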

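| Training Loss | Epoch  | Step  | Cer    | Validation Loss | Wer    |
|---------------|--------|-------|--------|-----------------|--------|
| 5.9489        | 0.2076 | 1500  | 1.0    | 3.0230          | 1.0    |
| 1.5319        | 0.4152 | 3000  | 0.5960 | 0.5589          | 0.1293 |
| 1.1602        | 0.6228 | 4500  | 0.4309 | 0.4809          | 0.1054 |
| 1.0148        | 0.8304 | 6000  | 0.3715 | 0.4499          | 0.0974 |
| 0.9507        | 1.0381 | 7500  | 0.3443 | 0.4274          | 0.0927 |
| 0.9469        | 1.2457 | 9000  | 0.3220 | 0.4031          | 0.0876 |
| 0.8564        | 1.4533 | 10500 | 0.3134 | 0.3995          | 0.0864 |
| 0.8318        | 1.6609 | 12000 | 0.3061 | 0.3951          | 0.0848 |
| 0.8707        | 1.8685 | 13500 | 0.3033 | 0.3904          | 0.0841 |
| 0.9274        | 2.0761 | 15000 | 0.3020 | 0.3905          | 0.0840 |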
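
The Cer and Wer columns in the results above are character and word error rates. The card does not say which evaluation code produced them, so the following is only a sketch of the usual definition: the Levenshtein edit distance between reference and hypothesis, summed over the corpus and divided by the total reference length. The helper names `edit_distance` and `error_rate` are illustrative, and the example sentences are made up.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: fewest substitutions, deletions and insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: List[str], hyps: List[str], unit: str = "word") -> float:
    """Corpus-level error rate; unit is "word" (WER) or "char" (CER)."""
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        errors += edit_distance(ref_units, hyp_units)
        total += len(ref_units)
    if total == 0:
        raise ValueError("references contain no units to score")
    return errors / total


if __name__ == "__main__":
    references = ["the cat sat"]   # made-up example transcript
    hypotheses = ["the cat sit"]   # made-up model output
    print("WER:", error_rate(references, hypotheses, unit="word"))
    print("CER:", error_rate(references, hypotheses, unit="char"))
```

In this toy case one of three words is wrong and one of eleven characters is wrong, so the word-level rate is 1/3 and the character-level rate is 1/11.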

Format: Safetensors
Model size: 0.6B params
Tensor type: F32
