csikasote/mms-1b-toigen-combined-model


	---
	library_name: transformers
	license: cc-by-nc-4.0
	base_model: facebook/mms-1b-all
	tags:
	- automatic-speech-recognition
	- toigen
	- mms
	- generated_from_trainer
	metrics:
	- wer
	model-index:
	- name: mms-1b-toigen-combined-model
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mms-1b-toigen-combined-model

	This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on the TOIGEN - TOI dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.3149
	- WER (word error rate): 0.3760
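
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition. The card does not state which implementation or text normalization produced the reported figure, so the function names and the whitespace tokenization here are assumptions for illustration only.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis is i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """WER for one utterance, tokenized on whitespace (an assumption)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    print(wer("the cat sat", "the dog sat"))
    print(wer("hello", "hello there my friend"))
```

Because insertions count as errors, the ratio can exceed 1 when the output contains many more words than the reference.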

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
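
Until the author documents the intended uses, one common way to try a fine-tuned speech recognition checkpoint is the `transformers` pipeline. The following is a hedged sketch rather than documented usage: the repository id is taken from the page header, the audio path is hypothetical, and the card does not say whether extra arguments (for example a target language) are needed for this checkpoint. Decoding an audio file from a path also requires `ffmpeg` to be installed.

```python
MODEL_ID = "csikasote/mms-1b-toigen-combined-model"  # repository id from the page header


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript for one audio file.

    transformers is imported lazily so this module can be imported
    without the library installed.
    """
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


if __name__ == "__main__":
    # "sample.wav" is a hypothetical local file.
    print(transcribe("sample.wav"))
```

Note that the model is released under cc-by-nc-4.0, so any such use is limited to non-commercial purposes.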

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0003
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 8
	- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 100
	- num_epochs: 30.0
	- mixed_precision_training: Native AMP
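
The hyperparameters above map fairly directly onto `transformers.TrainingArguments`. The sketch below is a reconstruction under stated assumptions, not the author's training script: it assumes a single device (so the batch size of 4 is per device), treats "Native AMP" as `fp16=True`, and uses a hypothetical output directory. `linear_lr` mirrors the usual linear warmup and decay rule. The total number of optimizer steps is not given in the card, so it is left as a parameter.

```python
# Values copied from the hyperparameter list; keys follow TrainingArguments names.
HYPERPARAMS = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 4,  # assumption: one device
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 30.0,
    "fp16": True,  # assumption: "Native AMP" taken to mean fp16 autocast
}


def effective_batch_size(hp, num_devices=1):
    """Samples consumed per optimizer step."""
    return (hp["per_device_train_batch_size"]
            * hp["gradient_accumulation_steps"]
            * num_devices)


def linear_lr(step, total_steps, hp=HYPERPARAMS):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    warmup = hp["warmup_steps"]
    peak = hp["learning_rate"]
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total_steps - step) / max(1, total_steps - warmup))


def build_training_args(output_dir="./mms-toigen-run"):
    """Build TrainingArguments; output_dir is a hypothetical path.

    Imported lazily; fp16=True generally needs a GPU to construct.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


if __name__ == "__main__":
    print(effective_batch_size(HYPERPARAMS))
```

With these values, 4 samples per device times 2 accumulation steps gives the total train batch size of 8 listed above.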

	### Training results

	| Training Loss | Epoch   | Step | Validation Loss | WER    |
	|:-------------:|:-------:|:----:|:---------------:|:------:|
	| 15.204        | 0.4474  | 100  | 3.5867          | 1.0672 |
	| 4.2355        | 0.8949  | 200  | 0.5745          | 0.5648 |
	| 1.4309        | 1.3400  | 300  | 0.4451          | 0.5084 |
	| 1.1797        | 1.7875  | 400  | 0.4035          | 0.4828 |
	| 1.1218        | 2.2327  | 500  | 0.3912          | 0.4663 |
	| 1.0287        | 2.6801  | 600  | 0.3838          | 0.4552 |
	| 0.9773        | 3.1253  | 700  | 0.3751          | 0.4481 |
	| 1.038         | 3.5727  | 800  | 0.3665          | 0.4421 |
	| 0.9878        | 4.0179  | 900  | 0.3571          | 0.4356 |
	| 0.9888        | 4.4653  | 1000 | 0.3510          | 0.4359 |
	| 0.8904        | 4.9128  | 1100 | 0.3498          | 0.4172 |
	| 0.8178        | 5.3579  | 1200 | 0.3456          | 0.4152 |
	| 0.9608        | 5.8054  | 1300 | 0.3384          | 0.4184 |
	| 0.9166        | 6.2506  | 1400 | 0.3416          | 0.4099 |
	| 0.8623        | 6.6980  | 1500 | 0.3351          | 0.4034 |
	| 0.823         | 7.1432  | 1600 | 0.3306          | 0.3977 |
	| 0.8495        | 7.5906  | 1700 | 0.3321          | 0.3937 |
	| 0.8691        | 8.0358  | 1800 | 0.3244          | 0.3986 |
	| 0.8225        | 8.4832  | 1900 | 0.3261          | 0.3956 |
	| 0.8193        | 8.9306  | 2000 | 0.3224          | 0.3921 |
	| 0.79          | 9.3758  | 2100 | 0.3181          | 0.3884 |
	| 0.8035        | 9.8233  | 2200 | 0.3272          | 0.3887 |
	| 0.8391        | 10.2685 | 2300 | 0.3177          | 0.3894 |
	| 0.8055        | 10.7159 | 2400 | 0.3255          | 0.3790 |
	| 0.7124        | 11.1611 | 2500 | 0.3137          | 0.3912 |
	| 0.7747        | 11.6085 | 2600 | 0.3264          | 0.3850 |
	| 0.795         | 12.0537 | 2700 | 0.3150          | 0.3852 |
	| 0.7749        | 12.5011 | 2800 | 0.3177          | 0.3806 |
	| 0.7364        | 12.9485 | 2900 | 0.3150          | 0.3762 |
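
The logged evaluations can be scanned programmatically to see where each metric bottoms out. The sketch below copies the step, validation loss, and WER columns from the table and picks the row with the lowest value of each. The variable names are illustrative and not part of any training script. In this log the lowest validation loss (0.3137) falls at step 2500, while the lowest WER (0.3762) falls at the last logged step, 2900.

```python
# (step, validation_loss, wer) copied from the training results table.
ROWS = [
    (100, 3.5867, 1.0672), (200, 0.5745, 0.5648), (300, 0.4451, 0.5084),
    (400, 0.4035, 0.4828), (500, 0.3912, 0.4663), (600, 0.3838, 0.4552),
    (700, 0.3751, 0.4481), (800, 0.3665, 0.4421), (900, 0.3571, 0.4356),
    (1000, 0.3510, 0.4359), (1100, 0.3498, 0.4172), (1200, 0.3456, 0.4152),
    (1300, 0.3384, 0.4184), (1400, 0.3416, 0.4099), (1500, 0.3351, 0.4034),
    (1600, 0.3306, 0.3977), (1700, 0.3321, 0.3937), (1800, 0.3244, 0.3986),
    (1900, 0.3261, 0.3956), (2000, 0.3224, 0.3921), (2100, 0.3181, 0.3884),
    (2200, 0.3272, 0.3887), (2300, 0.3177, 0.3894), (2400, 0.3255, 0.3790),
    (2500, 0.3137, 0.3912), (2600, 0.3264, 0.3850), (2700, 0.3150, 0.3852),
    (2800, 0.3177, 0.3806), (2900, 0.3150, 0.3762),
]

# Row with the lowest validation loss and row with the lowest WER.
best_loss = min(ROWS, key=lambda row: row[1])
best_wer = min(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("lowest WER:", best_wer)
```

The two minima do not coincide, which is worth keeping in mind when choosing a checkpoint by loss versus by WER.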


	### Framework versions

	- Transformers 4.47.1
	- PyTorch 2.5.1+cu124
	- Datasets 3.2.0
	- Tokenizers 0.21.0
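
To check a local environment against the versions listed above, something like the following sketch can be used. It relies only on the standard library's `importlib.metadata`. The assumption that the distribution names are `transformers`, `torch`, `datasets`, and `tokenizers`, and the choice to ignore the local build suffix (`+cu124`) when comparing, are illustrative and not requirements stated by the card.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list.
PINNED = {
    "transformers": "4.47.1",
    "torch": "2.5.1+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}


def base_version(version):
    """Drop a local build suffix such as '+cu124'."""
    return version.split("+", 1)[0]


def compare_versions(pinned=PINNED):
    """Map each package to (wanted, installed or None, matches)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        matches = (installed is not None
                   and base_version(installed) == base_version(wanted))
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        print(f"{name}: wanted {wanted}, installed {installed}, match={ok}")
```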