2and3_apps_3k_v5

This model is a fine-tuned version of Qwen/Qwen2.5-7B on the 2and3_apps_3k_v5 dataset. On the evaluation set, the last logged validation loss is 0.1845 (step 2900, epoch 0.9925).

Model description

More information needed


The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 4
total_eval_batch_size: 4
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
num_epochs: 1
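
As a rough illustration of how these settings interact, the sketch below recomputes the total training batch size and evaluates a plain cosine decay at a few steps. It is not the training code. The absence of warmup and the figure of about 2922 optimizer steps per epoch are assumptions (the card lists no warmup setting, and the step count is only inferred from the logged epoch fractions). The names `total_train_batch_size` and `cosine_lr` are hypothetical helpers.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 1  # per device
NUM_DEVICES = 4

# Assumptions, not stated in the card: no gradient accumulation, no warmup,
# and roughly 2922 optimizer steps in the single epoch.
GRAD_ACCUM_STEPS = 1
TOTAL_STEPS = 2922


def total_train_batch_size(per_device: int, devices: int, grad_accum: int = 1) -> int:
    """Examples consumed per optimizer step across all devices."""
    return per_device * devices * grad_accum


def cosine_lr(step: int, total_steps: int, peak_lr: float) -> float:
    """Cosine decay from peak_lr at step 0 to 0 at total_steps (no warmup)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:",
          total_train_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES, GRAD_ACCUM_STEPS))
    for step in (0, 1000, 2000, 2900):
        print(step, f"{cosine_lr(step, TOTAL_STEPS, LEARNING_RATE):.2e}")
```

With one example per device on four devices, the computed total of 4 agrees with the reported total_train_batch_size, which is consistent with no gradient accumulation.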

Training Loss	Epoch	Step	Validation Loss
0.268	0.0342	100	0.2557
0.2918	0.0684	200	0.2401
0.2227	0.1027	300	0.2289
0.2936	0.1369	400	0.2240
0.2276	0.1711	500	0.2259
0.1976	0.2053	600	0.2234
0.1832	0.2396	700	0.2186
0.2043	0.2738	800	0.2150
0.2478	0.3080	900	0.2154
0.2061	0.3422	1000	0.2004
0.1902	0.3765	1100	0.2045
0.1922	0.4107	1200	0.2022
0.1925	0.4449	1300	0.2010
0.2042	0.4791	1400	0.2031
0.1845	0.5133	1500	0.2018
0.1964	0.5476	1600	0.2000
0.2325	0.5818	1700	0.1962
0.1876	0.6160	1800	0.1934
0.1937	0.6502	1900	0.1915
0.1982	0.6845	2000	0.1898
0.1876	0.7187	2100	0.1871
0.213	0.7529	2200	0.1860
0.2071	0.7871	2300	0.1862
0.1761	0.8214	2400	0.1861
0.1777	0.8556	2500	0.1858
0.1489	0.8898	2600	0.1850
0.1645	0.9240	2700	0.1845
0.1845	0.9582	2800	0.1842
0.2179	0.9925	2900	0.1845
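
The log above can be summarized with a few lines of Python. The sketch below copies the rows as text, finds the step with the lowest validation loss, and estimates the number of optimizer steps per epoch from the last row. The helper names (`parse_log`, `best_checkpoint`, `steps_per_epoch`) are hypothetical, and the steps-per-epoch figure is only an estimate because the logged epoch values are rounded to four decimals.

```python
from typing import NamedTuple

# Rows copied from the training log above:
# training loss, epoch, step, validation loss.
LOG = """
0.268 0.0342 100 0.2557
0.2918 0.0684 200 0.2401
0.2227 0.1027 300 0.2289
0.2936 0.1369 400 0.2240
0.2276 0.1711 500 0.2259
0.1976 0.2053 600 0.2234
0.1832 0.2396 700 0.2186
0.2043 0.2738 800 0.2150
0.2478 0.3080 900 0.2154
0.2061 0.3422 1000 0.2004
0.1902 0.3765 1100 0.2045
0.1922 0.4107 1200 0.2022
0.1925 0.4449 1300 0.2010
0.2042 0.4791 1400 0.2031
0.1845 0.5133 1500 0.2018
0.1964 0.5476 1600 0.2000
0.2325 0.5818 1700 0.1962
0.1876 0.6160 1800 0.1934
0.1937 0.6502 1900 0.1915
0.1982 0.6845 2000 0.1898
0.1876 0.7187 2100 0.1871
0.213 0.7529 2200 0.1860
0.2071 0.7871 2300 0.1862
0.1761 0.8214 2400 0.1861
0.1777 0.8556 2500 0.1858
0.1489 0.8898 2600 0.1850
0.1645 0.9240 2700 0.1845
0.1845 0.9582 2800 0.1842
0.2179 0.9925 2900 0.1845
"""


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_log(text: str) -> list[Row]:
    """Turn whitespace-separated log lines into typed rows."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append(Row(float(train_loss), float(epoch), int(step), float(val_loss)))
    return rows


def best_checkpoint(rows: list[Row]) -> Row:
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def steps_per_epoch(rows: list[Row]) -> float:
    """Estimate steps per epoch from the last logged (step, epoch) pair."""
    last = rows[-1]
    return last.step / last.epoch


ROWS = parse_log(LOG)
BEST = best_checkpoint(ROWS)
print(f"lowest validation loss {BEST.val_loss} at step {BEST.step}")  # 0.1842 at step 2800
print(f"estimated steps per epoch: {round(steps_per_epoch(ROWS))}")   # 2922
```

The validation loss falls from 0.2557 at step 100 to a minimum of 0.1842 at step 2800 and is essentially flat over the last 500 steps.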

Model size: 8B params (Safetensors)
Tensor type: BF16
Base model: Qwen/Qwen2.5-7B
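
Given BF16 safetensors weights fine-tuned from Qwen/Qwen2.5-7B, loading the checkpoint with the `transformers` library might look like the sketch below. The card does not give the full repository id or the prompt format used during fine-tuning, so `repo_id` and `prompt` are left as parameters, and the plain-text prompt (no chat template) is an assumption based on the base model not being an instruct variant. The function name `generate` is hypothetical.

```python
def generate(repo_id: str, prompt: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint in BF16 and return the decoded completion.

    repo_id is the Hub id or local path of this model; it is not stated
    in the card, so the caller must supply it.
    """
    # Imported lazily so the module can be read without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call would take the form `generate("<namespace>/2and3_apps_3k_v5", "<problem statement>")`, where the namespace is a placeholder. A model of this size in BF16 needs roughly 16 GB of memory for the weights alone.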
