2025-08-29 22:50:26 - pico-train - INFO - Step 20000 -- 📊 Evaluation Results
2025-08-29 22:50:26 - pico-train - INFO - └── paloma: 1.8399778163273925e+24
2025-08-29 22:50:26 - pico-train - INFO - ==================================================
2025-08-29 22:50:26 - pico-train - INFO - ✨ Training Configuration
2025-08-29 22:50:26 - pico-train - INFO - ==================================================
2025-08-29 22:50:26 - pico-train - INFO - ╭─────────────────────────────────────────────────────╮
2025-08-29 22:50:26 - pico-train - INFO - │ checkpointing:                                      │
2025-08-29 22:50:26 - pico-train - INFO - │   checkpoints_dir: checkpoints                      │
2025-08-29 22:50:26 - pico-train - INFO - │   evaluation:                                       │
2025-08-29 22:50:26 - pico-train - INFO - │     eval_results_dir: eval_results                  │
2025-08-29 22:50:26 - pico-train - INFO - │   fabric_checkpoint_dir: fabric_state               │
2025-08-29 22:50:26 - pico-train - INFO - │   fabric_checkpoint_filename: checkpoint.pt         │
2025-08-29 22:50:26 - pico-train - INFO - │   hf_checkpoint:                                    │
2025-08-29 22:50:26 - pico-train - INFO - │     collection_slug: null                           │
2025-08-29 22:50:26 - pico-train - INFO - │     repo_id: ThomasTheMaker/pico-decoder-tiny       │
2025-08-29 22:50:26 - pico-train - INFO - │   learning_dynamics:                                │
2025-08-29 22:50:26 - pico-train - INFO - │     batch_size: 1                                   │
2025-08-29 22:50:26 - pico-train - INFO - │     eval_data: null                                 │
2025-08-29 22:50:26 - pico-train - INFO - │     layer_suffixes:                                 │
2025-08-29 22:50:26 - pico-train - INFO - │     - attention.v_proj                              │
2025-08-29 22:50:26 - pico-train - INFO - │     - attention.o_proj                              │
2025-08-29 22:50:26 - pico-train - INFO - │     - swiglu.w_2                                    │
2025-08-29 22:50:26 - pico-train - INFO - │     sequence_idx: -1                                │
2025-08-29 22:50:26 - pico-train - INFO - │   learning_dynamics_dir: learning_dynamics          │
2025-08-29 22:50:26 - pico-train - INFO - │   logs_dir: logs                                    │
2025-08-29 22:50:26 - pico-train - INFO - │   run_name: pico-decoder-tiny-dolma5M-v1            │
2025-08-29 22:50:26 - pico-train - INFO - │   runs_dir: runs                                    │
2025-08-29 22:50:26 - pico-train - INFO - │   save_every_n_steps: 500                           │
2025-08-29 22:50:26 - pico-train - INFO - │   save_to_hf: true                                  │
2025-08-29 22:50:26 - pico-train - INFO - │   training:                                         │
2025-08-29 22:50:26 - pico-train - INFO - │     auto_resume: true                               │
2025-08-29 22:50:26 - pico-train - INFO - │ data:                                               │
2025-08-29 22:50:26 - pico-train - INFO - │   dataloader:                                       │
2025-08-29 22:50:26 - pico-train - INFO - │     batch_size: 4                                   │
2025-08-29 22:50:26 - pico-train - INFO - │   dataset:                                          │
2025-08-29 22:50:26 - pico-train - INFO - │     name: ThomasTheMaker/pretokenized-dolma-5M      │
2025-08-29 22:50:26 - pico-train - INFO - │   tokenizer:                                        │
2025-08-29 22:50:26 - pico-train - INFO - │     name: allenai/OLMo-7B-0724-hf                   │
2025-08-29 22:50:26 - pico-train - INFO - │     vocab_size: 50304                               │
2025-08-29 22:50:26 - pico-train - INFO - │ evaluation:                                         │
2025-08-29 22:50:26 - pico-train - INFO - │   metrics:                                          │
2025-08-29 22:50:26 - pico-train - INFO - │   - paloma                                          │
2025-08-29 22:50:26 - pico-train - INFO - │   paloma:                                           │
2025-08-29 22:50:26 - pico-train - INFO - │     batch_size: 1                                   │
2025-08-29 22:50:26 - pico-train - INFO - │     dataset_name: pico-lm/pretokenized-paloma-tinsy │
2025-08-29 22:50:26 - pico-train - INFO - │     dataset_split: val                              │
2025-08-29 22:50:26 - pico-train - INFO - │     max_length: 2048                                │
2025-08-29 22:50:26 - pico-train - INFO - │ model:                                              │
2025-08-29 22:50:26 - pico-train - INFO - │   activation_hidden_dim: 384                        │
2025-08-29 22:50:26 - pico-train - INFO - │   attention_n_heads: 12                             │
2025-08-29 22:50:26 - pico-train - INFO - │   attention_n_kv_heads: 4                           │
2025-08-29 22:50:26 - pico-train - INFO - │   batch_size: 1024                                  │
2025-08-29 22:50:26 - pico-train - INFO - │   d_model: 96                                       │
2025-08-29 22:50:26 - pico-train - INFO - │   max_seq_len: 2048                                 │
2025-08-29 22:50:26 - pico-train - INFO - │   model_type: pico_decoder                          │
2025-08-29 22:50:26 - pico-train - INFO - │   n_layers: 12                                      │
2025-08-29 22:50:26 - pico-train - INFO - │   norm_eps: 1.0e-06                                 │
2025-08-29 22:50:26 - pico-train - INFO - │   position_emb_theta: 10000.0                       │
2025-08-29 22:50:26 - pico-train - INFO - │   vocab_size: 50304                                 │
2025-08-29 22:50:26 - pico-train - INFO - │ monitoring:                                         │
2025-08-29 22:50:26 - pico-train - INFO - │   logging:                                          │
2025-08-29 22:50:26 - pico-train - INFO - │     log_every_n_steps: 25                           │
2025-08-29 22:50:26 - pico-train - INFO - │     log_level: INFO                                 │
2025-08-29 22:50:26 - pico-train - INFO - │   save_to_wandb: false                              │
2025-08-29 22:50:26 - pico-train - INFO - │   wandb:                                            │
2025-08-29 22:50:26 - pico-train - INFO - │     entity: boymyc                                  │
2025-08-29 22:50:26 - pico-train - INFO - │     project: pico-decoder-tiny                      │
2025-08-29 22:50:26 - pico-train - INFO - │ training:                                           │
2025-08-29 22:50:26 - pico-train - INFO - │   fabric:                                           │
2025-08-29 22:50:26 - pico-train - INFO - │     accelerator: cuda                               │
2025-08-29 22:50:26 - pico-train - INFO - │     num_devices: 1                                  │
2025-08-29 22:50:26 - pico-train - INFO - │     num_nodes: 1                                    │
2025-08-29 22:50:26 - pico-train - INFO - │     precision: bf16-mixed                           │
2025-08-29 22:50:26 - pico-train - INFO - │   max_steps: 20000                                  │
2025-08-29 22:50:26 - pico-train - INFO - │   optimization:                                     │
2025-08-29 22:50:26 - pico-train - INFO - │     gradient_accumulation_steps: 4                  │
2025-08-29 22:50:26 - pico-train - INFO - │     lr: 5.0e-05                                     │
2025-08-29 22:50:26 - pico-train - INFO - │     lr_scheduler: cosine                            │
2025-08-29 22:50:26 - pico-train - INFO - │     lr_warmup_steps: 8000                           │
2025-08-29 22:50:26 - pico-train - INFO - │     optimizer: adamw                                │
2025-08-29 22:50:26 - pico-train - INFO - │                                                     │
2025-08-29 22:50:26 - pico-train - INFO - ╰─────────────────────────────────────────────────────╯
2025-08-29 22:50:26 - pico-train - INFO - ==================================================
2025-08-29 22:50:26 - pico-train - INFO - ⭐ Runtime Summary:
2025-08-29 22:50:26 - pico-train - INFO - ==================================================
2025-08-29 22:50:26 - pico-train - INFO - Starting from step: 20000
2025-08-29 22:50:26 - pico-train - INFO - Model Setup:
2025-08-29 22:50:26 - pico-train - INFO - └─ Total Parameters: 11,282,784
2025-08-29 22:50:26 - pico-train - INFO - └─ Trainable Parameters: 11,282,784
2025-08-29 22:50:26 - pico-train - INFO - Distributed Setup:
2025-08-29 22:50:26 - pico-train - INFO - └─ Number of Devices: 1
2025-08-29 22:50:26 - pico-train - INFO - └─ Device Type: NVIDIA GeForce RTX 5090
2025-08-29 22:50:26 - pico-train - INFO - └─ Available Memory: 33.68 GB
2025-08-29 22:50:26 - pico-train - INFO - Software Setup:
2025-08-29 22:50:26 - pico-train - INFO - └─ Python Version: 3.10.12
2025-08-29 22:50:26 - pico-train - INFO - └─ PyTorch Version: 2.8.0+cu128
2025-08-29 22:50:26 - pico-train - INFO - └─ CUDA Version: 12.8
2025-08-29 22:50:26 - pico-train - INFO - └─ Operating System: Linux 6.8.0-63-generic
2025-08-29 22:50:26 - pico-train - INFO - Batch Size Configuration:
2025-08-29 22:50:26 - pico-train - INFO - └─ Global Batch Size: 4
2025-08-29 22:50:26 - pico-train - INFO - └─ Per Device Batch Size: 1
2025-08-29 22:50:26 - pico-train - INFO - └─ Gradient Accumulation Steps: 4
2025-08-29 22:50:26 - pico-train - INFO - ==================================================
2025-08-29 22:50:27 - pico-train - INFO - Step 20000 -- 🔄 Training Metrics
2025-08-29 22:50:27 - pico-train - INFO - ├── Loss: 6.5103
2025-08-29 22:50:27 - pico-train - INFO - ├── Learning Rate: 5.00e-06
2025-08-29 22:50:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:50:27 - pico-train - INFO - Step 20000 -- 📈 Saving Learning Dynamics
2025-08-29 22:50:43 - pico-train - INFO - Step 20025 -- 🔄 Training Metrics
2025-08-29 22:50:43 - pico-train - INFO - ├── Loss: 6.4274
2025-08-29 22:50:43 - pico-train - INFO - ├── Learning Rate: 3.45e-05
2025-08-29 22:50:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:50:55 - pico-train - INFO - Step 20050 -- 🔄 Training Metrics
2025-08-29 22:50:55 - pico-train - INFO - ├── Loss: 6.3770
2025-08-29 22:50:55 - pico-train - INFO - ├── Learning Rate: 3.45e-05
2025-08-29 22:50:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:51:08 - pico-train - INFO - Step 20075 -- 🔄 Training Metrics
2025-08-29 22:51:08 - pico-train - INFO - ├── Loss: 6.2797
2025-08-29 22:51:08 - pico-train - INFO - ├── Learning Rate: 3.44e-05
2025-08-29 22:51:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:51:21 - pico-train - INFO - Step 20100 -- 🔄 Training Metrics
2025-08-29 22:51:21 - pico-train - INFO - ├── Loss: 6.3924
2025-08-29 22:51:21 - pico-train - INFO - ├── Learning Rate: 3.43e-05
2025-08-29 22:51:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:51:34 - pico-train - INFO - Step 20125 -- 🔄 Training Metrics
2025-08-29 22:51:34 - pico-train - INFO - ├── Loss: 6.4442
2025-08-29 22:51:34 - pico-train - INFO - ├── Learning Rate: 3.43e-05
2025-08-29 22:51:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:51:47 - pico-train - INFO - Step 20150 -- 🔄 Training Metrics
2025-08-29 22:51:47 - pico-train - INFO - ├── Loss: 6.3881
2025-08-29 22:51:47 - pico-train - INFO - ├── Learning Rate: 3.42e-05
2025-08-29 22:51:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:52:00 - pico-train - INFO - Step 20175 -- 🔄 Training Metrics
2025-08-29 22:52:00 - pico-train - INFO - ├── Loss: 6.4008
2025-08-29 22:52:00 - pico-train - INFO - ├── Learning Rate: 3.42e-05
2025-08-29 22:52:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:52:12 - pico-train - INFO - Step 20200 -- 🔄 Training Metrics
2025-08-29 22:52:12 - pico-train - INFO - ├── Loss: 6.4257
2025-08-29 22:52:12 - pico-train - INFO - ├── Learning Rate: 3.41e-05
2025-08-29 22:52:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:52:25 - pico-train - INFO - Step 20225 -- 🔄 Training Metrics
2025-08-29 22:52:25 - pico-train - INFO - ├── Loss: 6.4125
2025-08-29 22:52:25 - pico-train - INFO - ├── Learning Rate: 3.41e-05
2025-08-29 22:52:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:52:38 - pico-train - INFO - Step 20250 -- 🔄 Training Metrics
2025-08-29 22:52:38 - pico-train - INFO - ├── Loss: 6.3390
2025-08-29 22:52:38 - pico-train - INFO - ├── Learning Rate: 3.40e-05
2025-08-29 22:52:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:52:50 - pico-train - INFO - Step 20275 -- 🔄 Training Metrics
2025-08-29 22:52:50 - pico-train - INFO - ├── Loss: 6.3328
2025-08-29 22:52:50 - pico-train - INFO - ├── Learning Rate: 3.39e-05
2025-08-29 22:52:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:53:03 - pico-train - INFO - Step 20300 -- 🔄 Training Metrics
2025-08-29 22:53:03 - pico-train - INFO - ├── Loss: 6.3035
2025-08-29 22:53:03 - pico-train - INFO - ├── Learning Rate: 3.39e-05
2025-08-29 22:53:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:53:16 - pico-train - INFO - Step 20325 -- 🔄 Training Metrics
2025-08-29 22:53:16 - pico-train - INFO - ├── Loss: 6.2862
2025-08-29 22:53:16 - pico-train - INFO - ├── Learning Rate: 3.38e-05
2025-08-29 22:53:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:53:28 - pico-train - INFO - Step 20350 -- 🔄 Training Metrics
2025-08-29 22:53:28 - pico-train - INFO - ├── Loss: 6.4249
2025-08-29 22:53:28 - pico-train - INFO - ├── Learning Rate: 3.38e-05
2025-08-29 22:53:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:53:41 - pico-train - INFO - Step 20375 -- 🔄 Training Metrics
2025-08-29 22:53:41 - pico-train - INFO - ├── Loss: 6.3582
2025-08-29 22:53:41 - pico-train - INFO - ├── Learning Rate: 3.37e-05
2025-08-29 22:53:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:53:54 - pico-train - INFO - Step 20400 -- 🔄 Training Metrics
2025-08-29 22:53:54 - pico-train - INFO - ├── Loss: 6.3195
2025-08-29 22:53:54 - pico-train - INFO - ├── Learning Rate: 3.37e-05
2025-08-29 22:53:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:54:07 - pico-train - INFO - Step 20425 -- 🔄 Training Metrics
2025-08-29 22:54:07 - pico-train - INFO - ├── Loss: 6.4802
2025-08-29 22:54:07 - pico-train - INFO - ├── Learning Rate: 3.36e-05
2025-08-29 22:54:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:54:22 - pico-train - INFO - Step 20450 -- 🔄 Training Metrics
2025-08-29 22:54:22 - pico-train - INFO - ├── Loss: 6.3126
2025-08-29 22:54:22 - pico-train - INFO - ├── Learning Rate: 3.35e-05
2025-08-29 22:54:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:54:35 - pico-train - INFO - Step 20475 -- 🔄 Training Metrics
2025-08-29 22:54:35 - pico-train - INFO - ├── Loss: 6.4323
2025-08-29 22:54:35 - pico-train - INFO - ├── Learning Rate: 3.35e-05
2025-08-29 22:54:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:54:50 - pico-train - INFO - Step 20500 -- 💾 Saving Checkpoint
2025-08-29 22:59:37 - pico-train - INFO - Step 20500 -- 📊 Evaluation Results
2025-08-29 22:59:37 - pico-train - INFO - └── paloma: 4.281028602870165e+24
2025-08-29 22:59:42 - pico-train - INFO - Step 20500 -- 🔄 Training Metrics
2025-08-29 22:59:42 - pico-train - INFO - ├── Loss: 6.4138
2025-08-29 22:59:42 - pico-train - INFO - ├── Learning Rate: 3.34e-05
2025-08-29 22:59:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 22:59:42 - pico-train - INFO - Step 20500 -- 📈 Saving Learning Dynamics
2025-08-29 23:00:21 - pico-train - INFO - Step 20525 -- 🔄 Training Metrics
2025-08-29 23:00:21 - pico-train - INFO - ├── Loss: 6.3971
2025-08-29 23:00:21 - pico-train - INFO - ├── Learning Rate: 3.34e-05
2025-08-29 23:00:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:00:53 - pico-train - INFO - Step 20550 -- 🔄 Training Metrics
2025-08-29 23:00:53 - pico-train - INFO - ├── Loss: 6.3632
2025-08-29 23:00:53 - pico-train - INFO - ├── Learning Rate: 3.33e-05
2025-08-29 23:00:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:01:27 - pico-train - INFO - Step 20575 -- 🔄 Training Metrics
2025-08-29 23:01:27 - pico-train - INFO - ├── Loss: 6.4202
2025-08-29 23:01:27 - pico-train - INFO - ├── Learning Rate: 3.32e-05
2025-08-29 23:01:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:02:01 - pico-train - INFO - Step 20600 -- 🔄 Training Metrics
2025-08-29 23:02:01 - pico-train - INFO - ├── Loss: 6.4792
2025-08-29 23:02:01 - pico-train - INFO - ├── Learning Rate: 3.32e-05
2025-08-29 23:02:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:02:34 - pico-train - INFO - Step 20625 -- 🔄 Training Metrics
2025-08-29 23:02:34 - pico-train - INFO - ├── Loss: 6.3213
2025-08-29 23:02:34 - pico-train - INFO - ├── Learning Rate: 3.31e-05
2025-08-29 23:02:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:03:09 - pico-train - INFO - Step 20650 -- 🔄 Training Metrics
2025-08-29 23:03:09 - pico-train - INFO - ├── Loss: 6.4173
2025-08-29 23:03:09 - pico-train - INFO - ├── Learning Rate: 3.31e-05
2025-08-29 23:03:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:03:43 - pico-train - INFO - Step 20675 -- 🔄 Training Metrics
2025-08-29 23:03:43 - pico-train - INFO - ├── Loss: 6.4062
2025-08-29 23:03:43 - pico-train - INFO - ├── Learning Rate: 3.30e-05
2025-08-29 23:03:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:04:19 - pico-train - INFO - Step 20700 -- 🔄 Training Metrics
2025-08-29 23:04:19 - pico-train - INFO - ├── Loss: 6.3742
2025-08-29 23:04:19 - pico-train - INFO - ├── Learning Rate: 3.30e-05
2025-08-29 23:04:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:04:56 - pico-train - INFO - Step 20725 -- 🔄 Training Metrics
2025-08-29 23:04:56 - pico-train - INFO - ├── Loss: 6.3820
2025-08-29 23:04:56 - pico-train - INFO - ├── Learning Rate: 3.29e-05
2025-08-29 23:04:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:05:17 - pico-train - INFO - Step 20750 -- 🔄 Training Metrics
2025-08-29 23:05:17 - pico-train - INFO - ├── Loss: 6.3374
2025-08-29 23:05:17 - pico-train - INFO - ├── Learning Rate: 3.28e-05
2025-08-29 23:05:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:05:30 - pico-train - INFO - Step 20775 -- 🔄 Training Metrics
2025-08-29 23:05:30 - pico-train - INFO - ├── Loss: 6.4028
2025-08-29 23:05:30 - pico-train - INFO - ├── Learning Rate: 3.28e-05
2025-08-29 23:05:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:05:43 - pico-train - INFO - Step 20800 -- 🔄 Training Metrics
2025-08-29 23:05:43 - pico-train - INFO - ├── Loss: 6.3732
2025-08-29 23:05:43 - pico-train - INFO - ├── Learning Rate: 3.27e-05
2025-08-29 23:05:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:05:55 - pico-train - INFO - Step 20825 -- 🔄 Training Metrics
2025-08-29 23:05:55 - pico-train - INFO - ├── Loss: 6.3486
2025-08-29 23:05:55 - pico-train - INFO - ├── Learning Rate: 3.27e-05
2025-08-29 23:05:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:06:08 - pico-train - INFO - Step 20850 -- 🔄 Training Metrics
2025-08-29 23:06:08 - pico-train - INFO - ├── Loss: 6.3611
2025-08-29 23:06:08 - pico-train - INFO - ├── Learning Rate: 3.26e-05
2025-08-29 23:06:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:06:21 - pico-train - INFO - Step 20875 -- 🔄 Training Metrics
2025-08-29 23:06:21 - pico-train - INFO - ├── Loss: 6.3278
2025-08-29 23:06:21 - pico-train - INFO - ├── Learning Rate: 3.26e-05
2025-08-29 23:06:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:06:33 - pico-train - INFO - Step 20900 -- 🔄 Training Metrics
2025-08-29 23:06:33 - pico-train - INFO - ├── Loss: 6.3287
2025-08-29 23:06:33 - pico-train - INFO - ├── Learning Rate: 3.25e-05
2025-08-29 23:06:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:06:46 - pico-train - INFO - Step 20925 -- 🔄 Training Metrics
2025-08-29 23:06:46 - pico-train - INFO - ├── Loss: 6.3276
2025-08-29 23:06:46 - pico-train - INFO - ├── Learning Rate: 3.24e-05
2025-08-29 23:06:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:06:58 - pico-train - INFO - Step 20950 -- 🔄 Training Metrics
2025-08-29 23:06:58 - pico-train - INFO - ├── Loss: 6.4450
2025-08-29 23:06:58 - pico-train - INFO - ├── Learning Rate: 3.24e-05
2025-08-29 23:06:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:07:11 - pico-train - INFO - Step 20975 -- 🔄 Training Metrics
2025-08-29 23:07:11 - pico-train - INFO - ├── Loss: 6.4429
2025-08-29 23:07:11 - pico-train - INFO - ├── Learning Rate: 3.23e-05
2025-08-29 23:07:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:07:23 - pico-train - INFO - Step 21000 -- 💾 Saving Checkpoint
2025-08-29 23:09:25 - pico-train - INFO - Step 21000 -- 📊 Evaluation Results
2025-08-29 23:09:25 - pico-train - INFO - └── paloma: 3.816115022517074e+24
2025-08-29 23:09:28 - pico-train - INFO - Step 21000 -- 🔄 Training Metrics
2025-08-29 23:09:28 - pico-train - INFO - ├── Loss: 6.2970
2025-08-29 23:09:28 - pico-train - INFO - ├── Learning Rate: 3.23e-05
2025-08-29 23:09:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:09:28 - pico-train - INFO - Step 21000 -- 📈 Saving Learning Dynamics
2025-08-29 23:09:43 - pico-train - INFO - Step 21025 -- 🔄 Training Metrics
2025-08-29 23:09:43 - pico-train - INFO - ├── Loss: 6.3206
2025-08-29 23:09:43 - pico-train - INFO - ├── Learning Rate: 3.22e-05
2025-08-29 23:09:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:09:56 - pico-train - INFO - Step 21050 -- 🔄 Training Metrics
2025-08-29 23:09:56 - pico-train - INFO - ├── Loss: 6.3337
2025-08-29 23:09:56 - pico-train - INFO - ├── Learning Rate: 3.21e-05
2025-08-29 23:09:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:10:08 - pico-train - INFO - Step 21075 -- 🔄 Training Metrics
2025-08-29 23:10:08 - pico-train - INFO - ├── Loss: 6.3274
2025-08-29 23:10:08 - pico-train - INFO - ├── Learning Rate: 3.21e-05
2025-08-29 23:10:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:10:21 - pico-train - INFO - Step 21100 -- 🔄 Training Metrics
2025-08-29 23:10:21 - pico-train - INFO - ├── Loss: 6.4202
2025-08-29 23:10:21 - pico-train - INFO - ├── Learning Rate: 3.20e-05
2025-08-29 23:10:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:10:33 - pico-train - INFO - Step 21125 -- 🔄 Training Metrics
2025-08-29 23:10:33 - pico-train - INFO - ├── Loss: 6.3698
2025-08-29 23:10:33 - pico-train - INFO - ├── Learning Rate: 3.20e-05
2025-08-29 23:10:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:10:46 - pico-train - INFO - Step 21150 -- 🔄 Training Metrics
2025-08-29 23:10:46 - pico-train - INFO - ├── Loss: 6.2671
2025-08-29 23:10:46 - pico-train - INFO - ├── Learning Rate: 3.19e-05
2025-08-29 23:10:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:10:59 - pico-train - INFO - Step 21175 -- 🔄 Training Metrics
2025-08-29 23:10:59 - pico-train - INFO - ├── Loss: 6.4334
2025-08-29 23:10:59 - pico-train - INFO - ├── Learning Rate: 3.18e-05
2025-08-29 23:10:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:11:11 - pico-train - INFO - Step 21200 -- 🔄 Training Metrics
2025-08-29 23:11:11 - pico-train - INFO - ├── Loss: 6.4208
2025-08-29 23:11:11 - pico-train - INFO - ├── Learning Rate: 3.18e-05
2025-08-29 23:11:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:11:24 - pico-train - INFO - Step 21225 -- 🔄 Training Metrics
2025-08-29 23:11:24 - pico-train - INFO - ├── Loss: 6.3380
2025-08-29 23:11:24 - pico-train - INFO - ├── Learning Rate: 3.17e-05
2025-08-29 23:11:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:11:37 - pico-train - INFO - Step 21250 -- 🔄 Training Metrics
2025-08-29 23:11:37 - pico-train - INFO - ├── Loss: 6.3026
2025-08-29 23:11:37 - pico-train - INFO - ├── Learning Rate: 3.17e-05
2025-08-29 23:11:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:11:49 - pico-train - INFO - Step 21275 -- 🔄 Training Metrics
2025-08-29 23:11:49 - pico-train - INFO - ├── Loss: 6.3123
2025-08-29 23:11:49 - pico-train - INFO - ├── Learning Rate: 3.16e-05
2025-08-29 23:11:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:12:02 - pico-train - INFO - Step 21300 -- 🔄 Training Metrics
2025-08-29 23:12:02 - pico-train - INFO - ├── Loss: 6.2566
2025-08-29 23:12:02 - pico-train - INFO - ├── Learning Rate: 3.15e-05
2025-08-29 23:12:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:12:15 - pico-train - INFO - Step 21325 -- 🔄 Training Metrics
2025-08-29 23:12:15 - pico-train - INFO - ├── Loss: 6.2697
2025-08-29 23:12:15 - pico-train - INFO - ├── Learning Rate: 3.15e-05
2025-08-29 23:12:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:12:27 - pico-train - INFO - Step 21350 -- 🔄 Training Metrics
2025-08-29 23:12:27 - pico-train - INFO - ├── Loss: 6.2998
2025-08-29 23:12:27 - pico-train - INFO - ├── Learning Rate: 3.14e-05
2025-08-29 23:12:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:12:40 - pico-train - INFO - Step 21375 -- 🔄 Training Metrics
2025-08-29 23:12:40 - pico-train - INFO - ├── Loss: 6.3903
2025-08-29 23:12:40 - pico-train - INFO - ├── Learning Rate: 3.14e-05
2025-08-29 23:12:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:12:52 - pico-train - INFO - Step 21400 -- 🔄 Training Metrics
2025-08-29 23:12:52 - pico-train - INFO - ├── Loss: 6.2831
2025-08-29 23:12:52 - pico-train - INFO - ├── Learning Rate: 3.13e-05
2025-08-29 23:12:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:13:05 - pico-train - INFO - Step 21425 -- 🔄 Training Metrics
2025-08-29 23:13:05 - pico-train - INFO - ├── Loss: 6.3768
2025-08-29 23:13:05 - pico-train - INFO - ├── Learning Rate: 3.13e-05
2025-08-29 23:13:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:13:18 - pico-train - INFO - Step 21450 -- 🔄 Training Metrics
2025-08-29 23:13:18 - pico-train - INFO - ├── Loss: 6.3917
2025-08-29 23:13:18 - pico-train - INFO - ├── Learning Rate: 3.12e-05
2025-08-29 23:13:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:13:30 - pico-train - INFO - Step 21475 -- 🔄 Training Metrics
2025-08-29 23:13:30 - pico-train - INFO - ├── Loss: 6.3183
2025-08-29 23:13:30 - pico-train - INFO - ├── Learning Rate: 3.11e-05
2025-08-29 23:13:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:13:43 - pico-train - INFO - Step 21500 -- 💾 Saving Checkpoint
2025-08-29 23:15:44 - pico-train - INFO - Step 21500 -- 📊 Evaluation Results
2025-08-29 23:15:44 - pico-train - INFO - └── paloma: 6.18596463935147e+24
2025-08-29 23:15:47 - pico-train - INFO - Step 21500 -- 🔄 Training Metrics
2025-08-29 23:15:47 - pico-train - INFO - ├── Loss: 6.3327
2025-08-29 23:15:47 - pico-train - INFO - ├── Learning Rate: 3.11e-05
2025-08-29 23:15:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:15:47 - pico-train - INFO - Step 21500 -- 📈 Saving Learning Dynamics
2025-08-29 23:16:02 - pico-train - INFO - Step 21525 -- 🔄 Training Metrics
2025-08-29 23:16:02 - pico-train - INFO - ├── Loss: 6.3111
2025-08-29 23:16:02 - pico-train - INFO - ├── Learning Rate: 3.10e-05
2025-08-29 23:16:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:16:14 - pico-train - INFO - Step 21550 -- 🔄 Training Metrics
2025-08-29 23:16:14 - pico-train - INFO - ├── Loss: 6.2823
2025-08-29 23:16:14 - pico-train - INFO - ├── Learning Rate: 3.10e-05
2025-08-29 23:16:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:16:27 - pico-train - INFO - Step 21575 -- 🔄 Training Metrics
2025-08-29 23:16:27 - pico-train - INFO - ├── Loss: 6.3073
2025-08-29 23:16:27 - pico-train - INFO - ├── Learning Rate: 3.09e-05
2025-08-29 23:16:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:16:40 - pico-train - INFO - Step 21600 -- 🔄 Training Metrics
2025-08-29 23:16:40 - pico-train - INFO - ├── Loss: 6.3168
2025-08-29 23:16:40 - pico-train - INFO - ├── Learning Rate: 3.08e-05
2025-08-29 23:16:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:16:52 - pico-train - INFO - Step 21625 -- 🔄 Training Metrics
2025-08-29 23:16:52 - pico-train - INFO - ├── Loss: 6.3106
2025-08-29 23:16:52 - pico-train - INFO - ├── Learning Rate: 3.08e-05
2025-08-29 23:16:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:17:05 - pico-train - INFO - Step 21650 -- 🔄 Training Metrics
2025-08-29 23:17:05 - pico-train - INFO - ├── Loss: 6.3128
2025-08-29 23:17:05 - pico-train - INFO - ├── Learning Rate: 3.07e-05
2025-08-29 23:17:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:17:18 - pico-train - INFO - Step 21675 -- 🔄 Training Metrics
2025-08-29 23:17:18 - pico-train - INFO - ├── Loss: 6.2762
2025-08-29 23:17:18 - pico-train - INFO - ├── Learning Rate: 3.07e-05
2025-08-29 23:17:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:17:30 - pico-train - INFO - Step 21700 -- 🔄 Training Metrics
2025-08-29 23:17:30 - pico-train - INFO - ├── Loss: 6.3577
2025-08-29 23:17:30 - pico-train - INFO - ├── Learning Rate: 3.06e-05
2025-08-29 23:17:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:17:43 - pico-train - INFO - Step 21725 -- 🔄 Training Metrics
2025-08-29 23:17:43 - pico-train - INFO - ├── Loss: 6.3495
2025-08-29 23:17:43 - pico-train - INFO - ├── Learning Rate: 3.05e-05
2025-08-29 23:17:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:17:56 - pico-train - INFO - Step 21750 -- 🔄 Training Metrics
2025-08-29 23:17:56 - pico-train - INFO - ├── Loss: 6.3331
2025-08-29 23:17:56 - pico-train - INFO - ├── Learning Rate: 3.05e-05
2025-08-29 23:17:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:18:08 - pico-train - INFO - Step 21775 -- 🔄 Training Metrics
2025-08-29 23:18:08 - pico-train - INFO - ├── Loss: 6.3146
2025-08-29 23:18:08 - pico-train - INFO - ├── Learning Rate: 3.04e-05
2025-08-29 23:18:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:18:21 - pico-train - INFO - Step 21800 -- 🔄 Training Metrics
2025-08-29 23:18:21 - pico-train - INFO - ├── Loss: 6.3567
2025-08-29 23:18:21 - pico-train - INFO - ├── Learning Rate: 3.04e-05
2025-08-29 23:18:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:18:33 - pico-train - INFO - Step 21825 -- 🔄 Training Metrics
2025-08-29 23:18:33 - pico-train - INFO - ├── Loss: 6.3185
2025-08-29 23:18:33 - pico-train - INFO - ├── Learning Rate: 3.03e-05
2025-08-29 23:18:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:18:46 - pico-train - INFO - Step 21850 -- 🔄 Training Metrics
2025-08-29 23:18:46 - pico-train - INFO - ├── Loss: 6.3087
2025-08-29 23:18:46 - pico-train - INFO - ├── Learning Rate: 3.02e-05
2025-08-29 23:18:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:18:59 - pico-train - INFO - Step 21875 -- 🔄 Training Metrics
2025-08-29 23:18:59 - pico-train - INFO - ├── Loss: 6.3817
2025-08-29 23:18:59 - pico-train - INFO - ├── Learning Rate: 3.02e-05
2025-08-29 23:18:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:19:12 - pico-train - INFO - Step 21900 -- 🔄 Training Metrics
2025-08-29 23:19:12 - pico-train - INFO - ├── Loss: 6.3398
2025-08-29 23:19:12 - pico-train - INFO - ├── Learning Rate: 3.01e-05
2025-08-29 23:19:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:19:25 - pico-train - INFO - Step 21925 -- 🔄 Training Metrics
2025-08-29 23:19:25 - pico-train - INFO - ├── Loss: 6.4012
2025-08-29 23:19:25 - pico-train - INFO - ├── Learning Rate: 3.01e-05
2025-08-29 23:19:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:19:37 - pico-train - INFO - Step 21950 -- 🔄 Training Metrics
2025-08-29 23:19:37 - pico-train - INFO - ├── Loss: 6.3352
2025-08-29 23:19:37 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-29 23:19:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:19:50 - pico-train - INFO - Step 21975 -- 🔄 Training Metrics
2025-08-29 23:19:50 - pico-train - INFO - ├── Loss: 6.3857
2025-08-29 23:19:50 - pico-train - INFO - ├── Learning Rate: 2.99e-05
2025-08-29 23:19:50 - pico-train - INFO - └── Inf/NaN count: 0
|
2025-08-29 23:20:02 - pico-train - INFO - Step 22000 -- ๐พ Saving Checkpoint |
|
2025-08-29 23:22:06 - pico-train - INFO - Step 22000 -- ๐ Evaluation Results |
|
2025-08-29 23:22:06 - pico-train - INFO - โโโ paloma: 7.840233924864941e+24 |
|
2025-08-29 23:22:08 - pico-train - INFO - Step 22000 -- ๐ Training Metrics |
|
2025-08-29 23:22:08 - pico-train - INFO - โโโ Loss: 6.3421 |
|
2025-08-29 23:22:08 - pico-train - INFO - โโโ Learning Rate: 2.99e-05 |
|
2025-08-29 23:22:08 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:22:08 - pico-train - INFO - Step 22000 -- ๐ Saving Learning Dynamics |
|
2025-08-29 23:22:24 - pico-train - INFO - Step 22025 -- 🔄 Training Metrics
2025-08-29 23:22:24 - pico-train - INFO - ├── Loss: 6.4107
2025-08-29 23:22:24 - pico-train - INFO - ├── Learning Rate: 2.98e-05
2025-08-29 23:22:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:22:36 - pico-train - INFO - Step 22050 -- 🔄 Training Metrics
2025-08-29 23:22:36 - pico-train - INFO - ├── Loss: 6.3296
2025-08-29 23:22:36 - pico-train - INFO - ├── Learning Rate: 2.98e-05
2025-08-29 23:22:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:22:49 - pico-train - INFO - Step 22075 -- 🔄 Training Metrics
2025-08-29 23:22:49 - pico-train - INFO - ├── Loss: 6.2576
2025-08-29 23:22:49 - pico-train - INFO - ├── Learning Rate: 2.97e-05
2025-08-29 23:22:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:23:01 - pico-train - INFO - Step 22100 -- 🔄 Training Metrics
2025-08-29 23:23:01 - pico-train - INFO - ├── Loss: 6.2705
2025-08-29 23:23:01 - pico-train - INFO - ├── Learning Rate: 2.96e-05
2025-08-29 23:23:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:23:14 - pico-train - INFO - Step 22125 -- 🔄 Training Metrics
2025-08-29 23:23:14 - pico-train - INFO - ├── Loss: 6.2784
2025-08-29 23:23:14 - pico-train - INFO - ├── Learning Rate: 2.96e-05
2025-08-29 23:23:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:23:27 - pico-train - INFO - Step 22150 -- 🔄 Training Metrics
2025-08-29 23:23:27 - pico-train - INFO - ├── Loss: 6.3673
2025-08-29 23:23:27 - pico-train - INFO - ├── Learning Rate: 2.95e-05
2025-08-29 23:23:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:23:39 - pico-train - INFO - Step 22175 -- 🔄 Training Metrics
2025-08-29 23:23:39 - pico-train - INFO - ├── Loss: 6.3914
2025-08-29 23:23:39 - pico-train - INFO - ├── Learning Rate: 2.95e-05
2025-08-29 23:23:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:23:52 - pico-train - INFO - Step 22200 -- 🔄 Training Metrics
2025-08-29 23:23:52 - pico-train - INFO - ├── Loss: 6.3081
2025-08-29 23:23:52 - pico-train - INFO - ├── Learning Rate: 2.94e-05
2025-08-29 23:23:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:24:05 - pico-train - INFO - Step 22225 -- 🔄 Training Metrics
2025-08-29 23:24:05 - pico-train - INFO - ├── Loss: 6.4045
2025-08-29 23:24:05 - pico-train - INFO - ├── Learning Rate: 2.93e-05
2025-08-29 23:24:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:24:17 - pico-train - INFO - Step 22250 -- 🔄 Training Metrics
2025-08-29 23:24:17 - pico-train - INFO - ├── Loss: 6.3830
2025-08-29 23:24:17 - pico-train - INFO - ├── Learning Rate: 2.93e-05
2025-08-29 23:24:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:24:30 - pico-train - INFO - Step 22275 -- 🔄 Training Metrics
2025-08-29 23:24:30 - pico-train - INFO - ├── Loss: 6.2955
2025-08-29 23:24:30 - pico-train - INFO - ├── Learning Rate: 2.92e-05
2025-08-29 23:24:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:24:43 - pico-train - INFO - Step 22300 -- 🔄 Training Metrics
2025-08-29 23:24:43 - pico-train - INFO - ├── Loss: 6.3121
2025-08-29 23:24:43 - pico-train - INFO - ├── Learning Rate: 2.92e-05
2025-08-29 23:24:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:24:56 - pico-train - INFO - Step 22325 -- 🔄 Training Metrics
2025-08-29 23:24:56 - pico-train - INFO - ├── Loss: 6.3725
2025-08-29 23:24:56 - pico-train - INFO - ├── Learning Rate: 2.91e-05
2025-08-29 23:24:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:25:08 - pico-train - INFO - Step 22350 -- 🔄 Training Metrics
2025-08-29 23:25:08 - pico-train - INFO - ├── Loss: 6.3311
2025-08-29 23:25:08 - pico-train - INFO - ├── Learning Rate: 2.90e-05
2025-08-29 23:25:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:25:21 - pico-train - INFO - Step 22375 -- 🔄 Training Metrics
2025-08-29 23:25:21 - pico-train - INFO - ├── Loss: 6.2346
2025-08-29 23:25:21 - pico-train - INFO - ├── Learning Rate: 2.90e-05
2025-08-29 23:25:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:25:33 - pico-train - INFO - Step 22400 -- 🔄 Training Metrics
2025-08-29 23:25:33 - pico-train - INFO - ├── Loss: 6.3869
2025-08-29 23:25:33 - pico-train - INFO - ├── Learning Rate: 2.89e-05
2025-08-29 23:25:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:25:46 - pico-train - INFO - Step 22425 -- 🔄 Training Metrics
2025-08-29 23:25:46 - pico-train - INFO - ├── Loss: 6.3370
2025-08-29 23:25:46 - pico-train - INFO - ├── Learning Rate: 2.89e-05
2025-08-29 23:25:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:25:59 - pico-train - INFO - Step 22450 -- 🔄 Training Metrics
2025-08-29 23:25:59 - pico-train - INFO - ├── Loss: 6.3366
2025-08-29 23:25:59 - pico-train - INFO - ├── Learning Rate: 2.88e-05
2025-08-29 23:25:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:26:11 - pico-train - INFO - Step 22475 -- 🔄 Training Metrics
2025-08-29 23:26:11 - pico-train - INFO - ├── Loss: 6.3641
2025-08-29 23:26:11 - pico-train - INFO - ├── Learning Rate: 2.87e-05
2025-08-29 23:26:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:26:23 - pico-train - INFO - Step 22500 -- 💾 Saving Checkpoint
2025-08-29 23:28:22 - pico-train - INFO - Step 22500 -- 📊 Evaluation Results
2025-08-29 23:28:22 - pico-train - INFO - └── paloma: 1.0171611158112828e+25
2025-08-29 23:28:23 - pico-train - INFO - Step 22500 -- 🔄 Training Metrics
2025-08-29 23:28:23 - pico-train - INFO - ├── Loss: 6.2880
2025-08-29 23:28:23 - pico-train - INFO - ├── Learning Rate: 2.87e-05
2025-08-29 23:28:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:28:23 - pico-train - INFO - Step 22500 -- 📈 Saving Learning Dynamics
2025-08-29 23:28:39 - pico-train - INFO - Step 22525 -- 🔄 Training Metrics
2025-08-29 23:28:39 - pico-train - INFO - ├── Loss: 6.2955
2025-08-29 23:28:39 - pico-train - INFO - ├── Learning Rate: 2.86e-05
2025-08-29 23:28:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:28:51 - pico-train - INFO - Step 22550 -- 🔄 Training Metrics
2025-08-29 23:28:51 - pico-train - INFO - ├── Loss: 6.3124
2025-08-29 23:28:51 - pico-train - INFO - ├── Learning Rate: 2.85e-05
2025-08-29 23:28:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:29:04 - pico-train - INFO - Step 22575 -- 🔄 Training Metrics
2025-08-29 23:29:04 - pico-train - INFO - ├── Loss: 6.3214
2025-08-29 23:29:04 - pico-train - INFO - ├── Learning Rate: 2.85e-05
2025-08-29 23:29:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:29:17 - pico-train - INFO - Step 22600 -- 🔄 Training Metrics
2025-08-29 23:29:17 - pico-train - INFO - ├── Loss: 6.2929
2025-08-29 23:29:17 - pico-train - INFO - ├── Learning Rate: 2.84e-05
2025-08-29 23:29:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:29:29 - pico-train - INFO - Step 22625 -- 🔄 Training Metrics
2025-08-29 23:29:29 - pico-train - INFO - ├── Loss: 6.3454
2025-08-29 23:29:29 - pico-train - INFO - ├── Learning Rate: 2.84e-05
2025-08-29 23:29:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:29:42 - pico-train - INFO - Step 22650 -- 🔄 Training Metrics
2025-08-29 23:29:42 - pico-train - INFO - ├── Loss: 6.2994
2025-08-29 23:29:42 - pico-train - INFO - ├── Learning Rate: 2.83e-05
2025-08-29 23:29:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:29:55 - pico-train - INFO - Step 22675 -- 🔄 Training Metrics
2025-08-29 23:29:55 - pico-train - INFO - ├── Loss: 6.3245
2025-08-29 23:29:55 - pico-train - INFO - ├── Learning Rate: 2.82e-05
2025-08-29 23:29:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:30:07 - pico-train - INFO - Step 22700 -- 🔄 Training Metrics
2025-08-29 23:30:07 - pico-train - INFO - ├── Loss: 6.1874
2025-08-29 23:30:07 - pico-train - INFO - ├── Learning Rate: 2.82e-05
2025-08-29 23:30:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:30:20 - pico-train - INFO - Step 22725 -- 🔄 Training Metrics
2025-08-29 23:30:20 - pico-train - INFO - ├── Loss: 6.2636
2025-08-29 23:30:20 - pico-train - INFO - ├── Learning Rate: 2.81e-05
2025-08-29 23:30:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:30:32 - pico-train - INFO - Step 22750 -- 🔄 Training Metrics
2025-08-29 23:30:32 - pico-train - INFO - ├── Loss: 6.3870
2025-08-29 23:30:32 - pico-train - INFO - ├── Learning Rate: 2.81e-05
2025-08-29 23:30:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:30:45 - pico-train - INFO - Step 22775 -- 🔄 Training Metrics
2025-08-29 23:30:45 - pico-train - INFO - ├── Loss: 6.3157
2025-08-29 23:30:45 - pico-train - INFO - ├── Learning Rate: 2.80e-05
2025-08-29 23:30:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:30:57 - pico-train - INFO - Step 22800 -- 🔄 Training Metrics
2025-08-29 23:30:57 - pico-train - INFO - ├── Loss: 6.3617
2025-08-29 23:30:57 - pico-train - INFO - ├── Learning Rate: 2.79e-05
2025-08-29 23:30:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:31:10 - pico-train - INFO - Step 22825 -- 🔄 Training Metrics
2025-08-29 23:31:10 - pico-train - INFO - ├── Loss: 6.3006
2025-08-29 23:31:10 - pico-train - INFO - ├── Learning Rate: 2.79e-05
2025-08-29 23:31:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:31:23 - pico-train - INFO - Step 22850 -- 🔄 Training Metrics
2025-08-29 23:31:23 - pico-train - INFO - ├── Loss: 6.2552
2025-08-29 23:31:23 - pico-train - INFO - ├── Learning Rate: 2.78e-05
2025-08-29 23:31:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:31:35 - pico-train - INFO - Step 22875 -- 🔄 Training Metrics
2025-08-29 23:31:35 - pico-train - INFO - ├── Loss: 6.3537
2025-08-29 23:31:35 - pico-train - INFO - ├── Learning Rate: 2.78e-05
2025-08-29 23:31:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:31:48 - pico-train - INFO - Step 22900 -- 🔄 Training Metrics
2025-08-29 23:31:48 - pico-train - INFO - ├── Loss: 6.4096
2025-08-29 23:31:48 - pico-train - INFO - ├── Learning Rate: 2.77e-05
2025-08-29 23:31:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:32:01 - pico-train - INFO - Step 22925 -- 🔄 Training Metrics
2025-08-29 23:32:01 - pico-train - INFO - ├── Loss: 6.2037
2025-08-29 23:32:01 - pico-train - INFO - ├── Learning Rate: 2.76e-05
2025-08-29 23:32:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:32:13 - pico-train - INFO - Step 22950 -- 🔄 Training Metrics
2025-08-29 23:32:13 - pico-train - INFO - ├── Loss: 6.3007
2025-08-29 23:32:13 - pico-train - INFO - ├── Learning Rate: 2.76e-05
2025-08-29 23:32:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:32:26 - pico-train - INFO - Step 22975 -- 🔄 Training Metrics
2025-08-29 23:32:26 - pico-train - INFO - ├── Loss: 6.2575
2025-08-29 23:32:26 - pico-train - INFO - ├── Learning Rate: 2.75e-05
2025-08-29 23:32:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:32:38 - pico-train - INFO - Step 23000 -- 💾 Saving Checkpoint
2025-08-29 23:34:52 - pico-train - INFO - Step 23000 -- 📊 Evaluation Results
2025-08-29 23:34:52 - pico-train - INFO - └── paloma: 1.3786488388612157e+25
2025-08-29 23:34:53 - pico-train - INFO - Step 23000 -- 🔄 Training Metrics
2025-08-29 23:34:53 - pico-train - INFO - ├── Loss: 6.4702
2025-08-29 23:34:53 - pico-train - INFO - ├── Learning Rate: 2.75e-05
2025-08-29 23:34:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:34:53 - pico-train - INFO - Step 23000 -- 📈 Saving Learning Dynamics
2025-08-29 23:35:08 - pico-train - INFO - Step 23025 -- 🔄 Training Metrics
2025-08-29 23:35:08 - pico-train - INFO - ├── Loss: 6.3198
2025-08-29 23:35:08 - pico-train - INFO - ├── Learning Rate: 2.74e-05
2025-08-29 23:35:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:35:21 - pico-train - INFO - Step 23050 -- 🔄 Training Metrics
2025-08-29 23:35:21 - pico-train - INFO - ├── Loss: 6.3015
2025-08-29 23:35:21 - pico-train - INFO - ├── Learning Rate: 2.73e-05
2025-08-29 23:35:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:35:33 - pico-train - INFO - Step 23075 -- 🔄 Training Metrics
2025-08-29 23:35:33 - pico-train - INFO - ├── Loss: 6.3222
2025-08-29 23:35:33 - pico-train - INFO - ├── Learning Rate: 2.73e-05
2025-08-29 23:35:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:35:46 - pico-train - INFO - Step 23100 -- 🔄 Training Metrics
2025-08-29 23:35:46 - pico-train - INFO - ├── Loss: 6.2917
2025-08-29 23:35:46 - pico-train - INFO - ├── Learning Rate: 2.72e-05
2025-08-29 23:35:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:35:59 - pico-train - INFO - Step 23125 -- 🔄 Training Metrics
2025-08-29 23:35:59 - pico-train - INFO - ├── Loss: 6.3574
2025-08-29 23:35:59 - pico-train - INFO - ├── Learning Rate: 2.71e-05
2025-08-29 23:35:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:36:11 - pico-train - INFO - Step 23150 -- 🔄 Training Metrics
2025-08-29 23:36:11 - pico-train - INFO - ├── Loss: 6.2434
2025-08-29 23:36:11 - pico-train - INFO - ├── Learning Rate: 2.71e-05
2025-08-29 23:36:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:36:24 - pico-train - INFO - Step 23175 -- 🔄 Training Metrics
2025-08-29 23:36:24 - pico-train - INFO - ├── Loss: 6.2580
2025-08-29 23:36:24 - pico-train - INFO - ├── Learning Rate: 2.70e-05
2025-08-29 23:36:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:36:36 - pico-train - INFO - Step 23200 -- 🔄 Training Metrics
2025-08-29 23:36:36 - pico-train - INFO - ├── Loss: 6.3214
2025-08-29 23:36:36 - pico-train - INFO - ├── Learning Rate: 2.70e-05
2025-08-29 23:36:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:36:49 - pico-train - INFO - Step 23225 -- 🔄 Training Metrics
2025-08-29 23:36:49 - pico-train - INFO - ├── Loss: 6.2731
2025-08-29 23:36:49 - pico-train - INFO - ├── Learning Rate: 2.69e-05
2025-08-29 23:36:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:37:02 - pico-train - INFO - Step 23250 -- 🔄 Training Metrics
2025-08-29 23:37:02 - pico-train - INFO - ├── Loss: 6.3255
2025-08-29 23:37:02 - pico-train - INFO - ├── Learning Rate: 2.68e-05
2025-08-29 23:37:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:37:14 - pico-train - INFO - Step 23275 -- 🔄 Training Metrics
2025-08-29 23:37:14 - pico-train - INFO - ├── Loss: 6.3348
2025-08-29 23:37:14 - pico-train - INFO - ├── Learning Rate: 2.68e-05
2025-08-29 23:37:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:37:27 - pico-train - INFO - Step 23300 -- 🔄 Training Metrics
2025-08-29 23:37:27 - pico-train - INFO - ├── Loss: 6.3476
2025-08-29 23:37:27 - pico-train - INFO - ├── Learning Rate: 2.67e-05
2025-08-29 23:37:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:37:39 - pico-train - INFO - Step 23325 -- 🔄 Training Metrics
2025-08-29 23:37:39 - pico-train - INFO - ├── Loss: 6.3392
2025-08-29 23:37:39 - pico-train - INFO - ├── Learning Rate: 2.67e-05
2025-08-29 23:37:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:37:52 - pico-train - INFO - Step 23350 -- 🔄 Training Metrics
2025-08-29 23:37:52 - pico-train - INFO - ├── Loss: 6.3051
2025-08-29 23:37:52 - pico-train - INFO - ├── Learning Rate: 2.66e-05
2025-08-29 23:37:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:38:05 - pico-train - INFO - Step 23375 -- 🔄 Training Metrics
2025-08-29 23:38:05 - pico-train - INFO - ├── Loss: 6.2683
2025-08-29 23:38:05 - pico-train - INFO - ├── Learning Rate: 2.65e-05
2025-08-29 23:38:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:38:17 - pico-train - INFO - Step 23400 -- 🔄 Training Metrics
2025-08-29 23:38:17 - pico-train - INFO - ├── Loss: 6.2929
2025-08-29 23:38:17 - pico-train - INFO - ├── Learning Rate: 2.65e-05
2025-08-29 23:38:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:38:30 - pico-train - INFO - Step 23425 -- 🔄 Training Metrics
2025-08-29 23:38:30 - pico-train - INFO - ├── Loss: 6.3546
2025-08-29 23:38:30 - pico-train - INFO - ├── Learning Rate: 2.64e-05
2025-08-29 23:38:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:38:42 - pico-train - INFO - Step 23450 -- 🔄 Training Metrics
2025-08-29 23:38:42 - pico-train - INFO - ├── Loss: 6.3572
2025-08-29 23:38:42 - pico-train - INFO - ├── Learning Rate: 2.63e-05
2025-08-29 23:38:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:38:55 - pico-train - INFO - Step 23475 -- 🔄 Training Metrics
2025-08-29 23:38:55 - pico-train - INFO - ├── Loss: 6.2350
2025-08-29 23:38:55 - pico-train - INFO - ├── Learning Rate: 2.63e-05
2025-08-29 23:38:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:39:07 - pico-train - INFO - Step 23500 -- 💾 Saving Checkpoint
2025-08-29 23:41:03 - pico-train - INFO - Step 23500 -- 📊 Evaluation Results
2025-08-29 23:41:03 - pico-train - INFO - └── paloma: 1.5734245831645979e+25
2025-08-29 23:41:04 - pico-train - INFO - Step 23500 -- 🔄 Training Metrics
2025-08-29 23:41:04 - pico-train - INFO - ├── Loss: 6.3544
2025-08-29 23:41:04 - pico-train - INFO - ├── Learning Rate: 2.62e-05
2025-08-29 23:41:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:41:04 - pico-train - INFO - Step 23500 -- 📈 Saving Learning Dynamics
2025-08-29 23:41:19 - pico-train - INFO - Step 23525 -- 🔄 Training Metrics
2025-08-29 23:41:19 - pico-train - INFO - ├── Loss: 6.2607
2025-08-29 23:41:19 - pico-train - INFO - ├── Learning Rate: 2.62e-05
2025-08-29 23:41:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:41:32 - pico-train - INFO - Step 23550 -- 🔄 Training Metrics
2025-08-29 23:41:32 - pico-train - INFO - ├── Loss: 6.2912
2025-08-29 23:41:32 - pico-train - INFO - ├── Learning Rate: 2.61e-05
2025-08-29 23:41:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:41:45 - pico-train - INFO - Step 23575 -- 🔄 Training Metrics
2025-08-29 23:41:45 - pico-train - INFO - ├── Loss: 6.2348
2025-08-29 23:41:45 - pico-train - INFO - ├── Learning Rate: 2.60e-05
2025-08-29 23:41:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:41:57 - pico-train - INFO - Step 23600 -- 🔄 Training Metrics
2025-08-29 23:41:57 - pico-train - INFO - ├── Loss: 6.2372
2025-08-29 23:41:57 - pico-train - INFO - ├── Learning Rate: 2.60e-05
2025-08-29 23:41:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:42:10 - pico-train - INFO - Step 23625 -- 🔄 Training Metrics
2025-08-29 23:42:10 - pico-train - INFO - ├── Loss: 6.3467
2025-08-29 23:42:10 - pico-train - INFO - ├── Learning Rate: 2.59e-05
2025-08-29 23:42:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:42:22 - pico-train - INFO - Step 23650 -- 🔄 Training Metrics
2025-08-29 23:42:22 - pico-train - INFO - ├── Loss: 6.2611
2025-08-29 23:42:22 - pico-train - INFO - ├── Learning Rate: 2.59e-05
2025-08-29 23:42:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:42:35 - pico-train - INFO - Step 23675 -- 🔄 Training Metrics
2025-08-29 23:42:35 - pico-train - INFO - ├── Loss: 6.2587
2025-08-29 23:42:35 - pico-train - INFO - ├── Learning Rate: 2.58e-05
2025-08-29 23:42:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:42:47 - pico-train - INFO - Step 23700 -- 🔄 Training Metrics
2025-08-29 23:42:47 - pico-train - INFO - ├── Loss: 6.3048
2025-08-29 23:42:47 - pico-train - INFO - ├── Learning Rate: 2.57e-05
2025-08-29 23:42:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:43:00 - pico-train - INFO - Step 23725 -- 🔄 Training Metrics
2025-08-29 23:43:00 - pico-train - INFO - ├── Loss: 6.2627
2025-08-29 23:43:00 - pico-train - INFO - ├── Learning Rate: 2.57e-05
2025-08-29 23:43:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:43:13 - pico-train - INFO - Step 23750 -- 🔄 Training Metrics
2025-08-29 23:43:13 - pico-train - INFO - ├── Loss: 6.2880
2025-08-29 23:43:13 - pico-train - INFO - ├── Learning Rate: 2.56e-05
2025-08-29 23:43:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:43:25 - pico-train - INFO - Step 23775 -- 🔄 Training Metrics
2025-08-29 23:43:25 - pico-train - INFO - ├── Loss: 6.3205
2025-08-29 23:43:25 - pico-train - INFO - ├── Learning Rate: 2.56e-05
2025-08-29 23:43:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:43:38 - pico-train - INFO - Step 23800 -- 🔄 Training Metrics
2025-08-29 23:43:38 - pico-train - INFO - ├── Loss: 6.2730
2025-08-29 23:43:38 - pico-train - INFO - ├── Learning Rate: 2.55e-05
2025-08-29 23:43:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:43:51 - pico-train - INFO - Step 23825 -- 🔄 Training Metrics
2025-08-29 23:43:51 - pico-train - INFO - ├── Loss: 6.2649
2025-08-29 23:43:51 - pico-train - INFO - ├── Learning Rate: 2.54e-05
2025-08-29 23:43:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:44:03 - pico-train - INFO - Step 23850 -- 🔄 Training Metrics
2025-08-29 23:44:03 - pico-train - INFO - ├── Loss: 6.2840
2025-08-29 23:44:03 - pico-train - INFO - ├── Learning Rate: 2.54e-05
2025-08-29 23:44:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:44:16 - pico-train - INFO - Step 23875 -- 🔄 Training Metrics
2025-08-29 23:44:16 - pico-train - INFO - ├── Loss: 6.3253
2025-08-29 23:44:16 - pico-train - INFO - ├── Learning Rate: 2.53e-05
2025-08-29 23:44:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:44:28 - pico-train - INFO - Step 23900 -- 🔄 Training Metrics
2025-08-29 23:44:28 - pico-train - INFO - ├── Loss: 6.3487
2025-08-29 23:44:28 - pico-train - INFO - ├── Learning Rate: 2.52e-05
2025-08-29 23:44:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:44:41 - pico-train - INFO - Step 23925 -- 🔄 Training Metrics
2025-08-29 23:44:41 - pico-train - INFO - ├── Loss: 6.2998
2025-08-29 23:44:41 - pico-train - INFO - ├── Learning Rate: 2.52e-05
2025-08-29 23:44:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:44:54 - pico-train - INFO - Step 23950 -- 🔄 Training Metrics
2025-08-29 23:44:54 - pico-train - INFO - ├── Loss: 6.2444
2025-08-29 23:44:54 - pico-train - INFO - ├── Learning Rate: 2.51e-05
2025-08-29 23:44:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:45:06 - pico-train - INFO - Step 23975 -- 🔄 Training Metrics
2025-08-29 23:45:06 - pico-train - INFO - ├── Loss: 6.2611
2025-08-29 23:45:06 - pico-train - INFO - ├── Learning Rate: 2.51e-05
2025-08-29 23:45:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:45:18 - pico-train - INFO - Step 24000 -- 💾 Saving Checkpoint
2025-08-29 23:47:14 - pico-train - INFO - Step 24000 -- 📊 Evaluation Results
2025-08-29 23:47:14 - pico-train - INFO - └── paloma: 2.548011467855507e+25
2025-08-29 23:47:17 - pico-train - INFO - Step 24000 -- 🔄 Training Metrics
2025-08-29 23:47:17 - pico-train - INFO - ├── Loss: 6.1774
2025-08-29 23:47:17 - pico-train - INFO - ├── Learning Rate: 2.50e-05
2025-08-29 23:47:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:47:17 - pico-train - INFO - Step 24000 -- 📈 Saving Learning Dynamics
2025-08-29 23:47:32 - pico-train - INFO - Step 24025 -- 🔄 Training Metrics
2025-08-29 23:47:32 - pico-train - INFO - ├── Loss: 6.2658
2025-08-29 23:47:32 - pico-train - INFO - ├── Learning Rate: 2.49e-05
2025-08-29 23:47:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:47:44 - pico-train - INFO - Step 24050 -- 🔄 Training Metrics
2025-08-29 23:47:44 - pico-train - INFO - ├── Loss: 6.2641
2025-08-29 23:47:44 - pico-train - INFO - ├── Learning Rate: 2.49e-05
2025-08-29 23:47:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:47:57 - pico-train - INFO - Step 24075 -- 🔄 Training Metrics
2025-08-29 23:47:57 - pico-train - INFO - ├── Loss: 6.1837
2025-08-29 23:47:57 - pico-train - INFO - ├── Learning Rate: 2.48e-05
2025-08-29 23:47:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:48:10 - pico-train - INFO - Step 24100 -- 🔄 Training Metrics
2025-08-29 23:48:10 - pico-train - INFO - ├── Loss: 6.3345
2025-08-29 23:48:10 - pico-train - INFO - ├── Learning Rate: 2.48e-05
2025-08-29 23:48:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:48:23 - pico-train - INFO - Step 24125 -- 🔄 Training Metrics
2025-08-29 23:48:23 - pico-train - INFO - ├── Loss: 6.2665
2025-08-29 23:48:23 - pico-train - INFO - ├── Learning Rate: 2.47e-05
2025-08-29 23:48:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:48:35 - pico-train - INFO - Step 24150 -- 🔄 Training Metrics
2025-08-29 23:48:35 - pico-train - INFO - ├── Loss: 6.2894
2025-08-29 23:48:35 - pico-train - INFO - ├── Learning Rate: 2.46e-05
2025-08-29 23:48:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:48:48 - pico-train - INFO - Step 24175 -- 🔄 Training Metrics
2025-08-29 23:48:48 - pico-train - INFO - ├── Loss: 6.2354
2025-08-29 23:48:48 - pico-train - INFO - ├── Learning Rate: 2.46e-05
2025-08-29 23:48:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:49:00 - pico-train - INFO - Step 24200 -- 🔄 Training Metrics
2025-08-29 23:49:00 - pico-train - INFO - ├── Loss: 6.2110
2025-08-29 23:49:00 - pico-train - INFO - ├── Learning Rate: 2.45e-05
2025-08-29 23:49:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:49:13 - pico-train - INFO - Step 24225 -- 🔄 Training Metrics
2025-08-29 23:49:13 - pico-train - INFO - ├── Loss: 6.2512
2025-08-29 23:49:13 - pico-train - INFO - ├── Learning Rate: 2.44e-05
2025-08-29 23:49:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:49:25 - pico-train - INFO - Step 24250 -- 🔄 Training Metrics
2025-08-29 23:49:25 - pico-train - INFO - ├── Loss: 6.2544
2025-08-29 23:49:25 - pico-train - INFO - ├── Learning Rate: 2.44e-05
2025-08-29 23:49:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:49:38 - pico-train - INFO - Step 24275 -- 🔄 Training Metrics
2025-08-29 23:49:38 - pico-train - INFO - ├── Loss: 6.2934
2025-08-29 23:49:38 - pico-train - INFO - ├── Learning Rate: 2.43e-05
2025-08-29 23:49:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:49:51 - pico-train - INFO - Step 24300 -- 🔄 Training Metrics
2025-08-29 23:49:51 - pico-train - INFO - ├── Loss: 6.2608
2025-08-29 23:49:51 - pico-train - INFO - ├── Learning Rate: 2.43e-05
2025-08-29 23:49:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:50:03 - pico-train - INFO - Step 24325 -- 🔄 Training Metrics
2025-08-29 23:50:03 - pico-train - INFO - ├── Loss: 6.2280
2025-08-29 23:50:03 - pico-train - INFO - ├── Learning Rate: 2.42e-05
2025-08-29 23:50:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:50:16 - pico-train - INFO - Step 24350 -- 🔄 Training Metrics
2025-08-29 23:50:16 - pico-train - INFO - ├── Loss: 6.2431
2025-08-29 23:50:16 - pico-train - INFO - ├── Learning Rate: 2.41e-05
2025-08-29 23:50:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:50:29 - pico-train - INFO - Step 24375 -- 🔄 Training Metrics
2025-08-29 23:50:29 - pico-train - INFO - ├── Loss: 6.2120
2025-08-29 23:50:29 - pico-train - INFO - ├── Learning Rate: 2.41e-05
2025-08-29 23:50:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:50:41 - pico-train - INFO - Step 24400 -- 🔄 Training Metrics
2025-08-29 23:50:41 - pico-train - INFO - ├── Loss: 6.2375
2025-08-29 23:50:41 - pico-train - INFO - ├── Learning Rate: 2.40e-05
2025-08-29 23:50:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:50:54 - pico-train - INFO - Step 24425 -- 🔄 Training Metrics
2025-08-29 23:50:54 - pico-train - INFO - ├── Loss: 6.3604
2025-08-29 23:50:54 - pico-train - INFO - ├── Learning Rate: 2.40e-05
2025-08-29 23:50:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:51:07 - pico-train - INFO - Step 24450 -- 🔄 Training Metrics
2025-08-29 23:51:07 - pico-train - INFO - ├── Loss: 6.2451
2025-08-29 23:51:07 - pico-train - INFO - ├── Learning Rate: 2.39e-05
2025-08-29 23:51:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:51:20 - pico-train - INFO - Step 24475 -- 🔄 Training Metrics
2025-08-29 23:51:20 - pico-train - INFO - ├── Loss: 6.2877
2025-08-29 23:51:20 - pico-train - INFO - ├── Learning Rate: 2.38e-05
2025-08-29 23:51:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:51:32 - pico-train - INFO - Step 24500 -- 💾 Saving Checkpoint
2025-08-29 23:53:26 - pico-train - INFO - Step 24500 -- 📊 Evaluation Results
2025-08-29 23:53:26 - pico-train - INFO - └── paloma: 2.937466297559389e+25
2025-08-29 23:53:29 - pico-train - INFO - Step 24500 -- 🔄 Training Metrics
2025-08-29 23:53:29 - pico-train - INFO - ├── Loss: 6.3104
2025-08-29 23:53:29 - pico-train - INFO - ├── Learning Rate: 2.38e-05
2025-08-29 23:53:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:53:29 - pico-train - INFO - Step 24500 -- 📈 Saving Learning Dynamics
2025-08-29 23:53:44 - pico-train - INFO - Step 24525 -- 🔄 Training Metrics
2025-08-29 23:53:44 - pico-train - INFO - ├── Loss: 6.2830
2025-08-29 23:53:44 - pico-train - INFO - ├── Learning Rate: 2.37e-05
2025-08-29 23:53:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:53:56 - pico-train - INFO - Step 24550 -- 🔄 Training Metrics
2025-08-29 23:53:56 - pico-train - INFO - ├── Loss: 6.2558
2025-08-29 23:53:56 - pico-train - INFO - ├── Learning Rate: 2.37e-05
2025-08-29 23:53:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:54:09 - pico-train - INFO - Step 24575 -- 🔄 Training Metrics
2025-08-29 23:54:09 - pico-train - INFO - ├── Loss: 6.2140
2025-08-29 23:54:09 - pico-train - INFO - ├── Learning Rate: 2.36e-05
2025-08-29 23:54:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:54:22 - pico-train - INFO - Step 24600 -- 🔄 Training Metrics
2025-08-29 23:54:22 - pico-train - INFO - ├── Loss: 6.2546
2025-08-29 23:54:22 - pico-train - INFO - ├── Learning Rate: 2.35e-05
2025-08-29 23:54:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:54:34 - pico-train - INFO - Step 24625 -- 🔄 Training Metrics
2025-08-29 23:54:34 - pico-train - INFO - ├── Loss: 6.2569
2025-08-29 23:54:34 - pico-train - INFO - ├── Learning Rate: 2.35e-05
2025-08-29 23:54:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:54:47 - pico-train - INFO - Step 24650 -- 🔄 Training Metrics
2025-08-29 23:54:47 - pico-train - INFO - ├── Loss: 6.2170
2025-08-29 23:54:47 - pico-train - INFO - ├── Learning Rate: 2.34e-05
2025-08-29 23:54:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:55:00 - pico-train - INFO - Step 24675 -- 🔄 Training Metrics
2025-08-29 23:55:00 - pico-train - INFO - ├── Loss: 6.2187
2025-08-29 23:55:00 - pico-train - INFO - ├── Learning Rate: 2.33e-05
2025-08-29 23:55:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:55:12 - pico-train - INFO - Step 24700 -- 🔄 Training Metrics
2025-08-29 23:55:12 - pico-train - INFO - ├── Loss: 6.2933
2025-08-29 23:55:12 - pico-train - INFO - ├── Learning Rate: 2.33e-05
2025-08-29 23:55:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:55:25 - pico-train - INFO - Step 24725 -- 🔄 Training Metrics
2025-08-29 23:55:25 - pico-train - INFO - ├── Loss: 6.2359
2025-08-29 23:55:25 - pico-train - INFO - ├── Learning Rate: 2.32e-05
2025-08-29 23:55:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-29 23:55:38 - pico-train - INFO - Step 24750 -- ๐ Training Metrics |
|
2025-08-29 23:55:38 - pico-train - INFO - โโโ Loss: 6.2789 |
|
2025-08-29 23:55:38 - pico-train - INFO - โโโ Learning Rate: 2.32e-05 |
|
2025-08-29 23:55:38 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:55:50 - pico-train - INFO - Step 24775 -- ๐ Training Metrics |
|
2025-08-29 23:55:50 - pico-train - INFO - โโโ Loss: 6.3001 |
|
2025-08-29 23:55:50 - pico-train - INFO - โโโ Learning Rate: 2.31e-05 |
|
2025-08-29 23:55:50 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:56:03 - pico-train - INFO - Step 24800 -- ๐ Training Metrics |
|
2025-08-29 23:56:03 - pico-train - INFO - โโโ Loss: 6.2419 |
|
2025-08-29 23:56:03 - pico-train - INFO - โโโ Learning Rate: 2.30e-05 |
|
2025-08-29 23:56:03 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:56:16 - pico-train - INFO - Step 24825 -- ๐ Training Metrics |
|
2025-08-29 23:56:16 - pico-train - INFO - โโโ Loss: 6.2251 |
|
2025-08-29 23:56:16 - pico-train - INFO - โโโ Learning Rate: 2.30e-05 |
|
2025-08-29 23:56:16 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:56:28 - pico-train - INFO - Step 24850 -- ๐ Training Metrics |
|
2025-08-29 23:56:28 - pico-train - INFO - โโโ Loss: 6.2023 |
|
2025-08-29 23:56:28 - pico-train - INFO - โโโ Learning Rate: 2.29e-05 |
|
2025-08-29 23:56:28 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:56:41 - pico-train - INFO - Step 24875 -- ๐ Training Metrics |
|
2025-08-29 23:56:41 - pico-train - INFO - โโโ Loss: 6.2911 |
|
2025-08-29 23:56:41 - pico-train - INFO - โโโ Learning Rate: 2.29e-05 |
|
2025-08-29 23:56:41 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:56:54 - pico-train - INFO - Step 24900 -- ๐ Training Metrics |
|
2025-08-29 23:56:54 - pico-train - INFO - โโโ Loss: 6.2723 |
|
2025-08-29 23:56:54 - pico-train - INFO - โโโ Learning Rate: 2.28e-05 |
|
2025-08-29 23:56:54 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:57:07 - pico-train - INFO - Step 24925 -- ๐ Training Metrics |
|
2025-08-29 23:57:07 - pico-train - INFO - โโโ Loss: 6.2993 |
|
2025-08-29 23:57:07 - pico-train - INFO - โโโ Learning Rate: 2.27e-05 |
|
2025-08-29 23:57:07 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:57:19 - pico-train - INFO - Step 24950 -- ๐ Training Metrics |
|
2025-08-29 23:57:19 - pico-train - INFO - โโโ Loss: 6.2579 |
|
2025-08-29 23:57:19 - pico-train - INFO - โโโ Learning Rate: 2.27e-05 |
|
2025-08-29 23:57:19 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:57:32 - pico-train - INFO - Step 24975 -- ๐ Training Metrics |
|
2025-08-29 23:57:32 - pico-train - INFO - โโโ Loss: 6.2620 |
|
2025-08-29 23:57:32 - pico-train - INFO - โโโ Learning Rate: 2.26e-05 |
|
2025-08-29 23:57:32 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:57:44 - pico-train - INFO - Step 25000 -- ๐พ Saving Checkpoint |
|
2025-08-29 23:59:48 - pico-train - INFO - Step 25000 -- ๐ Evaluation Results |
|
2025-08-29 23:59:48 - pico-train - INFO - โโโ paloma: 3.4105304760288245e+25 |
|
2025-08-29 23:59:49 - pico-train - INFO - Step 25000 -- ๐ Training Metrics |
|
2025-08-29 23:59:49 - pico-train - INFO - โโโ Loss: 6.2956 |
|
2025-08-29 23:59:49 - pico-train - INFO - โโโ Learning Rate: 2.25e-05 |
|
2025-08-29 23:59:49 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-29 23:59:49 - pico-train - INFO - Step 25000 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:00:04 - pico-train - INFO - Step 25025 -- ๐ Training Metrics |
|
2025-08-30 00:00:04 - pico-train - INFO - โโโ Loss: 6.2348 |
|
2025-08-30 00:00:04 - pico-train - INFO - โโโ Learning Rate: 2.25e-05 |
|
2025-08-30 00:00:04 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:00:17 - pico-train - INFO - Step 25050 -- ๐ Training Metrics |
|
2025-08-30 00:00:17 - pico-train - INFO - โโโ Loss: 6.2363 |
|
2025-08-30 00:00:17 - pico-train - INFO - โโโ Learning Rate: 2.24e-05 |
|
2025-08-30 00:00:17 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:00:30 - pico-train - INFO - Step 25075 -- ๐ Training Metrics |
|
2025-08-30 00:00:30 - pico-train - INFO - โโโ Loss: 6.2567 |
|
2025-08-30 00:00:30 - pico-train - INFO - โโโ Learning Rate: 2.24e-05 |
|
2025-08-30 00:00:30 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:00:43 - pico-train - INFO - Step 25100 -- ๐ Training Metrics |
|
2025-08-30 00:00:43 - pico-train - INFO - โโโ Loss: 6.2186 |
|
2025-08-30 00:00:43 - pico-train - INFO - โโโ Learning Rate: 2.23e-05 |
|
2025-08-30 00:00:43 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:00:56 - pico-train - INFO - Step 25125 -- ๐ Training Metrics |
|
2025-08-30 00:00:56 - pico-train - INFO - โโโ Loss: 6.2886 |
|
2025-08-30 00:00:56 - pico-train - INFO - โโโ Learning Rate: 2.22e-05 |
|
2025-08-30 00:00:56 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:01:08 - pico-train - INFO - Step 25150 -- ๐ Training Metrics |
|
2025-08-30 00:01:08 - pico-train - INFO - โโโ Loss: 6.2310 |
|
2025-08-30 00:01:08 - pico-train - INFO - โโโ Learning Rate: 2.22e-05 |
|
2025-08-30 00:01:08 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:01:21 - pico-train - INFO - Step 25175 -- ๐ Training Metrics |
|
2025-08-30 00:01:21 - pico-train - INFO - โโโ Loss: 6.3884 |
|
2025-08-30 00:01:21 - pico-train - INFO - โโโ Learning Rate: 2.21e-05 |
|
2025-08-30 00:01:21 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:01:34 - pico-train - INFO - Step 25200 -- ๐ Training Metrics |
|
2025-08-30 00:01:34 - pico-train - INFO - โโโ Loss: 6.2232 |
|
2025-08-30 00:01:34 - pico-train - INFO - โโโ Learning Rate: 2.21e-05 |
|
2025-08-30 00:01:34 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:01:46 - pico-train - INFO - Step 25225 -- ๐ Training Metrics |
|
2025-08-30 00:01:46 - pico-train - INFO - โโโ Loss: 6.2254 |
|
2025-08-30 00:01:46 - pico-train - INFO - โโโ Learning Rate: 2.20e-05 |
|
2025-08-30 00:01:46 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:01:59 - pico-train - INFO - Step 25250 -- ๐ Training Metrics |
|
2025-08-30 00:01:59 - pico-train - INFO - โโโ Loss: 6.2140 |
|
2025-08-30 00:01:59 - pico-train - INFO - โโโ Learning Rate: 2.19e-05 |
|
2025-08-30 00:01:59 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:02:12 - pico-train - INFO - Step 25275 -- ๐ Training Metrics |
|
2025-08-30 00:02:12 - pico-train - INFO - โโโ Loss: 6.3619 |
|
2025-08-30 00:02:12 - pico-train - INFO - โโโ Learning Rate: 2.19e-05 |
|
2025-08-30 00:02:12 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:02:24 - pico-train - INFO - Step 25300 -- ๐ Training Metrics |
|
2025-08-30 00:02:24 - pico-train - INFO - โโโ Loss: 6.2660 |
|
2025-08-30 00:02:24 - pico-train - INFO - โโโ Learning Rate: 2.18e-05 |
|
2025-08-30 00:02:24 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:02:37 - pico-train - INFO - Step 25325 -- ๐ Training Metrics |
|
2025-08-30 00:02:37 - pico-train - INFO - โโโ Loss: 6.1959 |
|
2025-08-30 00:02:37 - pico-train - INFO - โโโ Learning Rate: 2.18e-05 |
|
2025-08-30 00:02:37 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:02:49 - pico-train - INFO - Step 25350 -- ๐ Training Metrics |
|
2025-08-30 00:02:49 - pico-train - INFO - โโโ Loss: 6.2983 |
|
2025-08-30 00:02:49 - pico-train - INFO - โโโ Learning Rate: 2.17e-05 |
|
2025-08-30 00:02:49 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:03:02 - pico-train - INFO - Step 25375 -- ๐ Training Metrics |
|
2025-08-30 00:03:02 - pico-train - INFO - โโโ Loss: 6.2441 |
|
2025-08-30 00:03:02 - pico-train - INFO - โโโ Learning Rate: 2.16e-05 |
|
2025-08-30 00:03:02 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:03:15 - pico-train - INFO - Step 25400 -- ๐ Training Metrics |
|
2025-08-30 00:03:15 - pico-train - INFO - โโโ Loss: 6.2454 |
|
2025-08-30 00:03:15 - pico-train - INFO - โโโ Learning Rate: 2.16e-05 |
|
2025-08-30 00:03:15 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:03:28 - pico-train - INFO - Step 25425 -- ๐ Training Metrics |
|
2025-08-30 00:03:28 - pico-train - INFO - โโโ Loss: 6.2099 |
|
2025-08-30 00:03:28 - pico-train - INFO - โโโ Learning Rate: 2.15e-05 |
|
2025-08-30 00:03:28 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:03:40 - pico-train - INFO - Step 25450 -- ๐ Training Metrics |
|
2025-08-30 00:03:40 - pico-train - INFO - โโโ Loss: 6.1991 |
|
2025-08-30 00:03:40 - pico-train - INFO - โโโ Learning Rate: 2.15e-05 |
|
2025-08-30 00:03:40 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:03:53 - pico-train - INFO - Step 25475 -- ๐ Training Metrics |
|
2025-08-30 00:03:53 - pico-train - INFO - โโโ Loss: 6.1905 |
|
2025-08-30 00:03:53 - pico-train - INFO - โโโ Learning Rate: 2.14e-05 |
|
2025-08-30 00:03:53 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:04:05 - pico-train - INFO - Step 25500 -- ๐พ Saving Checkpoint |
|
2025-08-30 00:06:01 - pico-train - INFO - Step 25500 -- ๐ Evaluation Results |
|
2025-08-30 00:06:01 - pico-train - INFO - โโโ paloma: 5.167340298104552e+25 |
|
2025-08-30 00:06:03 - pico-train - INFO - Step 25500 -- ๐ Training Metrics |
|
2025-08-30 00:06:03 - pico-train - INFO - โโโ Loss: 6.2849 |
|
2025-08-30 00:06:03 - pico-train - INFO - โโโ Learning Rate: 2.13e-05 |
|
2025-08-30 00:06:03 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:06:03 - pico-train - INFO - Step 25500 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:06:19 - pico-train - INFO - Step 25525 -- ๐ Training Metrics |
|
2025-08-30 00:06:19 - pico-train - INFO - โโโ Loss: 6.2454 |
|
2025-08-30 00:06:19 - pico-train - INFO - โโโ Learning Rate: 2.13e-05 |
|
2025-08-30 00:06:19 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:06:32 - pico-train - INFO - Step 25550 -- ๐ Training Metrics |
|
2025-08-30 00:06:32 - pico-train - INFO - โโโ Loss: 6.2327 |
|
2025-08-30 00:06:32 - pico-train - INFO - โโโ Learning Rate: 2.12e-05 |
|
2025-08-30 00:06:32 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:06:45 - pico-train - INFO - Step 25575 -- ๐ Training Metrics |
|
2025-08-30 00:06:45 - pico-train - INFO - โโโ Loss: 6.2783 |
|
2025-08-30 00:06:45 - pico-train - INFO - โโโ Learning Rate: 2.11e-05 |
|
2025-08-30 00:06:45 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:06:57 - pico-train - INFO - Step 25600 -- ๐ Training Metrics |
|
2025-08-30 00:06:57 - pico-train - INFO - โโโ Loss: 6.1487 |
|
2025-08-30 00:06:57 - pico-train - INFO - โโโ Learning Rate: 2.11e-05 |
|
2025-08-30 00:06:57 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:07:11 - pico-train - INFO - Step 25625 -- ๐ Training Metrics |
|
2025-08-30 00:07:11 - pico-train - INFO - โโโ Loss: 6.3194 |
|
2025-08-30 00:07:11 - pico-train - INFO - โโโ Learning Rate: 2.10e-05 |
|
2025-08-30 00:07:11 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:07:24 - pico-train - INFO - Step 25650 -- ๐ Training Metrics |
|
2025-08-30 00:07:24 - pico-train - INFO - โโโ Loss: 6.2920 |
|
2025-08-30 00:07:24 - pico-train - INFO - โโโ Learning Rate: 2.10e-05 |
|
2025-08-30 00:07:24 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:07:37 - pico-train - INFO - Step 25675 -- ๐ Training Metrics |
|
2025-08-30 00:07:37 - pico-train - INFO - โโโ Loss: 6.2623 |
|
2025-08-30 00:07:37 - pico-train - INFO - โโโ Learning Rate: 2.09e-05 |
|
2025-08-30 00:07:37 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:07:49 - pico-train - INFO - Step 25700 -- ๐ Training Metrics |
|
2025-08-30 00:07:49 - pico-train - INFO - โโโ Loss: 6.2687 |
|
2025-08-30 00:07:49 - pico-train - INFO - โโโ Learning Rate: 2.08e-05 |
|
2025-08-30 00:07:49 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:08:02 - pico-train - INFO - Step 25725 -- ๐ Training Metrics |
|
2025-08-30 00:08:02 - pico-train - INFO - โโโ Loss: 6.2595 |
|
2025-08-30 00:08:02 - pico-train - INFO - โโโ Learning Rate: 2.08e-05 |
|
2025-08-30 00:08:02 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:08:15 - pico-train - INFO - Step 25750 -- ๐ Training Metrics |
|
2025-08-30 00:08:15 - pico-train - INFO - โโโ Loss: 6.2781 |
|
2025-08-30 00:08:15 - pico-train - INFO - โโโ Learning Rate: 2.07e-05 |
|
2025-08-30 00:08:15 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:08:27 - pico-train - INFO - Step 25775 -- ๐ Training Metrics |
|
2025-08-30 00:08:27 - pico-train - INFO - โโโ Loss: 6.2089 |
|
2025-08-30 00:08:27 - pico-train - INFO - โโโ Learning Rate: 2.07e-05 |
|
2025-08-30 00:08:27 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:08:40 - pico-train - INFO - Step 25800 -- ๐ Training Metrics |
|
2025-08-30 00:08:40 - pico-train - INFO - โโโ Loss: 6.2729 |
|
2025-08-30 00:08:40 - pico-train - INFO - โโโ Learning Rate: 2.06e-05 |
|
2025-08-30 00:08:40 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:08:53 - pico-train - INFO - Step 25825 -- ๐ Training Metrics |
|
2025-08-30 00:08:53 - pico-train - INFO - โโโ Loss: 6.2478 |
|
2025-08-30 00:08:53 - pico-train - INFO - โโโ Learning Rate: 2.05e-05 |
|
2025-08-30 00:08:53 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:09:05 - pico-train - INFO - Step 25850 -- ๐ Training Metrics |
|
2025-08-30 00:09:05 - pico-train - INFO - โโโ Loss: 6.2238 |
|
2025-08-30 00:09:05 - pico-train - INFO - โโโ Learning Rate: 2.05e-05 |
|
2025-08-30 00:09:05 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:09:18 - pico-train - INFO - Step 25875 -- ๐ Training Metrics |
|
2025-08-30 00:09:18 - pico-train - INFO - โโโ Loss: 6.2437 |
|
2025-08-30 00:09:18 - pico-train - INFO - โโโ Learning Rate: 2.04e-05 |
|
2025-08-30 00:09:18 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:09:31 - pico-train - INFO - Step 25900 -- ๐ Training Metrics |
|
2025-08-30 00:09:31 - pico-train - INFO - โโโ Loss: 6.2743 |
|
2025-08-30 00:09:31 - pico-train - INFO - โโโ Learning Rate: 2.04e-05 |
|
2025-08-30 00:09:31 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:09:43 - pico-train - INFO - Step 25925 -- ๐ Training Metrics |
|
2025-08-30 00:09:43 - pico-train - INFO - โโโ Loss: 6.2143 |
|
2025-08-30 00:09:43 - pico-train - INFO - โโโ Learning Rate: 2.03e-05 |
|
2025-08-30 00:09:43 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:09:56 - pico-train - INFO - Step 25950 -- ๐ Training Metrics |
|
2025-08-30 00:09:56 - pico-train - INFO - โโโ Loss: 6.1636 |
|
2025-08-30 00:09:56 - pico-train - INFO - โโโ Learning Rate: 2.02e-05 |
|
2025-08-30 00:09:56 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:10:08 - pico-train - INFO - Step 25975 -- ๐ Training Metrics |
|
2025-08-30 00:10:08 - pico-train - INFO - โโโ Loss: 6.2028 |
|
2025-08-30 00:10:08 - pico-train - INFO - โโโ Learning Rate: 2.02e-05 |
|
2025-08-30 00:10:08 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:10:21 - pico-train - INFO - Step 26000 -- ๐พ Saving Checkpoint |
|
2025-08-30 00:12:22 - pico-train - INFO - Step 26000 -- ๐ Evaluation Results |
|
2025-08-30 00:12:22 - pico-train - INFO - โโโ paloma: 5.374017629915336e+25 |
|
2025-08-30 00:12:25 - pico-train - INFO - Step 26000 -- ๐ Training Metrics |
|
2025-08-30 00:12:25 - pico-train - INFO - โโโ Loss: 6.3023 |
|
2025-08-30 00:12:25 - pico-train - INFO - โโโ Learning Rate: 2.01e-05 |
|
2025-08-30 00:12:25 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:12:25 - pico-train - INFO - Step 26000 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:12:40 - pico-train - INFO - Step 26025 -- ๐ Training Metrics |
|
2025-08-30 00:12:40 - pico-train - INFO - โโโ Loss: 6.2060 |
|
2025-08-30 00:12:40 - pico-train - INFO - โโโ Learning Rate: 2.01e-05 |
|
2025-08-30 00:12:40 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:12:52 - pico-train - INFO - Step 26050 -- ๐ Training Metrics |
|
2025-08-30 00:12:52 - pico-train - INFO - โโโ Loss: 6.2001 |
|
2025-08-30 00:12:52 - pico-train - INFO - โโโ Learning Rate: 2.00e-05 |
|
2025-08-30 00:12:52 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:13:05 - pico-train - INFO - Step 26075 -- ๐ Training Metrics |
|
2025-08-30 00:13:05 - pico-train - INFO - โโโ Loss: 6.2546 |
|
2025-08-30 00:13:05 - pico-train - INFO - โโโ Learning Rate: 1.99e-05 |
|
2025-08-30 00:13:05 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:13:18 - pico-train - INFO - Step 26100 -- ๐ Training Metrics |
|
2025-08-30 00:13:18 - pico-train - INFO - โโโ Loss: 6.1986 |
|
2025-08-30 00:13:18 - pico-train - INFO - โโโ Learning Rate: 1.99e-05 |
|
2025-08-30 00:13:18 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:13:32 - pico-train - INFO - Step 26125 -- ๐ Training Metrics |
|
2025-08-30 00:13:32 - pico-train - INFO - โโโ Loss: 6.2415 |
|
2025-08-30 00:13:32 - pico-train - INFO - โโโ Learning Rate: 1.98e-05 |
|
2025-08-30 00:13:32 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:13:44 - pico-train - INFO - Step 26150 -- ๐ Training Metrics |
|
2025-08-30 00:13:44 - pico-train - INFO - โโโ Loss: 6.2411 |
|
2025-08-30 00:13:44 - pico-train - INFO - โโโ Learning Rate: 1.98e-05 |
|
2025-08-30 00:13:44 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:13:57 - pico-train - INFO - Step 26175 -- ๐ Training Metrics |
|
2025-08-30 00:13:57 - pico-train - INFO - โโโ Loss: 6.1756 |
|
2025-08-30 00:13:57 - pico-train - INFO - โโโ Learning Rate: 1.97e-05 |
|
2025-08-30 00:13:57 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:14:10 - pico-train - INFO - Step 26200 -- ๐ Training Metrics |
|
2025-08-30 00:14:10 - pico-train - INFO - โโโ Loss: 6.1444 |
|
2025-08-30 00:14:10 - pico-train - INFO - โโโ Learning Rate: 1.96e-05 |
|
2025-08-30 00:14:10 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:14:22 - pico-train - INFO - Step 26225 -- ๐ Training Metrics |
|
2025-08-30 00:14:22 - pico-train - INFO - โโโ Loss: 6.3335 |
|
2025-08-30 00:14:22 - pico-train - INFO - โโโ Learning Rate: 1.96e-05 |
|
2025-08-30 00:14:22 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:14:35 - pico-train - INFO - Step 26250 -- ๐ Training Metrics |
|
2025-08-30 00:14:35 - pico-train - INFO - โโโ Loss: 6.1491 |
|
2025-08-30 00:14:35 - pico-train - INFO - โโโ Learning Rate: 1.95e-05 |
|
2025-08-30 00:14:35 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:14:48 - pico-train - INFO - Step 26275 -- ๐ Training Metrics |
|
2025-08-30 00:14:48 - pico-train - INFO - โโโ Loss: 6.1959 |
|
2025-08-30 00:14:48 - pico-train - INFO - โโโ Learning Rate: 1.95e-05 |
|
2025-08-30 00:14:48 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:15:00 - pico-train - INFO - Step 26300 -- ๐ Training Metrics |
|
2025-08-30 00:15:00 - pico-train - INFO - โโโ Loss: 6.2494 |
|
2025-08-30 00:15:00 - pico-train - INFO - โโโ Learning Rate: 1.94e-05 |
|
2025-08-30 00:15:00 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:15:13 - pico-train - INFO - Step 26325 -- ๐ Training Metrics |
|
2025-08-30 00:15:13 - pico-train - INFO - โโโ Loss: 6.2893 |
|
2025-08-30 00:15:13 - pico-train - INFO - โโโ Learning Rate: 1.93e-05 |
|
2025-08-30 00:15:13 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:15:26 - pico-train - INFO - Step 26350 -- ๐ Training Metrics |
|
2025-08-30 00:15:26 - pico-train - INFO - โโโ Loss: 6.2732 |
|
2025-08-30 00:15:26 - pico-train - INFO - โโโ Learning Rate: 1.93e-05 |
|
2025-08-30 00:15:26 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:15:38 - pico-train - INFO - Step 26375 -- ๐ Training Metrics |
|
2025-08-30 00:15:38 - pico-train - INFO - โโโ Loss: 6.2804 |
|
2025-08-30 00:15:38 - pico-train - INFO - โโโ Learning Rate: 1.92e-05 |
|
2025-08-30 00:15:38 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:15:51 - pico-train - INFO - Step 26400 -- ๐ Training Metrics |
|
2025-08-30 00:15:51 - pico-train - INFO - โโโ Loss: 6.2117 |
|
2025-08-30 00:15:51 - pico-train - INFO - โโโ Learning Rate: 1.92e-05 |
|
2025-08-30 00:15:51 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:16:04 - pico-train - INFO - Step 26425 -- ๐ Training Metrics |
|
2025-08-30 00:16:04 - pico-train - INFO - โโโ Loss: 6.2055 |
|
2025-08-30 00:16:04 - pico-train - INFO - โโโ Learning Rate: 1.91e-05 |
|
2025-08-30 00:16:04 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:16:17 - pico-train - INFO - Step 26450 -- ๐ Training Metrics |
|
2025-08-30 00:16:17 - pico-train - INFO - โโโ Loss: 6.3085 |
|
2025-08-30 00:16:17 - pico-train - INFO - โโโ Learning Rate: 1.90e-05 |
|
2025-08-30 00:16:17 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:16:29 - pico-train - INFO - Step 26475 -- ๐ Training Metrics |
|
2025-08-30 00:16:29 - pico-train - INFO - โโโ Loss: 6.1870 |
|
2025-08-30 00:16:29 - pico-train - INFO - โโโ Learning Rate: 1.90e-05 |
|
2025-08-30 00:16:29 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:16:41 - pico-train - INFO - Step 26500 -- ๐พ Saving Checkpoint |
|
2025-08-30 00:18:38 - pico-train - INFO - Step 26500 -- ๐ Evaluation Results |
|
2025-08-30 00:18:38 - pico-train - INFO - โโโ paloma: 7.002764153086805e+25 |
|
2025-08-30 00:18:39 - pico-train - INFO - Step 26500 -- ๐ Training Metrics |
|
2025-08-30 00:18:39 - pico-train - INFO - โโโ Loss: 6.2219 |
|
2025-08-30 00:18:39 - pico-train - INFO - โโโ Learning Rate: 1.89e-05 |
|
2025-08-30 00:18:39 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:18:39 - pico-train - INFO - Step 26500 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:18:54 - pico-train - INFO - Step 26525 -- ๐ Training Metrics |
|
2025-08-30 00:18:54 - pico-train - INFO - โโโ Loss: 6.1945 |
|
2025-08-30 00:18:54 - pico-train - INFO - โโโ Learning Rate: 1.89e-05 |
|
2025-08-30 00:18:54 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:19:07 - pico-train - INFO - Step 26550 -- ๐ Training Metrics |
|
2025-08-30 00:19:07 - pico-train - INFO - โโโ Loss: 6.1917 |
|
2025-08-30 00:19:07 - pico-train - INFO - โโโ Learning Rate: 1.88e-05 |
|
2025-08-30 00:19:07 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:19:20 - pico-train - INFO - Step 26575 -- ๐ Training Metrics |
|
2025-08-30 00:19:20 - pico-train - INFO - โโโ Loss: 6.1611 |
|
2025-08-30 00:19:20 - pico-train - INFO - โโโ Learning Rate: 1.87e-05 |
|
2025-08-30 00:19:20 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:19:32 - pico-train - INFO - Step 26600 -- ๐ Training Metrics |
|
2025-08-30 00:19:32 - pico-train - INFO - โโโ Loss: 6.2254 |
|
2025-08-30 00:19:32 - pico-train - INFO - โโโ Learning Rate: 1.87e-05 |
|
2025-08-30 00:19:32 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:19:45 - pico-train - INFO - Step 26625 -- ๐ Training Metrics |
|
2025-08-30 00:19:45 - pico-train - INFO - โโโ Loss: 6.2633 |
|
2025-08-30 00:19:45 - pico-train - INFO - โโโ Learning Rate: 1.86e-05 |
|
2025-08-30 00:19:45 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:19:58 - pico-train - INFO - Step 26650 -- ๐ Training Metrics |
|
2025-08-30 00:19:58 - pico-train - INFO - โโโ Loss: 6.2096 |
|
2025-08-30 00:19:58 - pico-train - INFO - โโโ Learning Rate: 1.86e-05 |
|
2025-08-30 00:19:58 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:20:10 - pico-train - INFO - Step 26675 -- ๐ Training Metrics |
|
2025-08-30 00:20:10 - pico-train - INFO - โโโ Loss: 6.2665 |
|
2025-08-30 00:20:10 - pico-train - INFO - โโโ Learning Rate: 1.85e-05 |
|
2025-08-30 00:20:10 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:20:23 - pico-train - INFO - Step 26700 -- ๐ Training Metrics |
|
2025-08-30 00:20:23 - pico-train - INFO - โโโ Loss: 6.2534 |
|
2025-08-30 00:20:23 - pico-train - INFO - โโโ Learning Rate: 1.85e-05 |
|
2025-08-30 00:20:23 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:20:36 - pico-train - INFO - Step 26725 -- ๐ Training Metrics |
|
2025-08-30 00:20:36 - pico-train - INFO - โโโ Loss: 6.2207 |
|
2025-08-30 00:20:36 - pico-train - INFO - โโโ Learning Rate: 1.84e-05 |
|
2025-08-30 00:20:36 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:20:48 - pico-train - INFO - Step 26750 -- ๐ Training Metrics |
|
2025-08-30 00:20:48 - pico-train - INFO - โโโ Loss: 6.2923 |
|
2025-08-30 00:20:48 - pico-train - INFO - โโโ Learning Rate: 1.83e-05 |
|
2025-08-30 00:20:48 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:21:01 - pico-train - INFO - Step 26775 -- ๐ Training Metrics |
|
2025-08-30 00:21:01 - pico-train - INFO - โโโ Loss: 6.2678 |
|
2025-08-30 00:21:01 - pico-train - INFO - โโโ Learning Rate: 1.83e-05 |
|
2025-08-30 00:21:01 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:21:14 - pico-train - INFO - Step 26800 -- ๐ Training Metrics |
|
2025-08-30 00:21:14 - pico-train - INFO - โโโ Loss: 6.2139 |
|
2025-08-30 00:21:14 - pico-train - INFO - โโโ Learning Rate: 1.82e-05 |
|
2025-08-30 00:21:14 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:21:26 - pico-train - INFO - Step 26825 -- ๐ Training Metrics |
|
2025-08-30 00:21:26 - pico-train - INFO - โโโ Loss: 6.1680 |
|
2025-08-30 00:21:26 - pico-train - INFO - โโโ Learning Rate: 1.82e-05 |
|
2025-08-30 00:21:26 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:21:39 - pico-train - INFO - Step 26850 -- ๐ Training Metrics |
|
2025-08-30 00:21:39 - pico-train - INFO - โโโ Loss: 6.1858 |
|
2025-08-30 00:21:39 - pico-train - INFO - โโโ Learning Rate: 1.81e-05 |
|
2025-08-30 00:21:39 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:21:52 - pico-train - INFO - Step 26875 -- ๐ Training Metrics |
|
2025-08-30 00:21:52 - pico-train - INFO - โโโ Loss: 6.1172 |
|
2025-08-30 00:21:52 - pico-train - INFO - โโโ Learning Rate: 1.80e-05 |
|
2025-08-30 00:21:52 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:22:05 - pico-train - INFO - Step 26900 -- ๐ Training Metrics |
|
2025-08-30 00:22:05 - pico-train - INFO - โโโ Loss: 6.2332 |
|
2025-08-30 00:22:05 - pico-train - INFO - โโโ Learning Rate: 1.80e-05 |
|
2025-08-30 00:22:05 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:22:17 - pico-train - INFO - Step 26925 -- ๐ Training Metrics |
|
2025-08-30 00:22:17 - pico-train - INFO - โโโ Loss: 6.2099 |
|
2025-08-30 00:22:17 - pico-train - INFO - โโโ Learning Rate: 1.79e-05 |
|
2025-08-30 00:22:17 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:22:30 - pico-train - INFO - Step 26950 -- ๐ Training Metrics |
|
2025-08-30 00:22:30 - pico-train - INFO - โโโ Loss: 6.2551 |
|
2025-08-30 00:22:30 - pico-train - INFO - โโโ Learning Rate: 1.79e-05 |
|
2025-08-30 00:22:30 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:22:43 - pico-train - INFO - Step 26975 -- ๐ Training Metrics |
|
2025-08-30 00:22:43 - pico-train - INFO - โโโ Loss: 6.2033 |
|
2025-08-30 00:22:43 - pico-train - INFO - โโโ Learning Rate: 1.78e-05 |
|
2025-08-30 00:22:43 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:22:55 - pico-train - INFO - Step 27000 -- ๐พ Saving Checkpoint |
|
2025-08-30 00:24:53 - pico-train - INFO - Step 27000 -- ๐ Evaluation Results |
|
2025-08-30 00:24:53 - pico-train - INFO - โโโ paloma: 7.722641414937935e+25 |
|
2025-08-30 00:24:55 - pico-train - INFO - Step 27000 -- ๐ Training Metrics |
|
2025-08-30 00:24:55 - pico-train - INFO - โโโ Loss: 6.2512 |
|
2025-08-30 00:24:55 - pico-train - INFO - โโโ Learning Rate: 1.77e-05 |
|
2025-08-30 00:24:55 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:24:55 - pico-train - INFO - Step 27000 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:25:09 - pico-train - INFO - Step 27025 -- ๐ Training Metrics |
|
2025-08-30 00:25:09 - pico-train - INFO - โโโ Loss: 6.2686 |
|
2025-08-30 00:25:09 - pico-train - INFO - โโโ Learning Rate: 1.77e-05 |
|
2025-08-30 00:25:09 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:25:22 - pico-train - INFO - Step 27050 -- ๐ Training Metrics |
|
2025-08-30 00:25:22 - pico-train - INFO - โโโ Loss: 6.1854 |
|
2025-08-30 00:25:22 - pico-train - INFO - โโโ Learning Rate: 1.76e-05 |
|
2025-08-30 00:25:22 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:25:35 - pico-train - INFO - Step 27075 -- ๐ Training Metrics |
|
2025-08-30 00:25:35 - pico-train - INFO - โโโ Loss: 6.1974 |
|
2025-08-30 00:25:35 - pico-train - INFO - โโโ Learning Rate: 1.76e-05 |
|
2025-08-30 00:25:35 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:25:47 - pico-train - INFO - Step 27100 -- ๐ Training Metrics |
|
2025-08-30 00:25:47 - pico-train - INFO - โโโ Loss: 6.2597 |
|
2025-08-30 00:25:47 - pico-train - INFO - โโโ Learning Rate: 1.75e-05 |
|
2025-08-30 00:25:47 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:26:00 - pico-train - INFO - Step 27125 -- ๐ Training Metrics |
|
2025-08-30 00:26:00 - pico-train - INFO - โโโ Loss: 6.2280 |
|
2025-08-30 00:26:00 - pico-train - INFO - โโโ Learning Rate: 1.74e-05 |
|
2025-08-30 00:26:00 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:26:13 - pico-train - INFO - Step 27150 -- ๐ Training Metrics |
|
2025-08-30 00:26:13 - pico-train - INFO - โโโ Loss: 6.2126 |
|
2025-08-30 00:26:13 - pico-train - INFO - โโโ Learning Rate: 1.74e-05 |
|
2025-08-30 00:26:13 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:26:26 - pico-train - INFO - Step 27175 -- ๐ Training Metrics |
|
2025-08-30 00:26:26 - pico-train - INFO - โโโ Loss: 6.2233 |
|
2025-08-30 00:26:26 - pico-train - INFO - โโโ Learning Rate: 1.73e-05 |
|
2025-08-30 00:26:26 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:26:38 - pico-train - INFO - Step 27200 -- ๐ Training Metrics |
|
2025-08-30 00:26:38 - pico-train - INFO - โโโ Loss: 6.1393 |
|
2025-08-30 00:26:38 - pico-train - INFO - โโโ Learning Rate: 1.73e-05 |
|
2025-08-30 00:26:38 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:26:51 - pico-train - INFO - Step 27225 -- ๐ Training Metrics |
|
2025-08-30 00:26:51 - pico-train - INFO - โโโ Loss: 6.3226 |
|
2025-08-30 00:26:51 - pico-train - INFO - โโโ Learning Rate: 1.72e-05 |
|
2025-08-30 00:26:51 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:27:03 - pico-train - INFO - Step 27250 -- ๐ Training Metrics |
|
2025-08-30 00:27:03 - pico-train - INFO - โโโ Loss: 6.1570 |
|
2025-08-30 00:27:03 - pico-train - INFO - โโโ Learning Rate: 1.72e-05 |
|
2025-08-30 00:27:03 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:27:16 - pico-train - INFO - Step 27275 -- ๐ Training Metrics |
|
2025-08-30 00:27:16 - pico-train - INFO - โโโ Loss: 6.2252 |
|
2025-08-30 00:27:16 - pico-train - INFO - โโโ Learning Rate: 1.71e-05 |
|
2025-08-30 00:27:16 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:27:29 - pico-train - INFO - Step 27300 -- ๐ Training Metrics |
|
2025-08-30 00:27:29 - pico-train - INFO - โโโ Loss: 6.1647 |
|
2025-08-30 00:27:29 - pico-train - INFO - โโโ Learning Rate: 1.70e-05 |
|
2025-08-30 00:27:29 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:27:41 - pico-train - INFO - Step 27325 -- ๐ Training Metrics |
|
2025-08-30 00:27:41 - pico-train - INFO - โโโ Loss: 6.1219 |
|
2025-08-30 00:27:41 - pico-train - INFO - โโโ Learning Rate: 1.70e-05 |
|
2025-08-30 00:27:41 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:27:54 - pico-train - INFO - Step 27350 -- ๐ Training Metrics |
|
2025-08-30 00:27:54 - pico-train - INFO - โโโ Loss: 6.2250 |
|
2025-08-30 00:27:54 - pico-train - INFO - โโโ Learning Rate: 1.69e-05 |
|
2025-08-30 00:27:54 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:28:06 - pico-train - INFO - Step 27375 -- ๐ Training Metrics |
|
2025-08-30 00:28:06 - pico-train - INFO - โโโ Loss: 6.1883 |
|
2025-08-30 00:28:06 - pico-train - INFO - โโโ Learning Rate: 1.69e-05 |
|
2025-08-30 00:28:06 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:28:19 - pico-train - INFO - Step 27400 -- ๐ Training Metrics |
|
2025-08-30 00:28:19 - pico-train - INFO - โโโ Loss: 6.2074 |
|
2025-08-30 00:28:19 - pico-train - INFO - โโโ Learning Rate: 1.68e-05 |
|
2025-08-30 00:28:19 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:28:31 - pico-train - INFO - Step 27425 -- ๐ Training Metrics |
|
2025-08-30 00:28:31 - pico-train - INFO - โโโ Loss: 6.1881 |
|
2025-08-30 00:28:31 - pico-train - INFO - โโโ Learning Rate: 1.68e-05 |
|
2025-08-30 00:28:31 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:28:44 - pico-train - INFO - Step 27450 -- ๐ Training Metrics |
|
2025-08-30 00:28:44 - pico-train - INFO - โโโ Loss: 6.1977 |
|
2025-08-30 00:28:44 - pico-train - INFO - โโโ Learning Rate: 1.67e-05 |
|
2025-08-30 00:28:44 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:28:57 - pico-train - INFO - Step 27475 -- ๐ Training Metrics |
|
2025-08-30 00:28:57 - pico-train - INFO - โโโ Loss: 6.2394 |
|
2025-08-30 00:28:57 - pico-train - INFO - โโโ Learning Rate: 1.66e-05 |
|
2025-08-30 00:28:57 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:29:09 - pico-train - INFO - Step 27500 -- 💾 Saving Checkpoint |
|
2025-08-30 00:31:15 - pico-train - INFO - Step 27500 -- ๐ Evaluation Results |
|
2025-08-30 00:31:15 - pico-train - INFO - โโโ paloma: 1.0733810806931749e+26 |
|
2025-08-30 00:31:19 - pico-train - INFO - Step 27500 -- ๐ Training Metrics |
|
2025-08-30 00:31:19 - pico-train - INFO - โโโ Loss: 6.2657 |
|
2025-08-30 00:31:19 - pico-train - INFO - โโโ Learning Rate: 1.66e-05 |
|
2025-08-30 00:31:19 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:31:19 - pico-train - INFO - Step 27500 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:31:34 - pico-train - INFO - Step 27525 -- ๐ Training Metrics |
|
2025-08-30 00:31:34 - pico-train - INFO - โโโ Loss: 6.1848 |
|
2025-08-30 00:31:34 - pico-train - INFO - โโโ Learning Rate: 1.65e-05 |
|
2025-08-30 00:31:34 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:31:46 - pico-train - INFO - Step 27550 -- ๐ Training Metrics |
|
2025-08-30 00:31:46 - pico-train - INFO - โโโ Loss: 6.1677 |
|
2025-08-30 00:31:46 - pico-train - INFO - โโโ Learning Rate: 1.65e-05 |
|
2025-08-30 00:31:46 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:31:59 - pico-train - INFO - Step 27575 -- ๐ Training Metrics |
|
2025-08-30 00:31:59 - pico-train - INFO - โโโ Loss: 6.2103 |
|
2025-08-30 00:31:59 - pico-train - INFO - โโโ Learning Rate: 1.64e-05 |
|
2025-08-30 00:31:59 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:32:12 - pico-train - INFO - Step 27600 -- ๐ Training Metrics |
|
2025-08-30 00:32:12 - pico-train - INFO - โโโ Loss: 6.2026 |
|
2025-08-30 00:32:12 - pico-train - INFO - โโโ Learning Rate: 1.63e-05 |
|
2025-08-30 00:32:12 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:32:25 - pico-train - INFO - Step 27625 -- ๐ Training Metrics |
|
2025-08-30 00:32:25 - pico-train - INFO - โโโ Loss: 6.1656 |
|
2025-08-30 00:32:25 - pico-train - INFO - โโโ Learning Rate: 1.63e-05 |
|
2025-08-30 00:32:25 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:32:38 - pico-train - INFO - Step 27650 -- ๐ Training Metrics |
|
2025-08-30 00:32:38 - pico-train - INFO - โโโ Loss: 6.1600 |
|
2025-08-30 00:32:38 - pico-train - INFO - โโโ Learning Rate: 1.62e-05 |
|
2025-08-30 00:32:38 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:32:50 - pico-train - INFO - Step 27675 -- ๐ Training Metrics |
|
2025-08-30 00:32:50 - pico-train - INFO - โโโ Loss: 6.2803 |
|
2025-08-30 00:32:50 - pico-train - INFO - โโโ Learning Rate: 1.62e-05 |
|
2025-08-30 00:32:50 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:33:03 - pico-train - INFO - Step 27700 -- ๐ Training Metrics |
|
2025-08-30 00:33:03 - pico-train - INFO - โโโ Loss: 6.2837 |
|
2025-08-30 00:33:03 - pico-train - INFO - โโโ Learning Rate: 1.61e-05 |
|
2025-08-30 00:33:03 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:33:15 - pico-train - INFO - Step 27725 -- ๐ Training Metrics |
|
2025-08-30 00:33:15 - pico-train - INFO - โโโ Loss: 6.1344 |
|
2025-08-30 00:33:15 - pico-train - INFO - โโโ Learning Rate: 1.61e-05 |
|
2025-08-30 00:33:15 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:33:28 - pico-train - INFO - Step 27750 -- ๐ Training Metrics |
|
2025-08-30 00:33:28 - pico-train - INFO - โโโ Loss: 6.2066 |
|
2025-08-30 00:33:28 - pico-train - INFO - โโโ Learning Rate: 1.60e-05 |
|
2025-08-30 00:33:28 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:33:41 - pico-train - INFO - Step 27775 -- ๐ Training Metrics |
|
2025-08-30 00:33:41 - pico-train - INFO - โโโ Loss: 6.1848 |
|
2025-08-30 00:33:41 - pico-train - INFO - โโโ Learning Rate: 1.59e-05 |
|
2025-08-30 00:33:41 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:33:53 - pico-train - INFO - Step 27800 -- ๐ Training Metrics |
|
2025-08-30 00:33:53 - pico-train - INFO - โโโ Loss: 6.2565 |
|
2025-08-30 00:33:53 - pico-train - INFO - โโโ Learning Rate: 1.59e-05 |
|
2025-08-30 00:33:53 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:34:06 - pico-train - INFO - Step 27825 -- ๐ Training Metrics |
|
2025-08-30 00:34:06 - pico-train - INFO - โโโ Loss: 6.2278 |
|
2025-08-30 00:34:06 - pico-train - INFO - โโโ Learning Rate: 1.58e-05 |
|
2025-08-30 00:34:06 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:34:19 - pico-train - INFO - Step 27850 -- ๐ Training Metrics |
|
2025-08-30 00:34:19 - pico-train - INFO - โโโ Loss: 6.2249 |
|
2025-08-30 00:34:19 - pico-train - INFO - โโโ Learning Rate: 1.58e-05 |
|
2025-08-30 00:34:19 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:34:32 - pico-train - INFO - Step 27875 -- ๐ Training Metrics |
|
2025-08-30 00:34:32 - pico-train - INFO - โโโ Loss: 6.1730 |
|
2025-08-30 00:34:32 - pico-train - INFO - โโโ Learning Rate: 1.57e-05 |
|
2025-08-30 00:34:32 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:34:44 - pico-train - INFO - Step 27900 -- ๐ Training Metrics |
|
2025-08-30 00:34:44 - pico-train - INFO - โโโ Loss: 6.1503 |
|
2025-08-30 00:34:44 - pico-train - INFO - โโโ Learning Rate: 1.57e-05 |
|
2025-08-30 00:34:44 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:34:57 - pico-train - INFO - Step 27925 -- ๐ Training Metrics |
|
2025-08-30 00:34:57 - pico-train - INFO - โโโ Loss: 6.1955 |
|
2025-08-30 00:34:57 - pico-train - INFO - โโโ Learning Rate: 1.56e-05 |
|
2025-08-30 00:34:57 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:35:09 - pico-train - INFO - Step 27950 -- ๐ Training Metrics |
|
2025-08-30 00:35:09 - pico-train - INFO - โโโ Loss: 6.1747 |
|
2025-08-30 00:35:09 - pico-train - INFO - โโโ Learning Rate: 1.55e-05 |
|
2025-08-30 00:35:09 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:35:22 - pico-train - INFO - Step 27975 -- ๐ Training Metrics |
|
2025-08-30 00:35:22 - pico-train - INFO - โโโ Loss: 6.2607 |
|
2025-08-30 00:35:22 - pico-train - INFO - โโโ Learning Rate: 1.55e-05 |
|
2025-08-30 00:35:22 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:35:34 - pico-train - INFO - Step 28000 -- 💾 Saving Checkpoint |
|
2025-08-30 00:37:31 - pico-train - INFO - Step 28000 -- ๐ Evaluation Results |
|
2025-08-30 00:37:31 - pico-train - INFO - โโโ paloma: 1.2438803536426585e+26 |
|
2025-08-30 00:37:34 - pico-train - INFO - Step 28000 -- ๐ Training Metrics |
|
2025-08-30 00:37:34 - pico-train - INFO - โโโ Loss: 6.2990 |
|
2025-08-30 00:37:34 - pico-train - INFO - โโโ Learning Rate: 1.54e-05 |
|
2025-08-30 00:37:34 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:37:34 - pico-train - INFO - Step 28000 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:37:49 - pico-train - INFO - Step 28025 -- ๐ Training Metrics |
|
2025-08-30 00:37:49 - pico-train - INFO - โโโ Loss: 6.1938 |
|
2025-08-30 00:37:49 - pico-train - INFO - โโโ Learning Rate: 1.54e-05 |
|
2025-08-30 00:37:49 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:38:01 - pico-train - INFO - Step 28050 -- ๐ Training Metrics |
|
2025-08-30 00:38:01 - pico-train - INFO - โโโ Loss: 6.2467 |
|
2025-08-30 00:38:01 - pico-train - INFO - โโโ Learning Rate: 1.53e-05 |
|
2025-08-30 00:38:01 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:38:14 - pico-train - INFO - Step 28075 -- ๐ Training Metrics |
|
2025-08-30 00:38:14 - pico-train - INFO - โโโ Loss: 6.1609 |
|
2025-08-30 00:38:14 - pico-train - INFO - โโโ Learning Rate: 1.53e-05 |
|
2025-08-30 00:38:14 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:38:26 - pico-train - INFO - Step 28100 -- ๐ Training Metrics |
|
2025-08-30 00:38:26 - pico-train - INFO - โโโ Loss: 6.1691 |
|
2025-08-30 00:38:26 - pico-train - INFO - โโโ Learning Rate: 1.52e-05 |
|
2025-08-30 00:38:26 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:38:39 - pico-train - INFO - Step 28125 -- ๐ Training Metrics |
|
2025-08-30 00:38:39 - pico-train - INFO - โโโ Loss: 6.2517 |
|
2025-08-30 00:38:39 - pico-train - INFO - โโโ Learning Rate: 1.52e-05 |
|
2025-08-30 00:38:39 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:38:52 - pico-train - INFO - Step 28150 -- ๐ Training Metrics |
|
2025-08-30 00:38:52 - pico-train - INFO - โโโ Loss: 6.2758 |
|
2025-08-30 00:38:52 - pico-train - INFO - โโโ Learning Rate: 1.51e-05 |
|
2025-08-30 00:38:52 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:39:05 - pico-train - INFO - Step 28175 -- ๐ Training Metrics |
|
2025-08-30 00:39:05 - pico-train - INFO - โโโ Loss: 6.2979 |
|
2025-08-30 00:39:05 - pico-train - INFO - โโโ Learning Rate: 1.50e-05 |
|
2025-08-30 00:39:05 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:39:17 - pico-train - INFO - Step 28200 -- ๐ Training Metrics |
|
2025-08-30 00:39:17 - pico-train - INFO - โโโ Loss: 6.1294 |
|
2025-08-30 00:39:17 - pico-train - INFO - โโโ Learning Rate: 1.50e-05 |
|
2025-08-30 00:39:17 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:39:30 - pico-train - INFO - Step 28225 -- ๐ Training Metrics |
|
2025-08-30 00:39:30 - pico-train - INFO - โโโ Loss: 6.1557 |
|
2025-08-30 00:39:30 - pico-train - INFO - โโโ Learning Rate: 1.49e-05 |
|
2025-08-30 00:39:30 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:39:43 - pico-train - INFO - Step 28250 -- ๐ Training Metrics |
|
2025-08-30 00:39:43 - pico-train - INFO - โโโ Loss: 6.2283 |
|
2025-08-30 00:39:43 - pico-train - INFO - โโโ Learning Rate: 1.49e-05 |
|
2025-08-30 00:39:43 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:39:56 - pico-train - INFO - Step 28275 -- ๐ Training Metrics |
|
2025-08-30 00:39:56 - pico-train - INFO - โโโ Loss: 6.2104 |
|
2025-08-30 00:39:56 - pico-train - INFO - โโโ Learning Rate: 1.48e-05 |
|
2025-08-30 00:39:56 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:40:08 - pico-train - INFO - Step 28300 -- ๐ Training Metrics |
|
2025-08-30 00:40:08 - pico-train - INFO - โโโ Loss: 6.2633 |
|
2025-08-30 00:40:08 - pico-train - INFO - โโโ Learning Rate: 1.48e-05 |
|
2025-08-30 00:40:08 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:40:21 - pico-train - INFO - Step 28325 -- ๐ Training Metrics |
|
2025-08-30 00:40:21 - pico-train - INFO - โโโ Loss: 6.1844 |
|
2025-08-30 00:40:21 - pico-train - INFO - โโโ Learning Rate: 1.47e-05 |
|
2025-08-30 00:40:21 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:40:34 - pico-train - INFO - Step 28350 -- ๐ Training Metrics |
|
2025-08-30 00:40:34 - pico-train - INFO - โโโ Loss: 6.1349 |
|
2025-08-30 00:40:34 - pico-train - INFO - โโโ Learning Rate: 1.46e-05 |
|
2025-08-30 00:40:34 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:40:46 - pico-train - INFO - Step 28375 -- ๐ Training Metrics |
|
2025-08-30 00:40:46 - pico-train - INFO - โโโ Loss: 6.2638 |
|
2025-08-30 00:40:46 - pico-train - INFO - โโโ Learning Rate: 1.46e-05 |
|
2025-08-30 00:40:46 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:40:59 - pico-train - INFO - Step 28400 -- ๐ Training Metrics |
|
2025-08-30 00:40:59 - pico-train - INFO - โโโ Loss: 6.1960 |
|
2025-08-30 00:40:59 - pico-train - INFO - โโโ Learning Rate: 1.45e-05 |
|
2025-08-30 00:40:59 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:41:11 - pico-train - INFO - Step 28425 -- ๐ Training Metrics |
|
2025-08-30 00:41:11 - pico-train - INFO - โโโ Loss: 6.2582 |
|
2025-08-30 00:41:11 - pico-train - INFO - โโโ Learning Rate: 1.45e-05 |
|
2025-08-30 00:41:11 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:41:24 - pico-train - INFO - Step 28450 -- ๐ Training Metrics |
|
2025-08-30 00:41:24 - pico-train - INFO - โโโ Loss: 6.2071 |
|
2025-08-30 00:41:24 - pico-train - INFO - โโโ Learning Rate: 1.44e-05 |
|
2025-08-30 00:41:24 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:41:37 - pico-train - INFO - Step 28475 -- ๐ Training Metrics |
|
2025-08-30 00:41:37 - pico-train - INFO - โโโ Loss: 6.2106 |
|
2025-08-30 00:41:37 - pico-train - INFO - โโโ Learning Rate: 1.44e-05 |
|
2025-08-30 00:41:37 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:41:49 - pico-train - INFO - Step 28500 -- 💾 Saving Checkpoint |
|
2025-08-30 00:43:48 - pico-train - INFO - Step 28500 -- ๐ Evaluation Results |
|
2025-08-30 00:43:48 - pico-train - INFO - โโโ paloma: 1.3653691992013197e+26 |
|
2025-08-30 00:43:51 - pico-train - INFO - Step 28500 -- ๐ Training Metrics |
|
2025-08-30 00:43:51 - pico-train - INFO - โโโ Loss: 6.2141 |
|
2025-08-30 00:43:51 - pico-train - INFO - โโโ Learning Rate: 1.43e-05 |
|
2025-08-30 00:43:51 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:43:51 - pico-train - INFO - Step 28500 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:44:06 - pico-train - INFO - Step 28525 -- ๐ Training Metrics |
|
2025-08-30 00:44:06 - pico-train - INFO - โโโ Loss: 6.1702 |
|
2025-08-30 00:44:06 - pico-train - INFO - โโโ Learning Rate: 1.43e-05 |
|
2025-08-30 00:44:06 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:44:19 - pico-train - INFO - Step 28550 -- ๐ Training Metrics |
|
2025-08-30 00:44:19 - pico-train - INFO - โโโ Loss: 6.1650 |
|
2025-08-30 00:44:19 - pico-train - INFO - โโโ Learning Rate: 1.42e-05 |
|
2025-08-30 00:44:19 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:44:31 - pico-train - INFO - Step 28575 -- ๐ Training Metrics |
|
2025-08-30 00:44:31 - pico-train - INFO - โโโ Loss: 6.1357 |
|
2025-08-30 00:44:31 - pico-train - INFO - โโโ Learning Rate: 1.41e-05 |
|
2025-08-30 00:44:31 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:44:44 - pico-train - INFO - Step 28600 -- ๐ Training Metrics |
|
2025-08-30 00:44:44 - pico-train - INFO - โโโ Loss: 6.2757 |
|
2025-08-30 00:44:44 - pico-train - INFO - โโโ Learning Rate: 1.41e-05 |
|
2025-08-30 00:44:44 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:44:57 - pico-train - INFO - Step 28625 -- ๐ Training Metrics |
|
2025-08-30 00:44:57 - pico-train - INFO - โโโ Loss: 6.1983 |
|
2025-08-30 00:44:57 - pico-train - INFO - โโโ Learning Rate: 1.40e-05 |
|
2025-08-30 00:44:57 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:45:09 - pico-train - INFO - Step 28650 -- ๐ Training Metrics |
|
2025-08-30 00:45:09 - pico-train - INFO - โโโ Loss: 6.1417 |
|
2025-08-30 00:45:09 - pico-train - INFO - โโโ Learning Rate: 1.40e-05 |
|
2025-08-30 00:45:09 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:45:22 - pico-train - INFO - Step 28675 -- ๐ Training Metrics |
|
2025-08-30 00:45:22 - pico-train - INFO - โโโ Loss: 6.1524 |
|
2025-08-30 00:45:22 - pico-train - INFO - โโโ Learning Rate: 1.39e-05 |
|
2025-08-30 00:45:22 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:45:34 - pico-train - INFO - Step 28700 -- ๐ Training Metrics |
|
2025-08-30 00:45:34 - pico-train - INFO - โโโ Loss: 6.2928 |
|
2025-08-30 00:45:34 - pico-train - INFO - โโโ Learning Rate: 1.39e-05 |
|
2025-08-30 00:45:34 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:45:47 - pico-train - INFO - Step 28725 -- ๐ Training Metrics |
|
2025-08-30 00:45:47 - pico-train - INFO - โโโ Loss: 6.1187 |
|
2025-08-30 00:45:47 - pico-train - INFO - โโโ Learning Rate: 1.38e-05 |
|
2025-08-30 00:45:47 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:46:00 - pico-train - INFO - Step 28750 -- ๐ Training Metrics |
|
2025-08-30 00:46:00 - pico-train - INFO - โโโ Loss: 6.1926 |
|
2025-08-30 00:46:00 - pico-train - INFO - โโโ Learning Rate: 1.38e-05 |
|
2025-08-30 00:46:00 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:46:12 - pico-train - INFO - Step 28775 -- ๐ Training Metrics |
|
2025-08-30 00:46:12 - pico-train - INFO - โโโ Loss: 6.1810 |
|
2025-08-30 00:46:12 - pico-train - INFO - โโโ Learning Rate: 1.37e-05 |
|
2025-08-30 00:46:12 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:46:25 - pico-train - INFO - Step 28800 -- ๐ Training Metrics |
|
2025-08-30 00:46:25 - pico-train - INFO - โโโ Loss: 6.1615 |
|
2025-08-30 00:46:25 - pico-train - INFO - โโโ Learning Rate: 1.37e-05 |
|
2025-08-30 00:46:25 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:46:37 - pico-train - INFO - Step 28825 -- ๐ Training Metrics |
|
2025-08-30 00:46:37 - pico-train - INFO - โโโ Loss: 6.1871 |
|
2025-08-30 00:46:37 - pico-train - INFO - โโโ Learning Rate: 1.36e-05 |
|
2025-08-30 00:46:37 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:46:50 - pico-train - INFO - Step 28850 -- ๐ Training Metrics |
|
2025-08-30 00:46:50 - pico-train - INFO - โโโ Loss: 6.1287 |
|
2025-08-30 00:46:50 - pico-train - INFO - โโโ Learning Rate: 1.35e-05 |
|
2025-08-30 00:46:50 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:47:02 - pico-train - INFO - Step 28875 -- ๐ Training Metrics |
|
2025-08-30 00:47:02 - pico-train - INFO - โโโ Loss: 6.1008 |
|
2025-08-30 00:47:02 - pico-train - INFO - โโโ Learning Rate: 1.35e-05 |
|
2025-08-30 00:47:02 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:47:15 - pico-train - INFO - Step 28900 -- ๐ Training Metrics |
|
2025-08-30 00:47:15 - pico-train - INFO - โโโ Loss: 6.2167 |
|
2025-08-30 00:47:15 - pico-train - INFO - โโโ Learning Rate: 1.34e-05 |
|
2025-08-30 00:47:15 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:47:28 - pico-train - INFO - Step 28925 -- ๐ Training Metrics |
|
2025-08-30 00:47:28 - pico-train - INFO - โโโ Loss: 6.1657 |
|
2025-08-30 00:47:28 - pico-train - INFO - โโโ Learning Rate: 1.34e-05 |
|
2025-08-30 00:47:28 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:47:40 - pico-train - INFO - Step 28950 -- ๐ Training Metrics |
|
2025-08-30 00:47:40 - pico-train - INFO - โโโ Loss: 6.2003 |
|
2025-08-30 00:47:40 - pico-train - INFO - โโโ Learning Rate: 1.33e-05 |
|
2025-08-30 00:47:40 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:47:53 - pico-train - INFO - Step 28975 -- ๐ Training Metrics |
|
2025-08-30 00:47:53 - pico-train - INFO - โโโ Loss: 6.2189 |
|
2025-08-30 00:47:53 - pico-train - INFO - โโโ Learning Rate: 1.33e-05 |
|
2025-08-30 00:47:53 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:48:05 - pico-train - INFO - Step 29000 -- 💾 Saving Checkpoint |
|
2025-08-30 00:50:04 - pico-train - INFO - Step 29000 -- ๐ Evaluation Results |
|
2025-08-30 00:50:04 - pico-train - INFO - โโโ paloma: 1.4417132887690374e+26 |
|
2025-08-30 00:50:06 - pico-train - INFO - Step 29000 -- ๐ Training Metrics |
|
2025-08-30 00:50:06 - pico-train - INFO - โโโ Loss: 6.1592 |
|
2025-08-30 00:50:06 - pico-train - INFO - โโโ Learning Rate: 1.32e-05 |
|
2025-08-30 00:50:06 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:50:06 - pico-train - INFO - Step 29000 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:50:22 - pico-train - INFO - Step 29025 -- ๐ Training Metrics |
|
2025-08-30 00:50:22 - pico-train - INFO - โโโ Loss: 6.2133 |
|
2025-08-30 00:50:22 - pico-train - INFO - โโโ Learning Rate: 1.32e-05 |
|
2025-08-30 00:50:22 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:50:35 - pico-train - INFO - Step 29050 -- ๐ Training Metrics |
|
2025-08-30 00:50:35 - pico-train - INFO - โโโ Loss: 6.1536 |
|
2025-08-30 00:50:35 - pico-train - INFO - โโโ Learning Rate: 1.31e-05 |
|
2025-08-30 00:50:35 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:50:47 - pico-train - INFO - Step 29075 -- ๐ Training Metrics |
|
2025-08-30 00:50:47 - pico-train - INFO - โโโ Loss: 6.1872 |
|
2025-08-30 00:50:47 - pico-train - INFO - โโโ Learning Rate: 1.31e-05 |
|
2025-08-30 00:50:47 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:51:00 - pico-train - INFO - Step 29100 -- ๐ Training Metrics |
|
2025-08-30 00:51:00 - pico-train - INFO - โโโ Loss: 6.1469 |
|
2025-08-30 00:51:00 - pico-train - INFO - โโโ Learning Rate: 1.30e-05 |
|
2025-08-30 00:51:00 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:51:13 - pico-train - INFO - Step 29125 -- ๐ Training Metrics |
|
2025-08-30 00:51:13 - pico-train - INFO - โโโ Loss: 6.2113 |
|
2025-08-30 00:51:13 - pico-train - INFO - โโโ Learning Rate: 1.29e-05 |
|
2025-08-30 00:51:13 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:51:26 - pico-train - INFO - Step 29150 -- ๐ Training Metrics |
|
2025-08-30 00:51:26 - pico-train - INFO - โโโ Loss: 6.1172 |
|
2025-08-30 00:51:26 - pico-train - INFO - โโโ Learning Rate: 1.29e-05 |
|
2025-08-30 00:51:26 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:51:38 - pico-train - INFO - Step 29175 -- ๐ Training Metrics |
|
2025-08-30 00:51:38 - pico-train - INFO - โโโ Loss: 6.1350 |
|
2025-08-30 00:51:38 - pico-train - INFO - โโโ Learning Rate: 1.28e-05 |
|
2025-08-30 00:51:38 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:51:51 - pico-train - INFO - Step 29200 -- ๐ Training Metrics |
|
2025-08-30 00:51:51 - pico-train - INFO - โโโ Loss: 6.2083 |
|
2025-08-30 00:51:51 - pico-train - INFO - โโโ Learning Rate: 1.28e-05 |
|
2025-08-30 00:51:51 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:52:03 - pico-train - INFO - Step 29225 -- ๐ Training Metrics |
|
2025-08-30 00:52:03 - pico-train - INFO - โโโ Loss: 6.3192 |
|
2025-08-30 00:52:03 - pico-train - INFO - โโโ Learning Rate: 1.27e-05 |
|
2025-08-30 00:52:03 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:52:16 - pico-train - INFO - Step 29250 -- ๐ Training Metrics |
|
2025-08-30 00:52:16 - pico-train - INFO - โโโ Loss: 6.1807 |
|
2025-08-30 00:52:16 - pico-train - INFO - โโโ Learning Rate: 1.27e-05 |
|
2025-08-30 00:52:16 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:52:29 - pico-train - INFO - Step 29275 -- ๐ Training Metrics |
|
2025-08-30 00:52:29 - pico-train - INFO - โโโ Loss: 6.1737 |
|
2025-08-30 00:52:29 - pico-train - INFO - โโโ Learning Rate: 1.26e-05 |
|
2025-08-30 00:52:29 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:52:41 - pico-train - INFO - Step 29300 -- ๐ Training Metrics |
|
2025-08-30 00:52:41 - pico-train - INFO - โโโ Loss: 6.0887 |
|
2025-08-30 00:52:41 - pico-train - INFO - โโโ Learning Rate: 1.26e-05 |
|
2025-08-30 00:52:41 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:52:54 - pico-train - INFO - Step 29325 -- ๐ Training Metrics |
|
2025-08-30 00:52:54 - pico-train - INFO - โโโ Loss: 6.2875 |
|
2025-08-30 00:52:54 - pico-train - INFO - โโโ Learning Rate: 1.25e-05 |
|
2025-08-30 00:52:54 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:53:06 - pico-train - INFO - Step 29350 -- ๐ Training Metrics |
|
2025-08-30 00:53:06 - pico-train - INFO - โโโ Loss: 6.2426 |
|
2025-08-30 00:53:06 - pico-train - INFO - โโโ Learning Rate: 1.25e-05 |
|
2025-08-30 00:53:06 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:53:19 - pico-train - INFO - Step 29375 -- ๐ Training Metrics |
|
2025-08-30 00:53:19 - pico-train - INFO - โโโ Loss: 6.1058 |
|
2025-08-30 00:53:19 - pico-train - INFO - โโโ Learning Rate: 1.24e-05 |
|
2025-08-30 00:53:19 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:53:32 - pico-train - INFO - Step 29400 -- ๐ Training Metrics |
|
2025-08-30 00:53:32 - pico-train - INFO - โโโ Loss: 6.1215 |
|
2025-08-30 00:53:32 - pico-train - INFO - โโโ Learning Rate: 1.24e-05 |
|
2025-08-30 00:53:32 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:53:44 - pico-train - INFO - Step 29425 -- ๐ Training Metrics |
|
2025-08-30 00:53:44 - pico-train - INFO - โโโ Loss: 6.2543 |
|
2025-08-30 00:53:44 - pico-train - INFO - โโโ Learning Rate: 1.23e-05 |
|
2025-08-30 00:53:44 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:53:57 - pico-train - INFO - Step 29450 -- ๐ Training Metrics |
|
2025-08-30 00:53:57 - pico-train - INFO - โโโ Loss: 6.1715 |
|
2025-08-30 00:53:57 - pico-train - INFO - โโโ Learning Rate: 1.23e-05 |
|
2025-08-30 00:53:57 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:54:09 - pico-train - INFO - Step 29475 -- ๐ Training Metrics |
|
2025-08-30 00:54:09 - pico-train - INFO - โโโ Loss: 6.1795 |
|
2025-08-30 00:54:09 - pico-train - INFO - โโโ Learning Rate: 1.22e-05 |
|
2025-08-30 00:54:09 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:54:21 - pico-train - INFO - Step 29500 -- 💾 Saving Checkpoint |
|
2025-08-30 00:56:18 - pico-train - INFO - Step 29500 -- ๐ Evaluation Results |
|
2025-08-30 00:56:18 - pico-train - INFO - โโโ paloma: 1.7095266725777237e+26 |
|
2025-08-30 00:56:21 - pico-train - INFO - Step 29500 -- ๐ Training Metrics |
|
2025-08-30 00:56:21 - pico-train - INFO - โโโ Loss: 6.1663 |
|
2025-08-30 00:56:21 - pico-train - INFO - โโโ Learning Rate: 1.21e-05 |
|
2025-08-30 00:56:21 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:56:21 - pico-train - INFO - Step 29500 -- ๐ Saving Learning Dynamics |
|
2025-08-30 00:56:36 - pico-train - INFO - Step 29525 -- ๐ Training Metrics |
|
2025-08-30 00:56:36 - pico-train - INFO - โโโ Loss: 6.1521 |
|
2025-08-30 00:56:36 - pico-train - INFO - โโโ Learning Rate: 1.21e-05 |
|
2025-08-30 00:56:36 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:56:48 - pico-train - INFO - Step 29550 -- ๐ Training Metrics |
|
2025-08-30 00:56:48 - pico-train - INFO - โโโ Loss: 6.0880 |
|
2025-08-30 00:56:48 - pico-train - INFO - โโโ Learning Rate: 1.20e-05 |
|
2025-08-30 00:56:48 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:57:01 - pico-train - INFO - Step 29575 -- ๐ Training Metrics |
|
2025-08-30 00:57:01 - pico-train - INFO - โโโ Loss: 6.1806 |
|
2025-08-30 00:57:01 - pico-train - INFO - โโโ Learning Rate: 1.20e-05 |
|
2025-08-30 00:57:01 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:57:13 - pico-train - INFO - Step 29600 -- ๐ Training Metrics |
|
2025-08-30 00:57:13 - pico-train - INFO - โโโ Loss: 6.3067 |
|
2025-08-30 00:57:13 - pico-train - INFO - โโโ Learning Rate: 1.19e-05 |
|
2025-08-30 00:57:13 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:57:27 - pico-train - INFO - Step 29625 -- ๐ Training Metrics |
|
2025-08-30 00:57:27 - pico-train - INFO - โโโ Loss: 6.2586 |
|
2025-08-30 00:57:27 - pico-train - INFO - โโโ Learning Rate: 1.19e-05 |
|
2025-08-30 00:57:27 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:57:40 - pico-train - INFO - Step 29650 -- ๐ Training Metrics |
|
2025-08-30 00:57:40 - pico-train - INFO - โโโ Loss: 6.1478 |
|
2025-08-30 00:57:40 - pico-train - INFO - โโโ Learning Rate: 1.18e-05 |
|
2025-08-30 00:57:40 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:57:52 - pico-train - INFO - Step 29675 -- ๐ Training Metrics |
|
2025-08-30 00:57:52 - pico-train - INFO - โโโ Loss: 6.1101 |
|
2025-08-30 00:57:52 - pico-train - INFO - โโโ Learning Rate: 1.18e-05 |
|
2025-08-30 00:57:52 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:58:05 - pico-train - INFO - Step 29700 -- ๐ Training Metrics |
|
2025-08-30 00:58:05 - pico-train - INFO - โโโ Loss: 6.1873 |
|
2025-08-30 00:58:05 - pico-train - INFO - โโโ Learning Rate: 1.17e-05 |
|
2025-08-30 00:58:05 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:58:17 - pico-train - INFO - Step 29725 -- ๐ Training Metrics |
|
2025-08-30 00:58:17 - pico-train - INFO - โโโ Loss: 6.0894 |
|
2025-08-30 00:58:17 - pico-train - INFO - โโโ Learning Rate: 1.17e-05 |
|
2025-08-30 00:58:17 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:58:30 - pico-train - INFO - Step 29750 -- ๐ Training Metrics |
|
2025-08-30 00:58:30 - pico-train - INFO - โโโ Loss: 6.1793 |
|
2025-08-30 00:58:30 - pico-train - INFO - โโโ Learning Rate: 1.16e-05 |
|
2025-08-30 00:58:30 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:58:42 - pico-train - INFO - Step 29775 -- ๐ Training Metrics |
|
2025-08-30 00:58:42 - pico-train - INFO - โโโ Loss: 6.1858 |
|
2025-08-30 00:58:42 - pico-train - INFO - โโโ Learning Rate: 1.16e-05 |
|
2025-08-30 00:58:42 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:58:55 - pico-train - INFO - Step 29800 -- ๐ Training Metrics |
|
2025-08-30 00:58:55 - pico-train - INFO - โโโ Loss: 6.1729 |
|
2025-08-30 00:58:55 - pico-train - INFO - โโโ Learning Rate: 1.15e-05 |
|
2025-08-30 00:58:55 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:59:07 - pico-train - INFO - Step 29825 -- ๐ Training Metrics |
|
2025-08-30 00:59:07 - pico-train - INFO - โโโ Loss: 6.1856 |
|
2025-08-30 00:59:07 - pico-train - INFO - โโโ Learning Rate: 1.15e-05 |
|
2025-08-30 00:59:07 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:59:20 - pico-train - INFO - Step 29850 -- ๐ Training Metrics |
|
2025-08-30 00:59:20 - pico-train - INFO - โโโ Loss: 6.1591 |
|
2025-08-30 00:59:20 - pico-train - INFO - โโโ Learning Rate: 1.14e-05 |
|
2025-08-30 00:59:20 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:59:33 - pico-train - INFO - Step 29875 -- ๐ Training Metrics |
|
2025-08-30 00:59:33 - pico-train - INFO - โโโ Loss: 6.2964 |
|
2025-08-30 00:59:33 - pico-train - INFO - โโโ Learning Rate: 1.14e-05 |
|
2025-08-30 00:59:33 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:59:45 - pico-train - INFO - Step 29900 -- ๐ Training Metrics |
|
2025-08-30 00:59:45 - pico-train - INFO - โโโ Loss: 6.2506 |
|
2025-08-30 00:59:45 - pico-train - INFO - โโโ Learning Rate: 1.13e-05 |
|
2025-08-30 00:59:45 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 00:59:58 - pico-train - INFO - Step 29925 -- ๐ Training Metrics |
|
2025-08-30 00:59:58 - pico-train - INFO - โโโ Loss: 6.1630 |
|
2025-08-30 00:59:58 - pico-train - INFO - โโโ Learning Rate: 1.13e-05 |
|
2025-08-30 00:59:58 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:00:11 - pico-train - INFO - Step 29950 -- ๐ Training Metrics |
|
2025-08-30 01:00:11 - pico-train - INFO - โโโ Loss: 6.2033 |
|
2025-08-30 01:00:11 - pico-train - INFO - โโโ Learning Rate: 1.12e-05 |
|
2025-08-30 01:00:11 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:00:23 - pico-train - INFO - Step 29975 -- ๐ Training Metrics |
|
2025-08-30 01:00:23 - pico-train - INFO - โโโ Loss: 6.0846 |
|
2025-08-30 01:00:23 - pico-train - INFO - โโโ Learning Rate: 1.12e-05 |
|
2025-08-30 01:00:23 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:00:35 - pico-train - INFO - Step 30000 -- 💾 Saving Checkpoint |
|
2025-08-30 01:02:29 - pico-train - INFO - Step 30000 -- ๐ Evaluation Results |
|
2025-08-30 01:02:29 - pico-train - INFO - โโโ paloma: 2.0463060977945524e+26 |
|
2025-08-30 01:02:31 - pico-train - INFO - Step 30000 -- ๐ Training Metrics |
|
2025-08-30 01:02:31 - pico-train - INFO - โโโ Loss: 6.1682 |
|
2025-08-30 01:02:31 - pico-train - INFO - โโโ Learning Rate: 1.11e-05 |
|
2025-08-30 01:02:31 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:02:31 - pico-train - INFO - Step 30000 -- ๐ Saving Learning Dynamics |
|
2025-08-30 01:02:46 - pico-train - INFO - Step 30025 -- ๐ Training Metrics |
|
2025-08-30 01:02:46 - pico-train - INFO - โโโ Loss: 6.2143 |
|
2025-08-30 01:02:46 - pico-train - INFO - โโโ Learning Rate: 1.11e-05 |
|
2025-08-30 01:02:46 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:02:58 - pico-train - INFO - Step 30050 -- ๐ Training Metrics |
|
2025-08-30 01:02:58 - pico-train - INFO - โโโ Loss: 6.1476 |
|
2025-08-30 01:02:58 - pico-train - INFO - โโโ Learning Rate: 1.10e-05 |
|
2025-08-30 01:02:58 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:03:11 - pico-train - INFO - Step 30075 -- ๐ Training Metrics |
|
2025-08-30 01:03:11 - pico-train - INFO - โโโ Loss: 6.1530 |
|
2025-08-30 01:03:11 - pico-train - INFO - โโโ Learning Rate: 1.10e-05 |
|
2025-08-30 01:03:11 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:03:23 - pico-train - INFO - Step 30100 -- ๐ Training Metrics |
|
2025-08-30 01:03:23 - pico-train - INFO - โโโ Loss: 6.1518 |
|
2025-08-30 01:03:23 - pico-train - INFO - โโโ Learning Rate: 1.09e-05 |
|
2025-08-30 01:03:23 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:03:37 - pico-train - INFO - Step 30125 -- ๐ Training Metrics |
|
2025-08-30 01:03:37 - pico-train - INFO - โโโ Loss: 6.1752 |
|
2025-08-30 01:03:37 - pico-train - INFO - โโโ Learning Rate: 1.09e-05 |
|
2025-08-30 01:03:37 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:03:49 - pico-train - INFO - Step 30150 -- ๐ Training Metrics |
|
2025-08-30 01:03:49 - pico-train - INFO - โโโ Loss: 6.2413 |
|
2025-08-30 01:03:49 - pico-train - INFO - โโโ Learning Rate: 1.08e-05 |
|
2025-08-30 01:03:49 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:04:02 - pico-train - INFO - Step 30175 -- ๐ Training Metrics |
|
2025-08-30 01:04:02 - pico-train - INFO - โโโ Loss: 6.2624 |
|
2025-08-30 01:04:02 - pico-train - INFO - โโโ Learning Rate: 1.08e-05 |
|
2025-08-30 01:04:02 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:04:14 - pico-train - INFO - Step 30200 -- ๐ Training Metrics |
|
2025-08-30 01:04:14 - pico-train - INFO - โโโ Loss: 6.2339 |
|
2025-08-30 01:04:14 - pico-train - INFO - โโโ Learning Rate: 1.07e-05 |
|
2025-08-30 01:04:14 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:04:27 - pico-train - INFO - Step 30225 -- ๐ Training Metrics |
|
2025-08-30 01:04:27 - pico-train - INFO - โโโ Loss: 6.1617 |
|
2025-08-30 01:04:27 - pico-train - INFO - โโโ Learning Rate: 1.07e-05 |
|
2025-08-30 01:04:27 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:04:40 - pico-train - INFO - Step 30250 -- ๐ Training Metrics |
|
2025-08-30 01:04:40 - pico-train - INFO - โโโ Loss: 6.1225 |
|
2025-08-30 01:04:40 - pico-train - INFO - โโโ Learning Rate: 1.06e-05 |
|
2025-08-30 01:04:40 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:04:52 - pico-train - INFO - Step 30275 -- ๐ Training Metrics |
|
2025-08-30 01:04:52 - pico-train - INFO - โโโ Loss: 6.2344 |
|
2025-08-30 01:04:52 - pico-train - INFO - โโโ Learning Rate: 1.06e-05 |
|
2025-08-30 01:04:52 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:05:05 - pico-train - INFO - Step 30300 -- ๐ Training Metrics |
|
2025-08-30 01:05:05 - pico-train - INFO - โโโ Loss: 6.1970 |
|
2025-08-30 01:05:05 - pico-train - INFO - โโโ Learning Rate: 1.05e-05 |
|
2025-08-30 01:05:05 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:05:18 - pico-train - INFO - Step 30325 -- ๐ Training Metrics |
|
2025-08-30 01:05:18 - pico-train - INFO - โโโ Loss: 6.1580 |
|
2025-08-30 01:05:18 - pico-train - INFO - โโโ Learning Rate: 1.05e-05 |
|
2025-08-30 01:05:18 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:05:30 - pico-train - INFO - Step 30350 -- ๐ Training Metrics |
|
2025-08-30 01:05:30 - pico-train - INFO - โโโ Loss: 6.2210 |
|
2025-08-30 01:05:30 - pico-train - INFO - โโโ Learning Rate: 1.04e-05 |
|
2025-08-30 01:05:30 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:05:43 - pico-train - INFO - Step 30375 -- ๐ Training Metrics |
|
2025-08-30 01:05:43 - pico-train - INFO - โโโ Loss: 6.1991 |
|
2025-08-30 01:05:43 - pico-train - INFO - โโโ Learning Rate: 1.04e-05 |
|
2025-08-30 01:05:43 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:05:56 - pico-train - INFO - Step 30400 -- ๐ Training Metrics |
|
2025-08-30 01:05:56 - pico-train - INFO - โโโ Loss: 6.2500 |
|
2025-08-30 01:05:56 - pico-train - INFO - โโโ Learning Rate: 1.03e-05 |
|
2025-08-30 01:05:56 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:06:08 - pico-train - INFO - Step 30425 -- ๐ Training Metrics |
|
2025-08-30 01:06:08 - pico-train - INFO - โโโ Loss: 6.2252 |
|
2025-08-30 01:06:08 - pico-train - INFO - โโโ Learning Rate: 1.03e-05 |
|
2025-08-30 01:06:08 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:06:21 - pico-train - INFO - Step 30450 -- ๐ Training Metrics |
|
2025-08-30 01:06:21 - pico-train - INFO - โโโ Loss: 6.2010 |
|
2025-08-30 01:06:21 - pico-train - INFO - โโโ Learning Rate: 1.02e-05 |
|
2025-08-30 01:06:21 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:06:33 - pico-train - INFO - Step 30475 -- ๐ Training Metrics |
|
2025-08-30 01:06:33 - pico-train - INFO - โโโ Loss: 6.1309 |
|
2025-08-30 01:06:33 - pico-train - INFO - โโโ Learning Rate: 1.02e-05 |
|
2025-08-30 01:06:33 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:06:46 - pico-train - INFO - Step 30500 -- ๐พ Saving Checkpoint |
|
2025-08-30 01:08:46 - pico-train - INFO - Step 30500 -- ๐ Evaluation Results |
|
2025-08-30 01:08:46 - pico-train - INFO - โโโ paloma: 2.2542988490213366e+26 |
|
2025-08-30 01:08:49 - pico-train - INFO - Step 30500 -- ๐ Training Metrics |
|
2025-08-30 01:08:49 - pico-train - INFO - โโโ Loss: 6.1853 |
|
2025-08-30 01:08:49 - pico-train - INFO - โโโ Learning Rate: 1.01e-05 |
|
2025-08-30 01:08:49 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:08:49 - pico-train - INFO - Step 30500 -- ๐ Saving Learning Dynamics |
|
2025-08-30 01:09:04 - pico-train - INFO - Step 30525 -- ๐ Training Metrics |
|
2025-08-30 01:09:04 - pico-train - INFO - โโโ Loss: 6.1358 |
|
2025-08-30 01:09:04 - pico-train - INFO - โโโ Learning Rate: 1.01e-05 |
|
2025-08-30 01:09:04 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:09:17 - pico-train - INFO - Step 30550 -- ๐ Training Metrics |
|
2025-08-30 01:09:17 - pico-train - INFO - โโโ Loss: 6.1170 |
|
2025-08-30 01:09:17 - pico-train - INFO - โโโ Learning Rate: 1.00e-05 |
|
2025-08-30 01:09:17 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:09:29 - pico-train - INFO - Step 30575 -- ๐ Training Metrics |
|
2025-08-30 01:09:29 - pico-train - INFO - โโโ Loss: 6.1497 |
|
2025-08-30 01:09:29 - pico-train - INFO - โโโ Learning Rate: 9.96e-06 |
|
2025-08-30 01:09:29 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:09:42 - pico-train - INFO - Step 30600 -- ๐ Training Metrics |
|
2025-08-30 01:09:42 - pico-train - INFO - โโโ Loss: 6.2103 |
|
2025-08-30 01:09:42 - pico-train - INFO - โโโ Learning Rate: 9.91e-06 |
|
2025-08-30 01:09:42 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:09:54 - pico-train - INFO - Step 30625 -- ๐ Training Metrics |
|
2025-08-30 01:09:54 - pico-train - INFO - โโโ Loss: 6.1137 |
|
2025-08-30 01:09:54 - pico-train - INFO - โโโ Learning Rate: 9.86e-06 |
|
2025-08-30 01:09:54 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:10:07 - pico-train - INFO - Step 30650 -- ๐ Training Metrics |
|
2025-08-30 01:10:07 - pico-train - INFO - โโโ Loss: 6.1631 |
|
2025-08-30 01:10:07 - pico-train - INFO - โโโ Learning Rate: 9.81e-06 |
|
2025-08-30 01:10:07 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:10:19 - pico-train - INFO - Step 30675 -- ๐ Training Metrics |
|
2025-08-30 01:10:19 - pico-train - INFO - โโโ Loss: 6.1651 |
|
2025-08-30 01:10:19 - pico-train - INFO - โโโ Learning Rate: 9.76e-06 |
|
2025-08-30 01:10:19 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:10:32 - pico-train - INFO - Step 30700 -- ๐ Training Metrics |
|
2025-08-30 01:10:32 - pico-train - INFO - โโโ Loss: 6.1969 |
|
2025-08-30 01:10:32 - pico-train - INFO - โโโ Learning Rate: 9.72e-06 |
|
2025-08-30 01:10:32 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:10:45 - pico-train - INFO - Step 30725 -- ๐ Training Metrics |
|
2025-08-30 01:10:45 - pico-train - INFO - โโโ Loss: 6.1007 |
|
2025-08-30 01:10:45 - pico-train - INFO - โโโ Learning Rate: 9.67e-06 |
|
2025-08-30 01:10:45 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:10:58 - pico-train - INFO - Step 30750 -- ๐ Training Metrics |
|
2025-08-30 01:10:58 - pico-train - INFO - โโโ Loss: 6.1865 |
|
2025-08-30 01:10:58 - pico-train - INFO - โโโ Learning Rate: 9.62e-06 |
|
2025-08-30 01:10:58 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:11:10 - pico-train - INFO - Step 30775 -- ๐ Training Metrics |
|
2025-08-30 01:11:10 - pico-train - INFO - โโโ Loss: 6.1659 |
|
2025-08-30 01:11:10 - pico-train - INFO - โโโ Learning Rate: 9.57e-06 |
|
2025-08-30 01:11:10 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:11:23 - pico-train - INFO - Step 30800 -- ๐ Training Metrics |
|
2025-08-30 01:11:23 - pico-train - INFO - โโโ Loss: 6.2281 |
|
2025-08-30 01:11:23 - pico-train - INFO - โโโ Learning Rate: 9.52e-06 |
|
2025-08-30 01:11:23 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:11:36 - pico-train - INFO - Step 30825 -- ๐ Training Metrics |
|
2025-08-30 01:11:36 - pico-train - INFO - โโโ Loss: 6.1316 |
|
2025-08-30 01:11:36 - pico-train - INFO - โโโ Learning Rate: 9.47e-06 |
|
2025-08-30 01:11:36 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:11:48 - pico-train - INFO - Step 30850 -- ๐ Training Metrics |
|
2025-08-30 01:11:48 - pico-train - INFO - โโโ Loss: 6.2135 |
|
2025-08-30 01:11:48 - pico-train - INFO - โโโ Learning Rate: 9.43e-06 |
|
2025-08-30 01:11:48 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:12:01 - pico-train - INFO - Step 30875 -- ๐ Training Metrics |
|
2025-08-30 01:12:01 - pico-train - INFO - โโโ Loss: 6.2395 |
|
2025-08-30 01:12:01 - pico-train - INFO - โโโ Learning Rate: 9.38e-06 |
|
2025-08-30 01:12:01 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:12:13 - pico-train - INFO - Step 30900 -- ๐ Training Metrics |
|
2025-08-30 01:12:13 - pico-train - INFO - โโโ Loss: 6.2277 |
|
2025-08-30 01:12:13 - pico-train - INFO - โโโ Learning Rate: 9.33e-06 |
|
2025-08-30 01:12:13 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:12:26 - pico-train - INFO - Step 30925 -- ๐ Training Metrics |
|
2025-08-30 01:12:26 - pico-train - INFO - โโโ Loss: 6.1863 |
|
2025-08-30 01:12:26 - pico-train - INFO - โโโ Learning Rate: 9.28e-06 |
|
2025-08-30 01:12:26 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:12:39 - pico-train - INFO - Step 30950 -- ๐ Training Metrics |
|
2025-08-30 01:12:39 - pico-train - INFO - โโโ Loss: 6.2133 |
|
2025-08-30 01:12:39 - pico-train - INFO - โโโ Learning Rate: 9.24e-06 |
|
2025-08-30 01:12:39 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:12:51 - pico-train - INFO - Step 30975 -- ๐ Training Metrics |
|
2025-08-30 01:12:51 - pico-train - INFO - โโโ Loss: 6.2132 |
|
2025-08-30 01:12:51 - pico-train - INFO - โโโ Learning Rate: 9.19e-06 |
|
2025-08-30 01:12:51 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:13:03 - pico-train - INFO - Step 31000 -- ๐พ Saving Checkpoint |
|
2025-08-30 01:14:57 - pico-train - INFO - Step 31000 -- ๐ Evaluation Results |
|
2025-08-30 01:14:57 - pico-train - INFO - โโโ paloma: 2.4568970443260916e+26 |
|
2025-08-30 01:14:59 - pico-train - INFO - Step 31000 -- ๐ Training Metrics |
|
2025-08-30 01:14:59 - pico-train - INFO - โโโ Loss: 6.1313 |
|
2025-08-30 01:14:59 - pico-train - INFO - โโโ Learning Rate: 9.14e-06 |
|
2025-08-30 01:14:59 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:14:59 - pico-train - INFO - Step 31000 -- ๐ Saving Learning Dynamics |
|
2025-08-30 01:15:15 - pico-train - INFO - Step 31025 -- ๐ Training Metrics |
|
2025-08-30 01:15:15 - pico-train - INFO - โโโ Loss: 6.2095 |
|
2025-08-30 01:15:15 - pico-train - INFO - โโโ Learning Rate: 9.09e-06 |
|
2025-08-30 01:15:15 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:15:27 - pico-train - INFO - Step 31050 -- ๐ Training Metrics |
|
2025-08-30 01:15:27 - pico-train - INFO - โโโ Loss: 6.1753 |
|
2025-08-30 01:15:27 - pico-train - INFO - โโโ Learning Rate: 9.05e-06 |
|
2025-08-30 01:15:27 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:15:40 - pico-train - INFO - Step 31075 -- ๐ Training Metrics |
|
2025-08-30 01:15:40 - pico-train - INFO - โโโ Loss: 6.1722 |
|
2025-08-30 01:15:40 - pico-train - INFO - โโโ Learning Rate: 9.00e-06 |
|
2025-08-30 01:15:40 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:15:53 - pico-train - INFO - Step 31100 -- ๐ Training Metrics |
|
2025-08-30 01:15:53 - pico-train - INFO - โโโ Loss: 6.1917 |
|
2025-08-30 01:15:53 - pico-train - INFO - โโโ Learning Rate: 8.95e-06 |
|
2025-08-30 01:15:53 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:16:05 - pico-train - INFO - Step 31125 -- ๐ Training Metrics |
|
2025-08-30 01:16:05 - pico-train - INFO - โโโ Loss: 6.1442 |
|
2025-08-30 01:16:05 - pico-train - INFO - โโโ Learning Rate: 8.90e-06 |
|
2025-08-30 01:16:05 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:16:18 - pico-train - INFO - Step 31150 -- ๐ Training Metrics |
|
2025-08-30 01:16:18 - pico-train - INFO - โโโ Loss: 6.2128 |
|
2025-08-30 01:16:18 - pico-train - INFO - โโโ Learning Rate: 8.86e-06 |
|
2025-08-30 01:16:18 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:16:30 - pico-train - INFO - Step 31175 -- ๐ Training Metrics |
|
2025-08-30 01:16:30 - pico-train - INFO - โโโ Loss: 6.1192 |
|
2025-08-30 01:16:30 - pico-train - INFO - โโโ Learning Rate: 8.81e-06 |
|
2025-08-30 01:16:30 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:16:43 - pico-train - INFO - Step 31200 -- ๐ Training Metrics |
|
2025-08-30 01:16:43 - pico-train - INFO - โโโ Loss: 6.1648 |
|
2025-08-30 01:16:43 - pico-train - INFO - โโโ Learning Rate: 8.76e-06 |
|
2025-08-30 01:16:43 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:16:55 - pico-train - INFO - Step 31225 -- ๐ Training Metrics |
|
2025-08-30 01:16:55 - pico-train - INFO - โโโ Loss: 6.2030 |
|
2025-08-30 01:16:55 - pico-train - INFO - โโโ Learning Rate: 8.72e-06 |
|
2025-08-30 01:16:55 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:17:08 - pico-train - INFO - Step 31250 -- ๐ Training Metrics |
|
2025-08-30 01:17:08 - pico-train - INFO - โโโ Loss: 6.1564 |
|
2025-08-30 01:17:08 - pico-train - INFO - โโโ Learning Rate: 8.67e-06 |
|
2025-08-30 01:17:08 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:17:21 - pico-train - INFO - Step 31275 -- ๐ Training Metrics |
|
2025-08-30 01:17:21 - pico-train - INFO - โโโ Loss: 6.2193 |
|
2025-08-30 01:17:21 - pico-train - INFO - โโโ Learning Rate: 8.62e-06 |
|
2025-08-30 01:17:21 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:17:33 - pico-train - INFO - Step 31300 -- ๐ Training Metrics |
|
2025-08-30 01:17:33 - pico-train - INFO - โโโ Loss: 6.1630 |
|
2025-08-30 01:17:33 - pico-train - INFO - โโโ Learning Rate: 8.58e-06 |
|
2025-08-30 01:17:33 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:17:46 - pico-train - INFO - Step 31325 -- ๐ Training Metrics |
|
2025-08-30 01:17:46 - pico-train - INFO - โโโ Loss: 6.1765 |
|
2025-08-30 01:17:46 - pico-train - INFO - โโโ Learning Rate: 8.53e-06 |
|
2025-08-30 01:17:46 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:17:58 - pico-train - INFO - Step 31350 -- ๐ Training Metrics |
|
2025-08-30 01:17:58 - pico-train - INFO - โโโ Loss: 6.2315 |
|
2025-08-30 01:17:58 - pico-train - INFO - โโโ Learning Rate: 8.49e-06 |
|
2025-08-30 01:17:58 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:18:11 - pico-train - INFO - Step 31375 -- ๐ Training Metrics |
|
2025-08-30 01:18:11 - pico-train - INFO - โโโ Loss: 6.1719 |
|
2025-08-30 01:18:11 - pico-train - INFO - โโโ Learning Rate: 8.44e-06 |
|
2025-08-30 01:18:11 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:18:24 - pico-train - INFO - Step 31400 -- ๐ Training Metrics |
|
2025-08-30 01:18:24 - pico-train - INFO - โโโ Loss: 6.2234 |
|
2025-08-30 01:18:24 - pico-train - INFO - โโโ Learning Rate: 8.39e-06 |
|
2025-08-30 01:18:24 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:18:36 - pico-train - INFO - Step 31425 -- ๐ Training Metrics |
|
2025-08-30 01:18:36 - pico-train - INFO - โโโ Loss: 6.1782 |
|
2025-08-30 01:18:36 - pico-train - INFO - โโโ Learning Rate: 8.35e-06 |
|
2025-08-30 01:18:36 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:18:49 - pico-train - INFO - Step 31450 -- ๐ Training Metrics |
|
2025-08-30 01:18:49 - pico-train - INFO - โโโ Loss: 6.1711 |
|
2025-08-30 01:18:49 - pico-train - INFO - โโโ Learning Rate: 8.30e-06 |
|
2025-08-30 01:18:49 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:19:01 - pico-train - INFO - Step 31475 -- ๐ Training Metrics |
|
2025-08-30 01:19:01 - pico-train - INFO - โโโ Loss: 6.1834 |
|
2025-08-30 01:19:01 - pico-train - INFO - โโโ Learning Rate: 8.26e-06 |
|
2025-08-30 01:19:01 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:19:14 - pico-train - INFO - Step 31500 -- ๐พ Saving Checkpoint |
|
2025-08-30 01:21:14 - pico-train - INFO - Step 31500 -- ๐ Evaluation Results |
|
2025-08-30 01:21:14 - pico-train - INFO - โโโ paloma: 2.8663430235000883e+26 |
|
2025-08-30 01:21:17 - pico-train - INFO - Step 31500 -- ๐ Training Metrics |
|
2025-08-30 01:21:17 - pico-train - INFO - โโโ Loss: 6.1338 |
|
2025-08-30 01:21:17 - pico-train - INFO - โโโ Learning Rate: 8.21e-06 |
|
2025-08-30 01:21:17 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:21:17 - pico-train - INFO - Step 31500 -- ๐ Saving Learning Dynamics |
|
2025-08-30 01:21:33 - pico-train - INFO - Step 31525 -- ๐ Training Metrics |
|
2025-08-30 01:21:33 - pico-train - INFO - โโโ Loss: 6.1819 |
|
2025-08-30 01:21:33 - pico-train - INFO - โโโ Learning Rate: 8.17e-06 |
|
2025-08-30 01:21:33 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:21:46 - pico-train - INFO - Step 31550 -- ๐ Training Metrics |
|
2025-08-30 01:21:46 - pico-train - INFO - โโโ Loss: 6.1695 |
|
2025-08-30 01:21:46 - pico-train - INFO - โโโ Learning Rate: 8.12e-06 |
|
2025-08-30 01:21:46 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:21:58 - pico-train - INFO - Step 31575 -- ๐ Training Metrics |
|
2025-08-30 01:21:58 - pico-train - INFO - โโโ Loss: 6.2089 |
|
2025-08-30 01:21:58 - pico-train - INFO - โโโ Learning Rate: 8.08e-06 |
|
2025-08-30 01:21:58 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:22:11 - pico-train - INFO - Step 31600 -- ๐ Training Metrics |
|
2025-08-30 01:22:11 - pico-train - INFO - โโโ Loss: 6.1555 |
|
2025-08-30 01:22:11 - pico-train - INFO - โโโ Learning Rate: 8.03e-06 |
|
2025-08-30 01:22:11 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:22:24 - pico-train - INFO - Step 31625 -- ๐ Training Metrics |
|
2025-08-30 01:22:24 - pico-train - INFO - โโโ Loss: 6.1820 |
|
2025-08-30 01:22:24 - pico-train - INFO - โโโ Learning Rate: 7.98e-06 |
|
2025-08-30 01:22:24 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:22:36 - pico-train - INFO - Step 31650 -- ๐ Training Metrics |
|
2025-08-30 01:22:36 - pico-train - INFO - โโโ Loss: 6.1091 |
|
2025-08-30 01:22:36 - pico-train - INFO - โโโ Learning Rate: 7.94e-06 |
|
2025-08-30 01:22:36 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:22:49 - pico-train - INFO - Step 31675 -- ๐ Training Metrics |
|
2025-08-30 01:22:49 - pico-train - INFO - โโโ Loss: 6.2098 |
|
2025-08-30 01:22:49 - pico-train - INFO - โโโ Learning Rate: 7.90e-06 |
|
2025-08-30 01:22:49 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:23:01 - pico-train - INFO - Step 31700 -- ๐ Training Metrics |
|
2025-08-30 01:23:01 - pico-train - INFO - โโโ Loss: 6.0611 |
|
2025-08-30 01:23:01 - pico-train - INFO - โโโ Learning Rate: 7.85e-06 |
|
2025-08-30 01:23:01 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:23:14 - pico-train - INFO - Step 31725 -- ๐ Training Metrics |
|
2025-08-30 01:23:14 - pico-train - INFO - โโโ Loss: 6.1088 |
|
2025-08-30 01:23:14 - pico-train - INFO - โโโ Learning Rate: 7.81e-06 |
|
2025-08-30 01:23:14 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:23:27 - pico-train - INFO - Step 31750 -- ๐ Training Metrics |
|
2025-08-30 01:23:27 - pico-train - INFO - โโโ Loss: 6.2220 |
|
2025-08-30 01:23:27 - pico-train - INFO - โโโ Learning Rate: 7.76e-06 |
|
2025-08-30 01:23:27 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:23:39 - pico-train - INFO - Step 31775 -- ๐ Training Metrics |
|
2025-08-30 01:23:39 - pico-train - INFO - โโโ Loss: 6.2271 |
|
2025-08-30 01:23:39 - pico-train - INFO - โโโ Learning Rate: 7.72e-06 |
|
2025-08-30 01:23:39 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:23:52 - pico-train - INFO - Step 31800 -- ๐ Training Metrics |
|
2025-08-30 01:23:52 - pico-train - INFO - โโโ Loss: 6.1465 |
|
2025-08-30 01:23:52 - pico-train - INFO - โโโ Learning Rate: 7.67e-06 |
|
2025-08-30 01:23:52 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:24:04 - pico-train - INFO - Step 31825 -- ๐ Training Metrics |
|
2025-08-30 01:24:04 - pico-train - INFO - โโโ Loss: 6.1742 |
|
2025-08-30 01:24:04 - pico-train - INFO - โโโ Learning Rate: 7.63e-06 |
|
2025-08-30 01:24:04 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:24:17 - pico-train - INFO - Step 31850 -- ๐ Training Metrics |
|
2025-08-30 01:24:17 - pico-train - INFO - โโโ Loss: 6.2199 |
|
2025-08-30 01:24:17 - pico-train - INFO - โโโ Learning Rate: 7.58e-06 |
|
2025-08-30 01:24:17 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:24:30 - pico-train - INFO - Step 31875 -- ๐ Training Metrics |
|
2025-08-30 01:24:30 - pico-train - INFO - โโโ Loss: 6.1934 |
|
2025-08-30 01:24:30 - pico-train - INFO - โโโ Learning Rate: 7.54e-06 |
|
2025-08-30 01:24:30 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:24:42 - pico-train - INFO - Step 31900 -- ๐ Training Metrics |
|
2025-08-30 01:24:42 - pico-train - INFO - โโโ Loss: 6.1503 |
|
2025-08-30 01:24:42 - pico-train - INFO - โโโ Learning Rate: 7.50e-06 |
|
2025-08-30 01:24:42 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:24:55 - pico-train - INFO - Step 31925 -- ๐ Training Metrics |
|
2025-08-30 01:24:55 - pico-train - INFO - โโโ Loss: 6.0399 |
|
2025-08-30 01:24:55 - pico-train - INFO - โโโ Learning Rate: 7.45e-06 |
|
2025-08-30 01:24:55 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:25:07 - pico-train - INFO - Step 31950 -- ๐ Training Metrics |
|
2025-08-30 01:25:07 - pico-train - INFO - โโโ Loss: 6.2147 |
|
2025-08-30 01:25:07 - pico-train - INFO - โโโ Learning Rate: 7.41e-06 |
|
2025-08-30 01:25:07 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:25:20 - pico-train - INFO - Step 31975 -- ๐ Training Metrics |
|
2025-08-30 01:25:20 - pico-train - INFO - โโโ Loss: 6.1952 |
|
2025-08-30 01:25:20 - pico-train - INFO - โโโ Learning Rate: 7.37e-06 |
|
2025-08-30 01:25:20 - pico-train - INFO - โโโ Inf/NaN count: 0 |
|
2025-08-30 01:25:32 - pico-train - INFO - Step 32000 -- ๐พ Saving Checkpoint |
|
|