2025-08-30 05:22:55 - pico-train - INFO - Step 0 -- 📊 Evaluation Results
2025-08-30 05:22:55 - pico-train - INFO - └── paloma: inf
2025-08-30 05:22:56 - pico-train - INFO - ==================================================
2025-08-30 05:22:56 - pico-train - INFO - ✨ Training Configuration
2025-08-30 05:22:56 - pico-train - INFO - ==================================================
2025-08-30 05:22:56 - pico-train - INFO - ╭─────────────────────────────────────────────────────╮
2025-08-30 05:22:56 - pico-train - INFO - │ checkpointing:                                      │
2025-08-30 05:22:56 - pico-train - INFO - │   checkpoints_dir: checkpoints                      │
2025-08-30 05:22:56 - pico-train - INFO - │   evaluation:                                       │
2025-08-30 05:22:56 - pico-train - INFO - │     eval_results_dir: eval_results                  │
2025-08-30 05:22:56 - pico-train - INFO - │   fabric_checkpoint_dir: fabric_state               │
2025-08-30 05:22:56 - pico-train - INFO - │   fabric_checkpoint_filename: checkpoint.pt         │
2025-08-30 05:22:56 - pico-train - INFO - │   hf_checkpoint:                                    │
2025-08-30 05:22:56 - pico-train - INFO - │     collection_slug: null                           │
2025-08-30 05:22:56 - pico-train - INFO - │     repo_id: ThomasTheMaker/pico-decoder-tiny       │
2025-08-30 05:22:56 - pico-train - INFO - │   learning_dynamics:                                │
2025-08-30 05:22:56 - pico-train - INFO - │     batch_size: 1                                   │
2025-08-30 05:22:56 - pico-train - INFO - │     eval_data: null                                 │
2025-08-30 05:22:56 - pico-train - INFO - │     layer_suffixes:                                 │
2025-08-30 05:22:56 - pico-train - INFO - │     - attention.v_proj                              │
2025-08-30 05:22:56 - pico-train - INFO - │     - attention.o_proj                              │
2025-08-30 05:22:56 - pico-train - INFO - │     - swiglu.w_2                                    │
2025-08-30 05:22:56 - pico-train - INFO - │     sequence_idx: -1                                │
2025-08-30 05:22:56 - pico-train - INFO - │   learning_dynamics_dir: learning_dynamics          │
2025-08-30 05:22:56 - pico-train - INFO - │   logs_dir: logs                                    │
2025-08-30 05:22:56 - pico-train - INFO - │   run_name: pico-decoder-tiny-dolma10M-v1           │
2025-08-30 05:22:56 - pico-train - INFO - │   runs_dir: runs                                    │
2025-08-30 05:22:56 - pico-train - INFO - │   save_every_n_steps: 2000                          │
2025-08-30 05:22:56 - pico-train - INFO - │   save_to_hf: true                                  │
2025-08-30 05:22:56 - pico-train - INFO - │   training:                                         │
2025-08-30 05:22:56 - pico-train - INFO - │     auto_resume: true                               │
2025-08-30 05:22:56 - pico-train - INFO - │ data:                                               │
2025-08-30 05:22:56 - pico-train - INFO - │   dataloader:                                       │
2025-08-30 05:22:56 - pico-train - INFO - │     batch_size: 16                                  │
2025-08-30 05:22:56 - pico-train - INFO - │   dataset:                                          │
2025-08-30 05:22:56 - pico-train - INFO - │     name: ThomasTheMaker/pretokenized-dolma-10M     │
2025-08-30 05:22:56 - pico-train - INFO - │   tokenizer:                                        │
2025-08-30 05:22:56 - pico-train - INFO - │     name: allenai/OLMo-7B-0724-hf                   │
2025-08-30 05:22:56 - pico-train - INFO - │     vocab_size: 50304                               │
2025-08-30 05:22:56 - pico-train - INFO - │ evaluation:                                         │
2025-08-30 05:22:56 - pico-train - INFO - │   metrics:                                          │
2025-08-30 05:22:56 - pico-train - INFO - │   - paloma                                          │
2025-08-30 05:22:56 - pico-train - INFO - │   paloma:                                           │
2025-08-30 05:22:56 - pico-train - INFO - │     batch_size: 1                                   │
2025-08-30 05:22:56 - pico-train - INFO - │     dataset_name: pico-lm/pretokenized-paloma-tinsy │
2025-08-30 05:22:56 - pico-train - INFO - │     dataset_split: val                              │
2025-08-30 05:22:56 - pico-train - INFO - │     max_length: 2048                                │
2025-08-30 05:22:56 - pico-train - INFO - │ model:                                              │
2025-08-30 05:22:56 - pico-train - INFO - │   activation_hidden_dim: 384                        │
2025-08-30 05:22:56 - pico-train - INFO - │   attention_n_heads: 12                             │
2025-08-30 05:22:56 - pico-train - INFO - │   attention_n_kv_heads: 4                           │
2025-08-30 05:22:56 - pico-train - INFO - │   batch_size: 1024                                  │
2025-08-30 05:22:56 - pico-train - INFO - │   d_model: 96                                       │
2025-08-30 05:22:56 - pico-train - INFO - │   max_seq_len: 2048                                 │
2025-08-30 05:22:56 - pico-train - INFO - │   model_type: pico_decoder                          │
2025-08-30 05:22:56 - pico-train - INFO - │   n_layers: 12                                      │
2025-08-30 05:22:56 - pico-train - INFO - │   norm_eps: 1.0e-06                                 │
2025-08-30 05:22:56 - pico-train - INFO - │   position_emb_theta: 10000.0                       │
2025-08-30 05:22:56 - pico-train - INFO - │   vocab_size: 50304                                 │
2025-08-30 05:22:56 - pico-train - INFO - │ monitoring:                                         │
2025-08-30 05:22:56 - pico-train - INFO - │   logging:                                          │
2025-08-30 05:22:56 - pico-train - INFO - │     log_every_n_steps: 100                          │
2025-08-30 05:22:56 - pico-train - INFO - │     log_level: INFO                                 │
2025-08-30 05:22:56 - pico-train - INFO - │   save_to_wandb: false                              │
2025-08-30 05:22:56 - pico-train - INFO - │   wandb:                                            │
2025-08-30 05:22:56 - pico-train - INFO - │     entity: boymyc                                  │
2025-08-30 05:22:56 - pico-train - INFO - │     project: pico-decoder-tiny                      │
2025-08-30 05:22:56 - pico-train - INFO - │ training:                                           │
2025-08-30 05:22:56 - pico-train - INFO - │   fabric:                                           │
2025-08-30 05:22:56 - pico-train - INFO - │     accelerator: cuda                               │
2025-08-30 05:22:56 - pico-train - INFO - │     num_devices: 1                                  │
2025-08-30 05:22:56 - pico-train - INFO - │     num_nodes: 1                                    │
2025-08-30 05:22:56 - pico-train - INFO - │     precision: bf16-mixed                           │
2025-08-30 05:22:56 - pico-train - INFO - │   max_steps: 100000                                 │
2025-08-30 05:22:56 - pico-train - INFO - │   optimization:                                     │
2025-08-30 05:22:56 - pico-train - INFO - │     gradient_accumulation_steps: 1                  │
2025-08-30 05:22:56 - pico-train - INFO - │     lr: 0.0002                                      │
2025-08-30 05:22:56 - pico-train - INFO - │     lr_scheduler: cosine                            │
2025-08-30 05:22:56 - pico-train - INFO - │     lr_warmup_steps: 2000                           │
2025-08-30 05:22:56 - pico-train - INFO - │     optimizer: adamw                                │
2025-08-30 05:22:56 - pico-train - INFO - │                                                     │
2025-08-30 05:22:56 - pico-train - INFO - ╰─────────────────────────────────────────────────────╯
2025-08-30 05:22:56 - pico-train - INFO - ==================================================
2025-08-30 05:22:56 - pico-train - INFO - ⛭ Runtime Summary:
2025-08-30 05:22:56 - pico-train - INFO - ==================================================
2025-08-30 05:22:56 - pico-train - INFO - Starting from step: 0
2025-08-30 05:22:56 - pico-train - INFO - Model Setup:
2025-08-30 05:22:56 - pico-train - INFO - └─ Total Parameters: 11,282,784
2025-08-30 05:22:56 - pico-train - INFO - └─ Trainable Parameters: 11,282,784
2025-08-30 05:22:56 - pico-train - INFO - Distributed Setup:
2025-08-30 05:22:56 - pico-train - INFO - └─ Number of Devices: 1
2025-08-30 05:22:56 - pico-train - INFO - └─ Device Type: NVIDIA H100 80GB HBM3
2025-08-30 05:22:56 - pico-train - INFO - └─ Available Memory: 85.03 GB
2025-08-30 05:22:56 - pico-train - INFO - Software Setup:
2025-08-30 05:22:56 - pico-train - INFO - └─ Python Version: 3.12.3
2025-08-30 05:22:56 - pico-train - INFO - └─ PyTorch Version: 2.8.0+cu128
2025-08-30 05:22:56 - pico-train - INFO - └─ CUDA Version: 12.8
2025-08-30 05:22:56 - pico-train - INFO - └─ Operating System: Linux 6.8.0-71-generic
2025-08-30 05:22:56 - pico-train - INFO - Batch Size Configuration:
2025-08-30 05:22:56 - pico-train - INFO - └─ Global Batch Size: 16
2025-08-30 05:22:56 - pico-train - INFO - └─ Per Device Batch Size: 16
2025-08-30 05:22:56 - pico-train - INFO - └─ Gradient Accumulation Steps: 1
2025-08-30 05:22:56 - pico-train - INFO - ==================================================
2025-08-30 05:22:57 - pico-train - INFO - Step 0 -- 🔄 Training Metrics
2025-08-30 05:22:57 - pico-train - INFO - ├── Loss: 10.9884
2025-08-30 05:22:57 - pico-train - INFO - ├── Learning Rate: 0.00e+00
2025-08-30 05:22:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:22:57 - pico-train - INFO - Step 0 -- 📈 Saving Learning Dynamics
2025-08-30 05:23:55 - pico-train - INFO - Step 100 -- 🔄 Training Metrics
2025-08-30 05:23:55 - pico-train - INFO - ├── Loss: 10.9746
2025-08-30 05:23:55 - pico-train - INFO - ├── Learning Rate: 1.00e-05
2025-08-30 05:23:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:24:49 - pico-train - INFO - Step 200 -- 🔄 Training Metrics
2025-08-30 05:24:49 - pico-train - INFO - ├── Loss: 10.7653
2025-08-30 05:24:49 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-30 05:24:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:25:44 - pico-train - INFO - Step 300 -- 🔄 Training Metrics
2025-08-30 05:25:44 - pico-train - INFO - ├── Loss: 10.2902
2025-08-30 05:25:44 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-30 05:25:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:26:38 - pico-train - INFO - Step 400 -- 🔄 Training Metrics
2025-08-30 05:26:38 - pico-train - INFO - ├── Loss: 9.8373
2025-08-30 05:26:38 - pico-train - INFO - ├── Learning Rate: 4.00e-05
2025-08-30 05:26:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:27:31 - pico-train - INFO - Step 500 -- 🔄 Training Metrics
2025-08-30 05:27:31 - pico-train - INFO - ├── Loss: 9.3629
2025-08-30 05:27:31 - pico-train - INFO - ├── Learning Rate: 5.00e-05
2025-08-30 05:27:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:28:24 - pico-train - INFO - Step 600 -- 🔄 Training Metrics
2025-08-30 05:28:24 - pico-train - INFO - ├── Loss: 8.8887
2025-08-30 05:28:24 - pico-train - INFO - ├── Learning Rate: 6.00e-05
2025-08-30 05:28:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:29:19 - pico-train - INFO - Step 700 -- 🔄 Training Metrics
2025-08-30 05:29:19 - pico-train - INFO - ├── Loss: 8.4408
2025-08-30 05:29:19 - pico-train - INFO - ├── Learning Rate: 7.00e-05
2025-08-30 05:29:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:30:13 - pico-train - INFO - Step 800 -- 🔄 Training Metrics
2025-08-30 05:30:13 - pico-train - INFO - ├── Loss: 8.0906
2025-08-30 05:30:13 - pico-train - INFO - ├── Learning Rate: 8.00e-05
2025-08-30 05:30:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:31:07 - pico-train - INFO - Step 900 -- 🔄 Training Metrics
2025-08-30 05:31:07 - pico-train - INFO - ├── Loss: 7.8459
2025-08-30 05:31:07 - pico-train - INFO - ├── Learning Rate: 9.00e-05
2025-08-30 05:31:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:32:02 - pico-train - INFO - Step 1000 -- 🔄 Training Metrics
2025-08-30 05:32:02 - pico-train - INFO - ├── Loss: 7.6972
2025-08-30 05:32:02 - pico-train - INFO - ├── Learning Rate: 1.00e-04
2025-08-30 05:32:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:32:55 - pico-train - INFO - Step 1100 -- 🔄 Training Metrics
2025-08-30 05:32:55 - pico-train - INFO - ├── Loss: 7.5570
2025-08-30 05:32:55 - pico-train - INFO - ├── Learning Rate: 1.10e-04
2025-08-30 05:32:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:33:48 - pico-train - INFO - Step 1200 -- 🔄 Training Metrics
2025-08-30 05:33:48 - pico-train - INFO - ├── Loss: 7.4823
2025-08-30 05:33:48 - pico-train - INFO - ├── Learning Rate: 1.20e-04
2025-08-30 05:33:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:34:42 - pico-train - INFO - Step 1300 -- 🔄 Training Metrics
2025-08-30 05:34:42 - pico-train - INFO - ├── Loss: 7.3624
2025-08-30 05:34:42 - pico-train - INFO - ├── Learning Rate: 1.30e-04
2025-08-30 05:34:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:35:33 - pico-train - INFO - Step 1400 -- 🔄 Training Metrics
2025-08-30 05:35:33 - pico-train - INFO - ├── Loss: 7.2538
2025-08-30 05:35:33 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-30 05:35:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:36:26 - pico-train - INFO - Step 1500 -- 🔄 Training Metrics
2025-08-30 05:36:26 - pico-train - INFO - ├── Loss: 7.1582
2025-08-30 05:36:26 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-30 05:36:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:37:18 - pico-train - INFO - Step 1600 -- 🔄 Training Metrics
2025-08-30 05:37:18 - pico-train - INFO - ├── Loss: 7.0462
2025-08-30 05:37:18 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 05:37:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:38:11 - pico-train - INFO - Step 1700 -- 🔄 Training Metrics
2025-08-30 05:38:11 - pico-train - INFO - ├── Loss: 6.9729
2025-08-30 05:38:11 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 05:38:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:39:04 - pico-train - INFO - Step 1800 -- 🔄 Training Metrics
2025-08-30 05:39:04 - pico-train - INFO - ├── Loss: 6.8825
2025-08-30 05:39:04 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 05:39:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:39:56 - pico-train - INFO - Step 1900 -- 🔄 Training Metrics
2025-08-30 05:39:56 - pico-train - INFO - ├── Loss: 6.8003
2025-08-30 05:39:56 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 05:39:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:40:48 - pico-train - INFO - Step 2000 -- 💾 Saving Checkpoint
2025-08-30 05:42:56 - pico-train - INFO - Step 2000 -- 📊 Evaluation Results
2025-08-30 05:42:56 - pico-train - INFO - └── paloma: 9.921214391079047e+20
2025-08-30 05:42:57 - pico-train - INFO - Step 2000 -- 🔄 Training Metrics
2025-08-30 05:42:57 - pico-train - INFO - ├── Loss: 6.7360
2025-08-30 05:42:57 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:42:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:42:57 - pico-train - INFO - Step 2000 -- 📈 Saving Learning Dynamics
2025-08-30 05:43:54 - pico-train - INFO - Step 2100 -- 🔄 Training Metrics
2025-08-30 05:43:54 - pico-train - INFO - ├── Loss: 6.6658
2025-08-30 05:43:54 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:43:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:44:47 - pico-train - INFO - Step 2200 -- 🔄 Training Metrics
2025-08-30 05:44:47 - pico-train - INFO - ├── Loss: 6.6040
2025-08-30 05:44:47 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:44:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:45:39 - pico-train - INFO - Step 2300 -- 🔄 Training Metrics
2025-08-30 05:45:39 - pico-train - INFO - ├── Loss: 6.5360
2025-08-30 05:45:39 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:45:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:46:31 - pico-train - INFO - Step 2400 -- 🔄 Training Metrics
2025-08-30 05:46:31 - pico-train - INFO - ├── Loss: 6.5011
2025-08-30 05:46:31 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:46:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:47:24 - pico-train - INFO - Step 2500 -- 🔄 Training Metrics
2025-08-30 05:47:24 - pico-train - INFO - ├── Loss: 6.4541
2025-08-30 05:47:24 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:47:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:48:17 - pico-train - INFO - Step 2600 -- 🔄 Training Metrics
2025-08-30 05:48:17 - pico-train - INFO - ├── Loss: 6.4299
2025-08-30 05:48:17 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:48:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:49:10 - pico-train - INFO - Step 2700 -- 🔄 Training Metrics
2025-08-30 05:49:10 - pico-train - INFO - ├── Loss: 6.3677
2025-08-30 05:49:10 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:49:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:50:02 - pico-train - INFO - Step 2800 -- 🔄 Training Metrics
2025-08-30 05:50:02 - pico-train - INFO - ├── Loss: 6.3537
2025-08-30 05:50:02 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:50:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:50:54 - pico-train - INFO - Step 2900 -- 🔄 Training Metrics
2025-08-30 05:50:54 - pico-train - INFO - ├── Loss: 6.3225
2025-08-30 05:50:54 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:50:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:51:47 - pico-train - INFO - Step 3000 -- 🔄 Training Metrics
2025-08-30 05:51:47 - pico-train - INFO - ├── Loss: 6.2806
2025-08-30 05:51:47 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:51:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:52:39 - pico-train - INFO - Step 3100 -- 🔄 Training Metrics
2025-08-30 05:52:39 - pico-train - INFO - ├── Loss: 6.2626
2025-08-30 05:52:39 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:52:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:53:32 - pico-train - INFO - Step 3200 -- 🔄 Training Metrics
2025-08-30 05:53:32 - pico-train - INFO - ├── Loss: 6.2206
2025-08-30 05:53:32 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:53:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:54:24 - pico-train - INFO - Step 3300 -- 🔄 Training Metrics
2025-08-30 05:54:24 - pico-train - INFO - ├── Loss: 6.2140
2025-08-30 05:54:24 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:54:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:55:17 - pico-train - INFO - Step 3400 -- 🔄 Training Metrics
2025-08-30 05:55:17 - pico-train - INFO - ├── Loss: 6.1525
2025-08-30 05:55:17 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:55:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:56:11 - pico-train - INFO - Step 3500 -- 🔄 Training Metrics
2025-08-30 05:56:11 - pico-train - INFO - ├── Loss: 6.1104
2025-08-30 05:56:11 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:56:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:57:05 - pico-train - INFO - Step 3600 -- 🔄 Training Metrics
2025-08-30 05:57:05 - pico-train - INFO - ├── Loss: 6.1327
2025-08-30 05:57:05 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:57:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:57:58 - pico-train - INFO - Step 3700 -- 🔄 Training Metrics
2025-08-30 05:57:58 - pico-train - INFO - ├── Loss: 6.1046
2025-08-30 05:57:58 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:57:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:58:51 - pico-train - INFO - Step 3800 -- 🔄 Training Metrics
2025-08-30 05:58:51 - pico-train - INFO - ├── Loss: 6.0910
2025-08-30 05:58:51 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:58:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:59:44 - pico-train - INFO - Step 3900 -- 🔄 Training Metrics
2025-08-30 05:59:44 - pico-train - INFO - ├── Loss: 6.0369
2025-08-30 05:59:44 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 05:59:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:00:36 - pico-train - INFO - Step 4000 -- 💾 Saving Checkpoint
2025-08-30 06:02:50 - pico-train - INFO - Step 4000 -- 📊 Evaluation Results
2025-08-30 06:02:50 - pico-train - INFO - └── paloma: 3.516233710165711e+23
2025-08-30 06:02:51 - pico-train - INFO - Step 4000 -- 🔄 Training Metrics
2025-08-30 06:02:51 - pico-train - INFO - ├── Loss: 6.0450
2025-08-30 06:02:51 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:02:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:02:51 - pico-train - INFO - Step 4000 -- 📈 Saving Learning Dynamics
2025-08-30 06:03:48 - pico-train - INFO - Step 4100 -- 🔄 Training Metrics
2025-08-30 06:03:48 - pico-train - INFO - ├── Loss: 6.0120
2025-08-30 06:03:48 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:03:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:04:39 - pico-train - INFO - Step 4200 -- 🔄 Training Metrics
2025-08-30 06:04:39 - pico-train - INFO - ├── Loss: 5.9897
2025-08-30 06:04:39 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:04:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:05:33 - pico-train - INFO - Step 4300 -- 🔄 Training Metrics
2025-08-30 06:05:33 - pico-train - INFO - ├── Loss: 5.9636
2025-08-30 06:05:33 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:05:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:06:25 - pico-train - INFO - Step 4400 -- 🔄 Training Metrics
2025-08-30 06:06:25 - pico-train - INFO - ├── Loss: 5.9759
2025-08-30 06:06:25 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:06:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:07:18 - pico-train - INFO - Step 4500 -- 🔄 Training Metrics
2025-08-30 06:07:18 - pico-train - INFO - ├── Loss: 5.9551
2025-08-30 06:07:18 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:07:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:08:10 - pico-train - INFO - Step 4600 -- 🔄 Training Metrics
2025-08-30 06:08:10 - pico-train - INFO - ├── Loss: 5.9165
2025-08-30 06:08:10 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:08:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:09:02 - pico-train - INFO - Step 4700 -- 🔄 Training Metrics
2025-08-30 06:09:02 - pico-train - INFO - ├── Loss: 5.8939
2025-08-30 06:09:02 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:09:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:09:54 - pico-train - INFO - Step 4800 -- 🔄 Training Metrics
2025-08-30 06:09:54 - pico-train - INFO - ├── Loss: 5.8769
2025-08-30 06:09:54 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:09:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:10:47 - pico-train - INFO - Step 4900 -- 🔄 Training Metrics
2025-08-30 06:10:47 - pico-train - INFO - ├── Loss: 5.8605
2025-08-30 06:10:47 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:10:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:11:39 - pico-train - INFO - Step 5000 -- 🔄 Training Metrics
2025-08-30 06:11:39 - pico-train - INFO - ├── Loss: 5.8727
2025-08-30 06:11:39 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:11:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:12:31 - pico-train - INFO - Step 5100 -- 🔄 Training Metrics
2025-08-30 06:12:31 - pico-train - INFO - ├── Loss: 5.8446
2025-08-30 06:12:31 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 06:12:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:13:24 - pico-train - INFO - Step 5200 -- 🔄 Training Metrics
2025-08-30 06:13:24 - pico-train - INFO - ├── Loss: 5.8289
2025-08-30 06:13:24 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:13:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:14:16 - pico-train - INFO - Step 5300 -- 🔄 Training Metrics
2025-08-30 06:14:16 - pico-train - INFO - ├── Loss: 5.8220
2025-08-30 06:14:16 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:14:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:15:09 - pico-train - INFO - Step 5400 -- 🔄 Training Metrics
2025-08-30 06:15:09 - pico-train - INFO - ├── Loss: 5.8132
2025-08-30 06:15:09 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:15:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:16:01 - pico-train - INFO - Step 5500 -- 🔄 Training Metrics
2025-08-30 06:16:01 - pico-train - INFO - ├── Loss: 5.7878
2025-08-30 06:16:01 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:16:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:16:53 - pico-train - INFO - Step 5600 -- 🔄 Training Metrics
2025-08-30 06:16:53 - pico-train - INFO - ├── Loss: 5.7639
2025-08-30 06:16:53 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:16:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:17:45 - pico-train - INFO - Step 5700 -- 🔄 Training Metrics
2025-08-30 06:17:45 - pico-train - INFO - ├── Loss: 5.7698
2025-08-30 06:17:45 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:17:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:18:38 - pico-train - INFO - Step 5800 -- 🔄 Training Metrics
2025-08-30 06:18:38 - pico-train - INFO - ├── Loss: 5.7458
2025-08-30 06:18:38 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:18:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:19:31 - pico-train - INFO - Step 5900 -- 🔄 Training Metrics
2025-08-30 06:19:31 - pico-train - INFO - ├── Loss: 5.7482
2025-08-30 06:19:31 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:19:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:20:23 - pico-train - INFO - Step 6000 -- 💾 Saving Checkpoint
2025-08-30 06:22:23 - pico-train - INFO - Step 6000 -- 📊 Evaluation Results
2025-08-30 06:22:23 - pico-train - INFO - └── paloma: 5.501352274432362e+26
2025-08-30 06:22:25 - pico-train - INFO - Step 6000 -- 🔄 Training Metrics
2025-08-30 06:22:25 - pico-train - INFO - ├── Loss: 5.7402
2025-08-30 06:22:25 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:22:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:22:25 - pico-train - INFO - Step 6000 -- 📈 Saving Learning Dynamics
2025-08-30 06:23:21 - pico-train - INFO - Step 6100 -- 🔄 Training Metrics
2025-08-30 06:23:21 - pico-train - INFO - ├── Loss: 5.7377
2025-08-30 06:23:21 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:23:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:24:13 - pico-train - INFO - Step 6200 -- 🔄 Training Metrics
2025-08-30 06:24:13 - pico-train - INFO - ├── Loss: 5.6952
2025-08-30 06:24:13 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:24:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:25:06 - pico-train - INFO - Step 6300 -- 🔄 Training Metrics
2025-08-30 06:25:06 - pico-train - INFO - ├── Loss: 5.6845
2025-08-30 06:25:06 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:25:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:26:00 - pico-train - INFO - Step 6400 -- 🔄 Training Metrics
2025-08-30 06:26:00 - pico-train - INFO - ├── Loss: 5.6903
2025-08-30 06:26:00 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:26:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:26:53 - pico-train - INFO - Step 6500 -- 🔄 Training Metrics
2025-08-30 06:26:53 - pico-train - INFO - ├── Loss: 5.6877
2025-08-30 06:26:53 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:26:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:27:47 - pico-train - INFO - Step 6600 -- 🔄 Training Metrics
2025-08-30 06:27:47 - pico-train - INFO - ├── Loss: 5.6538
2025-08-30 06:27:47 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:27:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:28:41 - pico-train - INFO - Step 6700 -- 🔄 Training Metrics
2025-08-30 06:28:41 - pico-train - INFO - ├── Loss: 5.6437
2025-08-30 06:28:41 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:28:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:29:35 - pico-train - INFO - Step 6800 -- 🔄 Training Metrics
2025-08-30 06:29:35 - pico-train - INFO - ├── Loss: 5.6444
2025-08-30 06:29:35 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:29:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:30:28 - pico-train - INFO - Step 6900 -- 🔄 Training Metrics
2025-08-30 06:30:28 - pico-train - INFO - ├── Loss: 5.6238
2025-08-30 06:30:28 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:30:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:31:21 - pico-train - INFO - Step 7000 -- 🔄 Training Metrics
2025-08-30 06:31:21 - pico-train - INFO - ├── Loss: 5.6188
2025-08-30 06:31:21 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:31:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:32:14 - pico-train - INFO - Step 7100 -- 🔄 Training Metrics
2025-08-30 06:32:14 - pico-train - INFO - ├── Loss: 5.5782
2025-08-30 06:32:14 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:32:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:33:08 - pico-train - INFO - Step 7200 -- 🔄 Training Metrics
2025-08-30 06:33:08 - pico-train - INFO - ├── Loss: 5.6112
2025-08-30 06:33:08 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:33:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:34:01 - pico-train - INFO - Step 7300 -- 🔄 Training Metrics
2025-08-30 06:34:01 - pico-train - INFO - ├── Loss: 5.5985
2025-08-30 06:34:01 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:34:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:34:55 - pico-train - INFO - Step 7400 -- 🔄 Training Metrics
2025-08-30 06:34:55 - pico-train - INFO - ├── Loss: 5.6009
2025-08-30 06:34:55 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 06:34:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:35:48 - pico-train - INFO - Step 7500 -- 🔄 Training Metrics
2025-08-30 06:35:48 - pico-train - INFO - ├── Loss: 5.5714
2025-08-30 06:35:48 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 06:35:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:36:41 - pico-train - INFO - Step 7600 -- 🔄 Training Metrics
2025-08-30 06:36:41 - pico-train - INFO - ├── Loss: 5.5714
2025-08-30 06:36:41 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 06:36:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:37:36 - pico-train - INFO - Step 7700 -- 🔄 Training Metrics
2025-08-30 06:37:36 - pico-train - INFO - ├── Loss: 5.5653
2025-08-30 06:37:36 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 06:37:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:38:28 - pico-train - INFO - Step 7800 -- 🔄 Training Metrics
2025-08-30 06:38:28 - pico-train - INFO - ├── Loss: 5.5559
2025-08-30 06:38:28 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 06:38:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:39:22 - pico-train - INFO - Step 7900 -- 🔄 Training Metrics
2025-08-30 06:39:22 - pico-train - INFO - ├── Loss: 5.5568
2025-08-30 06:39:22 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 06:39:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:40:15 - pico-train - INFO - Step 8000 -- 💾 Saving Checkpoint
2025-08-30 06:42:29 - pico-train - INFO - Step 8000 -- 📊 Evaluation Results
2025-08-30 06:42:29 - pico-train - INFO - └── paloma: 2.7741731039516784e+30
2025-08-30 06:42:33 - pico-train - INFO - Step 8000 -- 🔄 Training Metrics
2025-08-30 06:42:33 - pico-train - INFO - ├── Loss: 5.5269
2025-08-30 06:42:33 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 06:42:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:42:33 - pico-train - INFO - Step 8000 -- 📈 Saving Learning Dynamics
2025-08-30 06:43:31 - pico-train - INFO - Step 8100 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:43:31 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.5294
2025-08-30 06:43:31 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:43:31 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:44:24 - pico-train - INFO - Step 8200 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:44:24 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.5352
2025-08-30 06:44:24 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:44:24 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:45:16 - pico-train - INFO - Step 8300 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:45:16 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.5280
2025-08-30 06:45:16 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:45:16 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:46:08 - pico-train - INFO - Step 8400 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:46:08 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4997
2025-08-30 06:46:08 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:46:08 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:47:00 - pico-train - INFO - Step 8500 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:47:00 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4873
2025-08-30 06:47:00 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:47:00 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:47:53 - pico-train - INFO - Step 8600 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:47:53 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.5047
2025-08-30 06:47:53 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:47:53 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:48:45 - pico-train - INFO - Step 8700 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:48:45 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4809
2025-08-30 06:48:45 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:48:45 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:49:37 - pico-train - INFO - Step 8800 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:49:37 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4859
2025-08-30 06:49:37 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:49:37 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:50:30 - pico-train - INFO - Step 8900 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:50:30 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4690
2025-08-30 06:50:30 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.98e-04
2025-08-30 06:50:30 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:51:23 - pico-train - INFO - Step 9000 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:51:23 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4476
2025-08-30 06:51:23 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:51:23 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:52:16 - pico-train - INFO - Step 9100 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:52:16 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4383
2025-08-30 06:52:16 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:52:16 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:53:07 - pico-train - INFO - Step 9200 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:53:07 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4569
2025-08-30 06:53:07 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:53:07 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:53:59 - pico-train - INFO - Step 9300 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:53:59 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4367
2025-08-30 06:53:59 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:53:59 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:54:52 - pico-train - INFO - Step 9400 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:54:52 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4623
2025-08-30 06:54:52 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:54:52 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:55:45 - pico-train - INFO - Step 9500 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:55:45 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4268
2025-08-30 06:55:45 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:55:45 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:56:37 - pico-train - INFO - Step 9600 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:56:37 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4639
2025-08-30 06:56:37 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:56:37 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:57:28 - pico-train - INFO - Step 9700 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:57:28 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4521
2025-08-30 06:57:28 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:57:28 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:58:21 - pico-train - INFO - Step 9800 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:58:21 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4139
2025-08-30 06:58:21 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:58:21 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:59:13 - pico-train - INFO - Step 9900 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:59:13 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.4026
2025-08-30 06:59:13 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.97e-04
2025-08-30 06:59:13 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 07:00:05 - pico-train - INFO - Step 10000 -- 💾 Saving Checkpoint
2025-08-30 07:02:19 - pico-train - INFO - Step 10000 -- 📊 Evaluation Results
2025-08-30 07:02:19 - pico-train - INFO - └── paloma: 1.0181753654885411e+35
2025-08-30 07:02:21 - pico-train - INFO - Step 10000 -- 🔄 Training Metrics
2025-08-30 07:02:21 - pico-train - INFO - ├── Loss: 5.4191
2025-08-30 07:02:21 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 07:02:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:02:21 - pico-train - INFO - Step 10000 -- 📈 Saving Learning Dynamics
2025-08-30 07:03:18 - pico-train - INFO - Step 10100 -- 🔄 Training Metrics
2025-08-30 07:03:18 - pico-train - INFO - ├── Loss: 5.3756
2025-08-30 07:03:18 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 07:03:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:04:10 - pico-train - INFO - Step 10200 -- 🔄 Training Metrics
2025-08-30 07:04:10 - pico-train - INFO - ├── Loss: 5.3976
2025-08-30 07:04:10 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 07:04:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:05:04 - pico-train - INFO - Step 10300 -- 🔄 Training Metrics
2025-08-30 07:05:04 - pico-train - INFO - ├── Loss: 5.4049
2025-08-30 07:05:04 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:05:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:05:56 - pico-train - INFO - Step 10400 -- 🔄 Training Metrics
2025-08-30 07:05:56 - pico-train - INFO - ├── Loss: 5.3991
2025-08-30 07:05:56 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:05:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:06:48 - pico-train - INFO - Step 10500 -- 🔄 Training Metrics
2025-08-30 07:06:48 - pico-train - INFO - ├── Loss: 5.4016
2025-08-30 07:06:48 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:06:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:07:40 - pico-train - INFO - Step 10600 -- 🔄 Training Metrics
2025-08-30 07:07:40 - pico-train - INFO - ├── Loss: 5.3924
2025-08-30 07:07:40 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:07:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:08:32 - pico-train - INFO - Step 10700 -- 🔄 Training Metrics
2025-08-30 07:08:32 - pico-train - INFO - ├── Loss: 5.3781
2025-08-30 07:08:32 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:08:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:09:25 - pico-train - INFO - Step 10800 -- 🔄 Training Metrics
2025-08-30 07:09:25 - pico-train - INFO - ├── Loss: 5.3433
2025-08-30 07:09:25 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:09:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:10:17 - pico-train - INFO - Step 10900 -- 🔄 Training Metrics
2025-08-30 07:10:17 - pico-train - INFO - ├── Loss: 5.3610
2025-08-30 07:10:17 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:10:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:11:09 - pico-train - INFO - Step 11000 -- 🔄 Training Metrics
2025-08-30 07:11:09 - pico-train - INFO - ├── Loss: 5.3561
2025-08-30 07:11:09 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:11:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:12:01 - pico-train - INFO - Step 11100 -- 🔄 Training Metrics
2025-08-30 07:12:01 - pico-train - INFO - ├── Loss: 5.3818
2025-08-30 07:12:01 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:12:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:12:53 - pico-train - INFO - Step 11200 -- 🔄 Training Metrics
2025-08-30 07:12:53 - pico-train - INFO - ├── Loss: 5.3596
2025-08-30 07:12:53 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:12:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:13:46 - pico-train - INFO - Step 11300 -- 🔄 Training Metrics
2025-08-30 07:13:46 - pico-train - INFO - ├── Loss: 5.3558
2025-08-30 07:13:46 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 07:13:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:14:38 - pico-train - INFO - Step 11400 -- 🔄 Training Metrics
2025-08-30 07:14:38 - pico-train - INFO - ├── Loss: 5.3484
2025-08-30 07:14:38 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:14:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:15:31 - pico-train - INFO - Step 11500 -- 🔄 Training Metrics
2025-08-30 07:15:31 - pico-train - INFO - ├── Loss: 5.3399
2025-08-30 07:15:31 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:15:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:16:24 - pico-train - INFO - Step 11600 -- 🔄 Training Metrics
2025-08-30 07:16:24 - pico-train - INFO - ├── Loss: 5.3252
2025-08-30 07:16:24 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:16:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:17:17 - pico-train - INFO - Step 11700 -- 🔄 Training Metrics
2025-08-30 07:17:17 - pico-train - INFO - ├── Loss: 5.3194
2025-08-30 07:17:17 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:17:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:18:09 - pico-train - INFO - Step 11800 -- 🔄 Training Metrics
2025-08-30 07:18:09 - pico-train - INFO - ├── Loss: 5.3387
2025-08-30 07:18:09 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:18:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:19:01 - pico-train - INFO - Step 11900 -- 🔄 Training Metrics
2025-08-30 07:19:01 - pico-train - INFO - ├── Loss: 5.3459
2025-08-30 07:19:01 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:19:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:19:53 - pico-train - INFO - Step 12000 -- 💾 Saving Checkpoint
2025-08-30 07:22:04 - pico-train - INFO - Step 12000 -- 📊 Evaluation Results
2025-08-30 07:22:04 - pico-train - INFO - └── paloma: inf
2025-08-30 07:22:06 - pico-train - INFO - Step 12000 -- 🔄 Training Metrics
2025-08-30 07:22:06 - pico-train - INFO - ├── Loss: 5.3216
2025-08-30 07:22:06 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:22:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:22:06 - pico-train - INFO - Step 12000 -- 📈 Saving Learning Dynamics
2025-08-30 07:23:04 - pico-train - INFO - Step 12100 -- 🔄 Training Metrics
2025-08-30 07:23:04 - pico-train - INFO - ├── Loss: 5.3164
2025-08-30 07:23:04 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:23:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:23:57 - pico-train - INFO - Step 12200 -- 🔄 Training Metrics
2025-08-30 07:23:57 - pico-train - INFO - ├── Loss: 5.3075
2025-08-30 07:23:57 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:23:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:24:50 - pico-train - INFO - Step 12300 -- 🔄 Training Metrics
2025-08-30 07:24:50 - pico-train - INFO - ├── Loss: 5.2908
2025-08-30 07:24:50 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 07:24:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:25:43 - pico-train - INFO - Step 12400 -- 🔄 Training Metrics
2025-08-30 07:25:43 - pico-train - INFO - ├── Loss: 5.2970
2025-08-30 07:25:43 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:25:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:26:35 - pico-train - INFO - Step 12500 -- 🔄 Training Metrics
2025-08-30 07:26:35 - pico-train - INFO - ├── Loss: 5.2850
2025-08-30 07:26:35 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:26:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:27:28 - pico-train - INFO - Step 12600 -- 🔄 Training Metrics
2025-08-30 07:27:28 - pico-train - INFO - ├── Loss: 5.3027
2025-08-30 07:27:28 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:27:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:28:20 - pico-train - INFO - Step 12700 -- 🔄 Training Metrics
2025-08-30 07:28:20 - pico-train - INFO - ├── Loss: 5.2792
2025-08-30 07:28:20 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:28:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:29:14 - pico-train - INFO - Step 12800 -- 🔄 Training Metrics
2025-08-30 07:29:14 - pico-train - INFO - ├── Loss: 5.3026
2025-08-30 07:29:14 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:29:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:30:05 - pico-train - INFO - Step 12900 -- 🔄 Training Metrics
2025-08-30 07:30:05 - pico-train - INFO - ├── Loss: 5.2918
2025-08-30 07:30:05 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:30:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:30:58 - pico-train - INFO - Step 13000 -- 🔄 Training Metrics
2025-08-30 07:30:58 - pico-train - INFO - ├── Loss: 5.3032
2025-08-30 07:30:58 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:30:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:31:50 - pico-train - INFO - Step 13100 -- 🔄 Training Metrics
2025-08-30 07:31:50 - pico-train - INFO - ├── Loss: 5.2887
2025-08-30 07:31:50 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:31:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:32:43 - pico-train - INFO - Step 13200 -- 🔄 Training Metrics
2025-08-30 07:32:43 - pico-train - INFO - ├── Loss: 5.2853
2025-08-30 07:32:43 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:32:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:33:35 - pico-train - INFO - Step 13300 -- 🔄 Training Metrics
2025-08-30 07:33:35 - pico-train - INFO - ├── Loss: 5.2829
2025-08-30 07:33:35 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 07:33:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:34:27 - pico-train - INFO - Step 13400 -- 🔄 Training Metrics
2025-08-30 07:34:27 - pico-train - INFO - ├── Loss: 5.2773
2025-08-30 07:34:27 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 07:34:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:35:20 - pico-train - INFO - Step 13500 -- 🔄 Training Metrics
2025-08-30 07:35:20 - pico-train - INFO - ├── Loss: 5.2645
2025-08-30 07:35:20 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 07:35:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:36:13 - pico-train - INFO - Step 13600 -- 🔄 Training Metrics
2025-08-30 07:36:13 - pico-train - INFO - ├── Loss: 5.2692
2025-08-30 07:36:13 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 07:36:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:37:07 - pico-train - INFO - Step 13700 -- 🔄 Training Metrics
2025-08-30 07:37:07 - pico-train - INFO - ├── Loss: 5.2500
2025-08-30 07:37:07 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 07:37:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:38:01 - pico-train - INFO - Step 13800 -- 🔄 Training Metrics
2025-08-30 07:38:01 - pico-train - INFO - ├── Loss: 5.2655
2025-08-30 07:38:01 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 07:38:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:38:54 - pico-train - INFO - Step 13900 -- 🔄 Training Metrics
2025-08-30 07:38:54 - pico-train - INFO - ├── Loss: 5.2591
2025-08-30 07:38:54 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 07:38:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:39:46 - pico-train - INFO - Step 14000 -- 💾 Saving Checkpoint
2025-08-30 07:41:49 - pico-train - INFO - Step 14000 -- 📊 Evaluation Results
2025-08-30 07:41:49 - pico-train - INFO - └── paloma: inf
2025-08-30 07:41:51 - pico-train - INFO - Step 14000 -- 🔄 Training Metrics
2025-08-30 07:41:51 - pico-train - INFO - ├── Loss: 5.2632
2025-08-30 07:41:51 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 07:41:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:41:51 - pico-train - INFO - Step 14000 -- 📈 Saving Learning Dynamics
2025-08-30 07:42:49 - pico-train - INFO - Step 14100 -- 🔄 Training Metrics
2025-08-30 07:42:49 - pico-train - INFO - ├── Loss: 5.2383
2025-08-30 07:42:49 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 07:42:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:43:41 - pico-train - INFO - Step 14200 -- 🔄 Training Metrics
2025-08-30 07:43:41 - pico-train - INFO - ├── Loss: 5.2505
2025-08-30 07:43:41 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 07:43:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:44:34 - pico-train - INFO - Step 14300 -- 🔄 Training Metrics
2025-08-30 07:44:34 - pico-train - INFO - ├── Loss: 5.2444
2025-08-30 07:44:34 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 07:44:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:45:26 - pico-train - INFO - Step 14400 -- 🔄 Training Metrics
2025-08-30 07:45:26 - pico-train - INFO - ├── Loss: 5.2410
2025-08-30 07:45:26 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 07:45:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:46:18 - pico-train - INFO - Step 14500 -- 🔄 Training Metrics
2025-08-30 07:46:18 - pico-train - INFO - ├── Loss: 5.2553
2025-08-30 07:46:18 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 07:46:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:47:11 - pico-train - INFO - Step 14600 -- 🔄 Training Metrics
2025-08-30 07:47:11 - pico-train - INFO - ├── Loss: 5.2326
2025-08-30 07:47:11 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 07:47:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:48:02 - pico-train - INFO - Step 14700 -- 🔄 Training Metrics
2025-08-30 07:48:02 - pico-train - INFO - ├── Loss: 5.2398
2025-08-30 07:48:02 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 07:48:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:48:55 - pico-train - INFO - Step 14800 -- 🔄 Training Metrics
2025-08-30 07:48:55 - pico-train - INFO - ├── Loss: 5.2355
2025-08-30 07:48:55 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 07:48:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:49:48 - pico-train - INFO - Step 14900 -- 🔄 Training Metrics
2025-08-30 07:49:48 - pico-train - INFO - ├── Loss: 5.2329
2025-08-30 07:49:48 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 07:49:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:50:40 - pico-train - INFO - Step 15000 -- 🔄 Training Metrics
2025-08-30 07:50:40 - pico-train - INFO - ├── Loss: 5.2309
2025-08-30 07:50:40 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 07:50:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:51:32 - pico-train - INFO - Step 15100 -- 🔄 Training Metrics
2025-08-30 07:51:32 - pico-train - INFO - ├── Loss: 5.2283
2025-08-30 07:51:32 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 07:51:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:52:24 - pico-train - INFO - Step 15200 -- 🔄 Training Metrics
2025-08-30 07:52:24 - pico-train - INFO - ├── Loss: 5.2301
2025-08-30 07:52:24 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 07:52:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:53:16 - pico-train - INFO - Step 15300 -- 🔄 Training Metrics
2025-08-30 07:53:16 - pico-train - INFO - ├── Loss: 5.2524
2025-08-30 07:53:16 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 07:53:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:54:09 - pico-train - INFO - Step 15400 -- 🔄 Training Metrics
2025-08-30 07:54:09 - pico-train - INFO - ├── Loss: 5.2164
2025-08-30 07:54:09 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 07:54:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:55:01 - pico-train - INFO - Step 15500 -- 🔄 Training Metrics
2025-08-30 07:55:01 - pico-train - INFO - ├── Loss: 5.2332
2025-08-30 07:55:01 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 07:55:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:55:53 - pico-train - INFO - Step 15600 -- 🔄 Training Metrics
2025-08-30 07:55:53 - pico-train - INFO - ├── Loss: 5.2263
2025-08-30 07:55:53 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 07:55:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:56:46 - pico-train - INFO - Step 15700 -- 🔄 Training Metrics
2025-08-30 07:56:46 - pico-train - INFO - ├── Loss: 5.2087
2025-08-30 07:56:46 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 07:56:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:57:38 - pico-train - INFO - Step 15800 -- 🔄 Training Metrics
2025-08-30 07:57:38 - pico-train - INFO - ├── Loss: 5.2199
2025-08-30 07:57:38 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 07:57:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:58:31 - pico-train - INFO - Step 15900 -- 🔄 Training Metrics
2025-08-30 07:58:31 - pico-train - INFO - ├── Loss: 5.2056
2025-08-30 07:58:31 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 07:58:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:59:23 - pico-train - INFO - Step 16000 -- 💾 Saving Checkpoint
2025-08-30 08:01:22 - pico-train - INFO - Step 16000 -- 📊 Evaluation Results
2025-08-30 08:01:22 - pico-train - INFO - └── paloma: inf
2025-08-30 08:01:23 - pico-train - INFO - Step 16000 -- 🔄 Training Metrics
2025-08-30 08:01:23 - pico-train - INFO - ├── Loss: 5.2088
2025-08-30 08:01:23 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 08:01:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:01:23 - pico-train - INFO - Step 16000 -- 📈 Saving Learning Dynamics
2025-08-30 08:02:20 - pico-train - INFO - Step 16100 -- 🔄 Training Metrics
2025-08-30 08:02:20 - pico-train - INFO - ├── Loss: 5.1931
2025-08-30 08:02:20 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 08:02:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:03:13 - pico-train - INFO - Step 16200 -- 🔄 Training Metrics
2025-08-30 08:03:13 - pico-train - INFO - ├── Loss: 5.1773
2025-08-30 08:03:13 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 08:03:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:04:07 - pico-train - INFO - Step 16300 -- 🔄 Training Metrics
2025-08-30 08:04:07 - pico-train - INFO - ├── Loss: 5.2032
2025-08-30 08:04:07 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 08:04:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:05:01 - pico-train - INFO - Step 16400 -- 🔄 Training Metrics
2025-08-30 08:05:01 - pico-train - INFO - ├── Loss: 5.1868
2025-08-30 08:05:01 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 08:05:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:05:54 - pico-train - INFO - Step 16500 -- 🔄 Training Metrics
2025-08-30 08:05:54 - pico-train - INFO - ├── Loss: 5.1764
2025-08-30 08:05:54 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 08:05:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:06:46 - pico-train - INFO - Step 16600 -- 🔄 Training Metrics
2025-08-30 08:06:46 - pico-train - INFO - ├── Loss: 5.2021
2025-08-30 08:06:46 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 08:06:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:07:41 - pico-train - INFO - Step 16700 -- 🔄 Training Metrics
2025-08-30 08:07:41 - pico-train - INFO - ├── Loss: 5.1797
2025-08-30 08:07:41 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 08:07:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:08:33 - pico-train - INFO - Step 16800 -- 🔄 Training Metrics
2025-08-30 08:08:33 - pico-train - INFO - ├── Loss: 5.1489
2025-08-30 08:08:33 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 08:08:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:09:28 - pico-train - INFO - Step 16900 -- 🔄 Training Metrics
2025-08-30 08:09:28 - pico-train - INFO - ├── Loss: 5.1805
2025-08-30 08:09:28 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 08:09:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:10:20 - pico-train - INFO - Step 17000 -- 🔄 Training Metrics
2025-08-30 08:10:20 - pico-train - INFO - ├── Loss: 5.1844
2025-08-30 08:10:20 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 08:10:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:11:14 - pico-train - INFO - Step 17100 -- 🔄 Training Metrics
2025-08-30 08:11:14 - pico-train - INFO - ├── Loss: 5.1826
2025-08-30 08:11:14 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 08:11:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:12:08 - pico-train - INFO - Step 17200 -- 🔄 Training Metrics
2025-08-30 08:12:08 - pico-train - INFO - ├── Loss: 5.1568
2025-08-30 08:12:08 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 08:12:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:13:02 - pico-train - INFO - Step 17300 -- 🔄 Training Metrics
2025-08-30 08:13:02 - pico-train - INFO - ├── Loss: 5.2075
2025-08-30 08:13:02 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 08:13:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:13:56 - pico-train - INFO - Step 17400 -- 🔄 Training Metrics
2025-08-30 08:13:56 - pico-train - INFO - ├── Loss: 5.1649
2025-08-30 08:13:56 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 08:13:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:14:49 - pico-train - INFO - Step 17500 -- 🔄 Training Metrics
2025-08-30 08:14:49 - pico-train - INFO - ├── Loss: 5.1506
2025-08-30 08:14:49 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 08:14:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:15:42 - pico-train - INFO - Step 17600 -- 🔄 Training Metrics
2025-08-30 08:15:42 - pico-train - INFO - ├── Loss: 5.1757
2025-08-30 08:15:42 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 08:15:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:16:36 - pico-train - INFO - Step 17700 -- 🔄 Training Metrics
2025-08-30 08:16:36 - pico-train - INFO - ├── Loss: 5.1580
2025-08-30 08:16:36 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 08:16:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:17:30 - pico-train - INFO - Step 17800 -- 🔄 Training Metrics
2025-08-30 08:17:30 - pico-train - INFO - ├── Loss: 5.1309
2025-08-30 08:17:30 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 08:17:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:18:23 - pico-train - INFO - Step 17900 -- 🔄 Training Metrics
2025-08-30 08:18:23 - pico-train - INFO - ├── Loss: 5.1601
2025-08-30 08:18:23 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 08:18:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:19:16 - pico-train - INFO - Step 18000 -- 💾 Saving Checkpoint
2025-08-30 08:21:31 - pico-train - INFO - Step 18000 -- 📊 Evaluation Results
2025-08-30 08:21:31 - pico-train - INFO - └── paloma: inf
2025-08-30 08:21:32 - pico-train - INFO - Step 18000 -- 🔄 Training Metrics
2025-08-30 08:21:32 - pico-train - INFO - ├── Loss: 5.1612
2025-08-30 08:21:32 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 08:21:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:21:32 - pico-train - INFO - Step 18000 -- 📈 Saving Learning Dynamics
2025-08-30 08:22:30 - pico-train - INFO - Step 18100 -- 🔄 Training Metrics
2025-08-30 08:22:30 - pico-train - INFO - ├── Loss: 5.1556
2025-08-30 08:22:30 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 08:22:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:23:24 - pico-train - INFO - Step 18200 -- 🔄 Training Metrics
2025-08-30 08:23:24 - pico-train - INFO - ├── Loss: 5.1406
2025-08-30 08:23:24 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 08:23:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:24:17 - pico-train - INFO - Step 18300 -- 🔄 Training Metrics
2025-08-30 08:24:17 - pico-train - INFO - ├── Loss: 5.1410
2025-08-30 08:24:17 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 08:24:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:25:10 - pico-train - INFO - Step 18400 -- 🔄 Training Metrics
2025-08-30 08:25:10 - pico-train - INFO - ├── Loss: 5.1468
2025-08-30 08:25:10 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 08:25:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:26:04 - pico-train - INFO - Step 18500 -- 🔄 Training Metrics
2025-08-30 08:26:04 - pico-train - INFO - ├── Loss: 5.1310
2025-08-30 08:26:04 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 08:26:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:26:57 - pico-train - INFO - Step 18600 -- 🔄 Training Metrics
2025-08-30 08:26:57 - pico-train - INFO - ├── Loss: 5.1406
2025-08-30 08:26:57 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 08:26:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:27:51 - pico-train - INFO - Step 18700 -- 🔄 Training Metrics
2025-08-30 08:27:51 - pico-train - INFO - ├── Loss: 5.1457
2025-08-30 08:27:51 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 08:27:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:28:45 - pico-train - INFO - Step 18800 -- 🔄 Training Metrics
2025-08-30 08:28:45 - pico-train - INFO - ├── Loss: 5.1176
2025-08-30 08:28:45 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 08:28:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:29:38 - pico-train - INFO - Step 18900 -- 🔄 Training Metrics
2025-08-30 08:29:38 - pico-train - INFO - ├── Loss: 5.1283
2025-08-30 08:29:38 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 08:29:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:30:32 - pico-train - INFO - Step 19000 -- 🔄 Training Metrics
2025-08-30 08:30:32 - pico-train - INFO - ├── Loss: 5.1459
2025-08-30 08:30:32 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 08:30:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:31:25 - pico-train - INFO - Step 19100 -- 🔄 Training Metrics
2025-08-30 08:31:25 - pico-train - INFO - ├── Loss: 5.1400
2025-08-30 08:31:25 - pico-train - INFO - ├── Learning Rate: 1.85e-04
2025-08-30 08:31:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:32:20 - pico-train - INFO - Step 19200 -- 🔄 Training Metrics
2025-08-30 08:32:20 - pico-train - INFO - ├── Loss: 5.1267
2025-08-30 08:32:20 - pico-train - INFO - ├── Learning Rate: 1.85e-04
2025-08-30 08:32:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:33:12 - pico-train - INFO - Step 19300 -- ๐Ÿ”„ Training Metrics
2025-08-30 08:33:12 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.1289
2025-08-30 08:33:12 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.85e-04
2025-08-30 08:33:12 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:34:05 - pico-train - INFO - Step 19400 -- ๐Ÿ”„ Training Metrics
2025-08-30 08:34:05 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.1261
2025-08-30 08:34:05 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.85e-04
2025-08-30 08:34:05 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:34:58 - pico-train - INFO - Step 19500 -- ๐Ÿ”„ Training Metrics
2025-08-30 08:34:58 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.1383
2025-08-30 08:34:58 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.85e-04
2025-08-30 08:34:58 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:35:50 - pico-train - INFO - Step 19600 -- ๐Ÿ”„ Training Metrics
2025-08-30 08:35:50 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.1403
2025-08-30 08:35:50 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.85e-04
2025-08-30 08:35:50 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:36:43 - pico-train - INFO - Step 19700 -- ๐Ÿ”„ Training Metrics
2025-08-30 08:36:43 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.1261
2025-08-30 08:36:43 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.84e-04
2025-08-30 08:36:43 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:37:35 - pico-train - INFO - Step 19800 -- ๐Ÿ”„ Training Metrics
2025-08-30 08:37:35 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.1311
2025-08-30 08:37:35 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.84e-04
2025-08-30 08:37:35 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:38:27 - pico-train - INFO - Step 19900 -- ๐Ÿ”„ Training Metrics
2025-08-30 08:38:27 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.1058
2025-08-30 08:38:27 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.84e-04
2025-08-30 08:38:27 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:39:19 - pico-train - INFO - Step 20000 -- 💾 Saving Checkpoint
2025-08-30 08:41:42 - pico-train - INFO - Step 20000 -- 📊 Evaluation Results
2025-08-30 08:41:42 - pico-train - INFO - └── paloma: inf
2025-08-30 08:41:44 - pico-train - INFO - Step 20000 -- 🔄 Training Metrics
2025-08-30 08:41:44 - pico-train - INFO - ├── Loss: 5.1195
2025-08-30 08:41:44 - pico-train - INFO - ├── Learning Rate: 1.84e-04
2025-08-30 08:41:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:41:44 - pico-train - INFO - Step 20000 -- 📈 Saving Learning Dynamics
2025-08-30 08:42:41 - pico-train - INFO - Step 20100 -- 🔄 Training Metrics
2025-08-30 08:42:41 - pico-train - INFO - ├── Loss: 5.1018
2025-08-30 08:42:41 - pico-train - INFO - ├── Learning Rate: 1.84e-04
2025-08-30 08:42:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:43:34 - pico-train - INFO - Step 20200 -- 🔄 Training Metrics
2025-08-30 08:43:34 - pico-train - INFO - ├── Loss: 5.1165
2025-08-30 08:43:34 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 08:43:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:44:25 - pico-train - INFO - Step 20300 -- 🔄 Training Metrics
2025-08-30 08:44:25 - pico-train - INFO - ├── Loss: 5.1355
2025-08-30 08:44:25 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 08:44:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:45:18 - pico-train - INFO - Step 20400 -- 🔄 Training Metrics
2025-08-30 08:45:18 - pico-train - INFO - ├── Loss: 5.1157
2025-08-30 08:45:18 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 08:45:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:46:12 - pico-train - INFO - Step 20500 -- 🔄 Training Metrics
2025-08-30 08:46:12 - pico-train - INFO - ├── Loss: 5.1164
2025-08-30 08:46:12 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 08:46:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:47:04 - pico-train - INFO - Step 20600 -- 🔄 Training Metrics
2025-08-30 08:47:04 - pico-train - INFO - ├── Loss: 5.1290
2025-08-30 08:47:04 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 08:47:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:47:56 - pico-train - INFO - Step 20700 -- 🔄 Training Metrics
2025-08-30 08:47:56 - pico-train - INFO - ├── Loss: 5.1012
2025-08-30 08:47:56 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 08:47:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:48:49 - pico-train - INFO - Step 20800 -- 🔄 Training Metrics
2025-08-30 08:48:49 - pico-train - INFO - ├── Loss: 5.1103
2025-08-30 08:48:49 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 08:48:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:49:42 - pico-train - INFO - Step 20900 -- 🔄 Training Metrics
2025-08-30 08:49:42 - pico-train - INFO - ├── Loss: 5.1041
2025-08-30 08:49:42 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 08:49:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:50:35 - pico-train - INFO - Step 21000 -- 🔄 Training Metrics
2025-08-30 08:50:35 - pico-train - INFO - ├── Loss: 5.1055
2025-08-30 08:50:35 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 08:50:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:51:28 - pico-train - INFO - Step 21100 -- 🔄 Training Metrics
2025-08-30 08:51:28 - pico-train - INFO - ├── Loss: 5.0964
2025-08-30 08:51:28 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 08:51:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:52:20 - pico-train - INFO - Step 21200 -- 🔄 Training Metrics
2025-08-30 08:52:20 - pico-train - INFO - ├── Loss: 5.1160
2025-08-30 08:52:20 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 08:52:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:53:14 - pico-train - INFO - Step 21300 -- 🔄 Training Metrics
2025-08-30 08:53:14 - pico-train - INFO - ├── Loss: 5.1048
2025-08-30 08:53:14 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 08:53:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:54:07 - pico-train - INFO - Step 21400 -- 🔄 Training Metrics
2025-08-30 08:54:07 - pico-train - INFO - ├── Loss: 5.0806
2025-08-30 08:54:07 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 08:54:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:55:00 - pico-train - INFO - Step 21500 -- 🔄 Training Metrics
2025-08-30 08:55:00 - pico-train - INFO - ├── Loss: 5.1073
2025-08-30 08:55:00 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 08:55:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:55:53 - pico-train - INFO - Step 21600 -- 🔄 Training Metrics
2025-08-30 08:55:53 - pico-train - INFO - ├── Loss: 5.0786
2025-08-30 08:55:53 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 08:55:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:56:47 - pico-train - INFO - Step 21700 -- 🔄 Training Metrics
2025-08-30 08:56:47 - pico-train - INFO - ├── Loss: 5.0965
2025-08-30 08:56:47 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 08:56:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:57:41 - pico-train - INFO - Step 21800 -- 🔄 Training Metrics
2025-08-30 08:57:41 - pico-train - INFO - ├── Loss: 5.1086
2025-08-30 08:57:41 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 08:57:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:58:33 - pico-train - INFO - Step 21900 -- 🔄 Training Metrics
2025-08-30 08:58:33 - pico-train - INFO - ├── Loss: 5.1030
2025-08-30 08:58:33 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 08:58:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:59:24 - pico-train - INFO - Step 22000 -- 💾 Saving Checkpoint
2025-08-30 09:01:28 - pico-train - INFO - Step 22000 -- 📊 Evaluation Results
2025-08-30 09:01:28 - pico-train - INFO - └── paloma: inf
2025-08-30 09:01:31 - pico-train - INFO - Step 22000 -- 🔄 Training Metrics
2025-08-30 09:01:31 - pico-train - INFO - ├── Loss: 5.0792
2025-08-30 09:01:31 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 09:01:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:01:31 - pico-train - INFO - Step 22000 -- 📈 Saving Learning Dynamics
2025-08-30 09:02:29 - pico-train - INFO - Step 22100 -- 🔄 Training Metrics
2025-08-30 09:02:29 - pico-train - INFO - ├── Loss: 5.0856
2025-08-30 09:02:29 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 09:02:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:03:21 - pico-train - INFO - Step 22200 -- 🔄 Training Metrics
2025-08-30 09:03:21 - pico-train - INFO - ├── Loss: 5.0824
2025-08-30 09:03:21 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 09:03:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:04:14 - pico-train - INFO - Step 22300 -- 🔄 Training Metrics
2025-08-30 09:04:14 - pico-train - INFO - ├── Loss: 5.0941
2025-08-30 09:04:14 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 09:04:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:05:06 - pico-train - INFO - Step 22400 -- 🔄 Training Metrics
2025-08-30 09:05:06 - pico-train - INFO - ├── Loss: 5.0779
2025-08-30 09:05:06 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 09:05:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:05:58 - pico-train - INFO - Step 22500 -- 🔄 Training Metrics
2025-08-30 09:05:58 - pico-train - INFO - ├── Loss: 5.0808
2025-08-30 09:05:58 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 09:05:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:06:50 - pico-train - INFO - Step 22600 -- 🔄 Training Metrics
2025-08-30 09:06:50 - pico-train - INFO - ├── Loss: 5.0870
2025-08-30 09:06:50 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 09:06:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:07:42 - pico-train - INFO - Step 22700 -- 🔄 Training Metrics
2025-08-30 09:07:42 - pico-train - INFO - ├── Loss: 5.0751
2025-08-30 09:07:42 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 09:07:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:08:35 - pico-train - INFO - Step 22800 -- 🔄 Training Metrics
2025-08-30 09:08:35 - pico-train - INFO - ├── Loss: 5.0664
2025-08-30 09:08:35 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 09:08:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:09:27 - pico-train - INFO - Step 22900 -- 🔄 Training Metrics
2025-08-30 09:09:27 - pico-train - INFO - ├── Loss: 5.0716
2025-08-30 09:09:27 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 09:09:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:10:19 - pico-train - INFO - Step 23000 -- 🔄 Training Metrics
2025-08-30 09:10:19 - pico-train - INFO - ├── Loss: 5.0720
2025-08-30 09:10:19 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 09:10:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:11:12 - pico-train - INFO - Step 23100 -- 🔄 Training Metrics
2025-08-30 09:11:12 - pico-train - INFO - ├── Loss: 5.0657
2025-08-30 09:11:12 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 09:11:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:12:04 - pico-train - INFO - Step 23200 -- 🔄 Training Metrics
2025-08-30 09:12:04 - pico-train - INFO - ├── Loss: 5.0631
2025-08-30 09:12:04 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 09:12:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:12:57 - pico-train - INFO - Step 23300 -- 🔄 Training Metrics
2025-08-30 09:12:57 - pico-train - INFO - ├── Loss: 5.0873
2025-08-30 09:12:57 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 09:12:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:13:48 - pico-train - INFO - Step 23400 -- 🔄 Training Metrics
2025-08-30 09:13:48 - pico-train - INFO - ├── Loss: 5.0613
2025-08-30 09:13:48 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 09:13:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:14:41 - pico-train - INFO - Step 23500 -- 🔄 Training Metrics
2025-08-30 09:14:41 - pico-train - INFO - ├── Loss: 5.0463
2025-08-30 09:14:41 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 09:14:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:15:33 - pico-train - INFO - Step 23600 -- 🔄 Training Metrics
2025-08-30 09:15:33 - pico-train - INFO - ├── Loss: 5.0611
2025-08-30 09:15:33 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 09:15:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:16:26 - pico-train - INFO - Step 23700 -- 🔄 Training Metrics
2025-08-30 09:16:26 - pico-train - INFO - ├── Loss: 5.0502
2025-08-30 09:16:26 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 09:16:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:17:18 - pico-train - INFO - Step 23800 -- 🔄 Training Metrics
2025-08-30 09:17:18 - pico-train - INFO - ├── Loss: 5.0518
2025-08-30 09:17:18 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 09:17:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:18:10 - pico-train - INFO - Step 23900 -- 🔄 Training Metrics
2025-08-30 09:18:10 - pico-train - INFO - ├── Loss: 5.0439
2025-08-30 09:18:10 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 09:18:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:19:02 - pico-train - INFO - Step 24000 -- 💾 Saving Checkpoint
2025-08-30 09:21:02 - pico-train - INFO - Step 24000 -- 📊 Evaluation Results
2025-08-30 09:21:02 - pico-train - INFO - └── paloma: inf
2025-08-30 09:21:04 - pico-train - INFO - Step 24000 -- 🔄 Training Metrics
2025-08-30 09:21:04 - pico-train - INFO - ├── Loss: 5.0522
2025-08-30 09:21:04 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 09:21:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:21:04 - pico-train - INFO - Step 24000 -- 📈 Saving Learning Dynamics
2025-08-30 09:22:01 - pico-train - INFO - Step 24100 -- 🔄 Training Metrics
2025-08-30 09:22:01 - pico-train - INFO - ├── Loss: 5.0477
2025-08-30 09:22:01 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 09:22:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:22:54 - pico-train - INFO - Step 24200 -- 🔄 Training Metrics
2025-08-30 09:22:54 - pico-train - INFO - ├── Loss: 5.0433
2025-08-30 09:22:54 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 09:22:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:23:46 - pico-train - INFO - Step 24300 -- 🔄 Training Metrics
2025-08-30 09:23:46 - pico-train - INFO - ├── Loss: 5.0522
2025-08-30 09:23:46 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 09:23:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:24:39 - pico-train - INFO - Step 24400 -- 🔄 Training Metrics
2025-08-30 09:24:39 - pico-train - INFO - ├── Loss: 5.0452
2025-08-30 09:24:39 - pico-train - INFO - ├── Learning Rate: 1.75e-04
2025-08-30 09:24:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:25:32 - pico-train - INFO - Step 24500 -- 🔄 Training Metrics
2025-08-30 09:25:32 - pico-train - INFO - ├── Loss: 5.0629
2025-08-30 09:25:32 - pico-train - INFO - ├── Learning Rate: 1.75e-04
2025-08-30 09:25:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:26:24 - pico-train - INFO - Step 24600 -- 🔄 Training Metrics
2025-08-30 09:26:24 - pico-train - INFO - ├── Loss: 5.0439
2025-08-30 09:26:24 - pico-train - INFO - ├── Learning Rate: 1.75e-04
2025-08-30 09:26:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:27:16 - pico-train - INFO - Step 24700 -- 🔄 Training Metrics
2025-08-30 09:27:16 - pico-train - INFO - ├── Loss: 5.0322
2025-08-30 09:27:16 - pico-train - INFO - ├── Learning Rate: 1.75e-04
2025-08-30 09:27:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:28:08 - pico-train - INFO - Step 24800 -- 🔄 Training Metrics
2025-08-30 09:28:08 - pico-train - INFO - ├── Loss: 5.0504
2025-08-30 09:28:08 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 09:28:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:29:00 - pico-train - INFO - Step 24900 -- 🔄 Training Metrics
2025-08-30 09:29:00 - pico-train - INFO - ├── Loss: 5.0361
2025-08-30 09:29:00 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 09:29:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:29:52 - pico-train - INFO - Step 25000 -- 🔄 Training Metrics
2025-08-30 09:29:52 - pico-train - INFO - ├── Loss: 5.0214
2025-08-30 09:29:52 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 09:29:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:30:44 - pico-train - INFO - Step 25100 -- 🔄 Training Metrics
2025-08-30 09:30:44 - pico-train - INFO - ├── Loss: 5.0437
2025-08-30 09:30:44 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 09:30:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:31:37 - pico-train - INFO - Step 25200 -- 🔄 Training Metrics
2025-08-30 09:31:37 - pico-train - INFO - ├── Loss: 5.0433
2025-08-30 09:31:37 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 09:31:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:32:28 - pico-train - INFO - Step 25300 -- 🔄 Training Metrics
2025-08-30 09:32:28 - pico-train - INFO - ├── Loss: 5.0522
2025-08-30 09:32:28 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 09:32:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:33:21 - pico-train - INFO - Step 25400 -- 🔄 Training Metrics
2025-08-30 09:33:21 - pico-train - INFO - ├── Loss: 5.0399
2025-08-30 09:33:21 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 09:33:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:34:14 - pico-train - INFO - Step 25500 -- 🔄 Training Metrics
2025-08-30 09:34:14 - pico-train - INFO - ├── Loss: 5.0391
2025-08-30 09:34:14 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 09:34:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:35:07 - pico-train - INFO - Step 25600 -- 🔄 Training Metrics
2025-08-30 09:35:07 - pico-train - INFO - ├── Loss: 5.0489
2025-08-30 09:35:07 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 09:35:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:35:59 - pico-train - INFO - Step 25700 -- 🔄 Training Metrics
2025-08-30 09:35:59 - pico-train - INFO - ├── Loss: 5.0378
2025-08-30 09:35:59 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 09:35:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:36:52 - pico-train - INFO - Step 25800 -- 🔄 Training Metrics
2025-08-30 09:36:52 - pico-train - INFO - ├── Loss: 5.0416
2025-08-30 09:36:52 - pico-train - INFO - ├── Learning Rate: 1.72e-04
2025-08-30 09:36:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:37:44 - pico-train - INFO - Step 25900 -- 🔄 Training Metrics
2025-08-30 09:37:44 - pico-train - INFO - ├── Loss: 5.0106
2025-08-30 09:37:44 - pico-train - INFO - ├── Learning Rate: 1.72e-04
2025-08-30 09:37:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:38:36 - pico-train - INFO - Step 26000 -- 💾 Saving Checkpoint
2025-08-30 09:40:36 - pico-train - INFO - Step 26000 -- 📊 Evaluation Results
2025-08-30 09:40:36 - pico-train - INFO - └── paloma: inf
2025-08-30 09:40:37 - pico-train - INFO - Step 26000 -- 🔄 Training Metrics
2025-08-30 09:40:37 - pico-train - INFO - ├── Loss: 5.0129
2025-08-30 09:40:37 - pico-train - INFO - ├── Learning Rate: 1.72e-04
2025-08-30 09:40:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:40:37 - pico-train - INFO - Step 26000 -- 📈 Saving Learning Dynamics
2025-08-30 09:41:35 - pico-train - INFO - Step 26100 -- 🔄 Training Metrics
2025-08-30 09:41:35 - pico-train - INFO - ├── Loss: 5.0560
2025-08-30 09:41:35 - pico-train - INFO - ├── Learning Rate: 1.72e-04
2025-08-30 09:41:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:42:27 - pico-train - INFO - Step 26200 -- 🔄 Training Metrics
2025-08-30 09:42:27 - pico-train - INFO - ├── Loss: 5.0231
2025-08-30 09:42:27 - pico-train - INFO - ├── Learning Rate: 1.71e-04
2025-08-30 09:42:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:43:20 - pico-train - INFO - Step 26300 -- 🔄 Training Metrics
2025-08-30 09:43:20 - pico-train - INFO - ├── Loss: 5.0203
2025-08-30 09:43:20 - pico-train - INFO - ├── Learning Rate: 1.71e-04
2025-08-30 09:43:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:44:13 - pico-train - INFO - Step 26400 -- 🔄 Training Metrics
2025-08-30 09:44:13 - pico-train - INFO - ├── Loss: 5.0435
2025-08-30 09:44:13 - pico-train - INFO - ├── Learning Rate: 1.71e-04
2025-08-30 09:44:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:45:05 - pico-train - INFO - Step 26500 -- 🔄 Training Metrics
2025-08-30 09:45:05 - pico-train - INFO - ├── Loss: 5.0079
2025-08-30 09:45:05 - pico-train - INFO - ├── Learning Rate: 1.71e-04
2025-08-30 09:45:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:45:59 - pico-train - INFO - Step 26600 -- 🔄 Training Metrics
2025-08-30 09:45:59 - pico-train - INFO - ├── Loss: 5.0246
2025-08-30 09:45:59 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 09:45:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:46:50 - pico-train - INFO - Step 26700 -- 🔄 Training Metrics
2025-08-30 09:46:50 - pico-train - INFO - ├── Loss: 5.0400
2025-08-30 09:46:50 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 09:46:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:47:43 - pico-train - INFO - Step 26800 -- 🔄 Training Metrics
2025-08-30 09:47:43 - pico-train - INFO - ├── Loss: 5.0359
2025-08-30 09:47:43 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 09:47:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:48:36 - pico-train - INFO - Step 26900 -- 🔄 Training Metrics
2025-08-30 09:48:36 - pico-train - INFO - ├── Loss: 4.9974
2025-08-30 09:48:36 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 09:48:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:49:28 - pico-train - INFO - Step 27000 -- 🔄 Training Metrics
2025-08-30 09:49:28 - pico-train - INFO - ├── Loss: 5.0186
2025-08-30 09:49:28 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 09:49:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:50:21 - pico-train - INFO - Step 27100 -- 🔄 Training Metrics
2025-08-30 09:50:21 - pico-train - INFO - ├── Loss: 5.0101
2025-08-30 09:50:21 - pico-train - INFO - ├── Learning Rate: 1.69e-04
2025-08-30 09:50:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:51:13 - pico-train - INFO - Step 27200 -- 🔄 Training Metrics
2025-08-30 09:51:13 - pico-train - INFO - ├── Loss: 5.0228
2025-08-30 09:51:13 - pico-train - INFO - ├── Learning Rate: 1.69e-04
2025-08-30 09:51:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:52:05 - pico-train - INFO - Step 27300 -- 🔄 Training Metrics
2025-08-30 09:52:05 - pico-train - INFO - ├── Loss: 5.0295
2025-08-30 09:52:05 - pico-train - INFO - ├── Learning Rate: 1.69e-04
2025-08-30 09:52:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:52:58 - pico-train - INFO - Step 27400 -- 🔄 Training Metrics
2025-08-30 09:52:58 - pico-train - INFO - ├── Loss: 5.0014
2025-08-30 09:52:58 - pico-train - INFO - ├── Learning Rate: 1.69e-04
2025-08-30 09:52:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:53:50 - pico-train - INFO - Step 27500 -- 🔄 Training Metrics
2025-08-30 09:53:50 - pico-train - INFO - ├── Loss: 5.0173
2025-08-30 09:53:50 - pico-train - INFO - ├── Learning Rate: 1.68e-04
2025-08-30 09:53:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:54:42 - pico-train - INFO - Step 27600 -- 🔄 Training Metrics
2025-08-30 09:54:42 - pico-train - INFO - ├── Loss: 4.9924
2025-08-30 09:54:42 - pico-train - INFO - ├── Learning Rate: 1.68e-04
2025-08-30 09:54:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:55:34 - pico-train - INFO - Step 27700 -- 🔄 Training Metrics
2025-08-30 09:55:34 - pico-train - INFO - ├── Loss: 5.0067
2025-08-30 09:55:34 - pico-train - INFO - ├── Learning Rate: 1.68e-04
2025-08-30 09:55:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:56:26 - pico-train - INFO - Step 27800 -- 🔄 Training Metrics
2025-08-30 09:56:26 - pico-train - INFO - ├── Loss: 5.0232
2025-08-30 09:56:26 - pico-train - INFO - ├── Learning Rate: 1.68e-04
2025-08-30 09:56:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:57:19 - pico-train - INFO - Step 27900 -- 🔄 Training Metrics
2025-08-30 09:57:19 - pico-train - INFO - ├── Loss: 5.0190
2025-08-30 09:57:19 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 09:57:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:58:10 - pico-train - INFO - Step 28000 -- 💾 Saving Checkpoint
2025-08-30 10:00:15 - pico-train - INFO - Step 28000 -- 📊 Evaluation Results
2025-08-30 10:00:15 - pico-train - INFO - └── paloma: inf
2025-08-30 10:00:17 - pico-train - INFO - Step 28000 -- 🔄 Training Metrics
2025-08-30 10:00:17 - pico-train - INFO - ├── Loss: 5.0087
2025-08-30 10:00:17 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 10:00:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:00:17 - pico-train - INFO - Step 28000 -- 📈 Saving Learning Dynamics
2025-08-30 10:01:14 - pico-train - INFO - Step 28100 -- 🔄 Training Metrics
2025-08-30 10:01:14 - pico-train - INFO - ├── Loss: 4.9986
2025-08-30 10:01:14 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 10:01:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:02:07 - pico-train - INFO - Step 28200 -- 🔄 Training Metrics
2025-08-30 10:02:07 - pico-train - INFO - ├── Loss: 5.0120
2025-08-30 10:02:07 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 10:02:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:02:59 - pico-train - INFO - Step 28300 -- 🔄 Training Metrics
2025-08-30 10:02:59 - pico-train - INFO - ├── Loss: 4.9886
2025-08-30 10:02:59 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 10:02:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:03:51 - pico-train - INFO - Step 28400 -- 🔄 Training Metrics
2025-08-30 10:03:51 - pico-train - INFO - ├── Loss: 5.0119
2025-08-30 10:03:51 - pico-train - INFO - ├── Learning Rate: 1.66e-04
2025-08-30 10:03:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:04:44 - pico-train - INFO - Step 28500 -- 🔄 Training Metrics
2025-08-30 10:04:44 - pico-train - INFO - ├── Loss: 5.0120
2025-08-30 10:04:44 - pico-train - INFO - ├── Learning Rate: 1.66e-04
2025-08-30 10:04:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:05:36 - pico-train - INFO - Step 28600 -- 🔄 Training Metrics
2025-08-30 10:05:36 - pico-train - INFO - ├── Loss: 5.0072
2025-08-30 10:05:36 - pico-train - INFO - ├── Learning Rate: 1.66e-04
2025-08-30 10:05:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:06:28 - pico-train - INFO - Step 28700 -- 🔄 Training Metrics
2025-08-30 10:06:28 - pico-train - INFO - ├── Loss: 5.0099
2025-08-30 10:06:28 - pico-train - INFO - ├── Learning Rate: 1.66e-04
2025-08-30 10:06:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:07:21 - pico-train - INFO - Step 28800 -- 🔄 Training Metrics
2025-08-30 10:07:21 - pico-train - INFO - ├── Loss: 5.0086
2025-08-30 10:07:21 - pico-train - INFO - ├── Learning Rate: 1.65e-04
2025-08-30 10:07:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:08:13 - pico-train - INFO - Step 28900 -- 🔄 Training Metrics
2025-08-30 10:08:13 - pico-train - INFO - ├── Loss: 4.9908
2025-08-30 10:08:13 - pico-train - INFO - ├── Learning Rate: 1.65e-04
2025-08-30 10:08:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:09:06 - pico-train - INFO - Step 29000 -- 🔄 Training Metrics
2025-08-30 10:09:06 - pico-train - INFO - ├── Loss: 4.9947
2025-08-30 10:09:06 - pico-train - INFO - ├── Learning Rate: 1.65e-04
2025-08-30 10:09:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:09:59 - pico-train - INFO - Step 29100 -- 🔄 Training Metrics
2025-08-30 10:09:59 - pico-train - INFO - ├── Loss: 5.0001
2025-08-30 10:09:59 - pico-train - INFO - ├── Learning Rate: 1.65e-04
2025-08-30 10:09:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:10:52 - pico-train - INFO - Step 29200 -- 🔄 Training Metrics
2025-08-30 10:10:52 - pico-train - INFO - ├── Loss: 4.9991
2025-08-30 10:10:52 - pico-train - INFO - ├── Learning Rate: 1.64e-04
2025-08-30 10:10:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:11:44 - pico-train - INFO - Step 29300 -- 🔄 Training Metrics
2025-08-30 10:11:44 - pico-train - INFO - ├── Loss: 4.9885
2025-08-30 10:11:44 - pico-train - INFO - ├── Learning Rate: 1.64e-04
2025-08-30 10:11:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:12:36 - pico-train - INFO - Step 29400 -- 🔄 Training Metrics
2025-08-30 10:12:36 - pico-train - INFO - ├── Loss: 4.9985
2025-08-30 10:12:36 - pico-train - INFO - ├── Learning Rate: 1.64e-04
2025-08-30 10:12:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:13:29 - pico-train - INFO - Step 29500 -- 🔄 Training Metrics
2025-08-30 10:13:29 - pico-train - INFO - ├── Loss: 4.9928
2025-08-30 10:13:29 - pico-train - INFO - ├── Learning Rate: 1.64e-04
2025-08-30 10:13:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:14:22 - pico-train - INFO - Step 29600 -- 🔄 Training Metrics
2025-08-30 10:14:22 - pico-train - INFO - ├── Loss: 5.0076
2025-08-30 10:14:22 - pico-train - INFO - ├── Learning Rate: 1.63e-04
2025-08-30 10:14:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:15:16 - pico-train - INFO - Step 29700 -- 🔄 Training Metrics
2025-08-30 10:15:16 - pico-train - INFO - ├── Loss: 4.9919
2025-08-30 10:15:16 - pico-train - INFO - ├── Learning Rate: 1.63e-04
2025-08-30 10:15:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:16:09 - pico-train - INFO - Step 29800 -- 🔄 Training Metrics
2025-08-30 10:16:09 - pico-train - INFO - ├── Loss: 5.0125
2025-08-30 10:16:09 - pico-train - INFO - ├── Learning Rate: 1.63e-04
2025-08-30 10:16:09 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:17:01 - pico-train - INFO - Step 29900 -- ๐Ÿ”„ Training Metrics
2025-08-30 10:17:01 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.9857
2025-08-30 10:17:01 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.63e-04
2025-08-30 10:17:01 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:17:53 - pico-train - INFO - Step 30000 -- 💾 Saving Checkpoint
2025-08-30 10:20:02 - pico-train - INFO - Step 30000 -- 📊 Evaluation Results
2025-08-30 10:20:02 - pico-train - INFO - └── paloma: inf
2025-08-30 10:20:04 - pico-train - INFO - Step 30000 -- 🔄 Training Metrics
2025-08-30 10:20:04 - pico-train - INFO - ├── Loss: 4.9901
2025-08-30 10:20:04 - pico-train - INFO - ├── Learning Rate: 1.62e-04
2025-08-30 10:20:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:20:04 - pico-train - INFO - Step 30000 -- 📈 Saving Learning Dynamics
2025-08-30 10:21:01 - pico-train - INFO - Step 30100 -- 🔄 Training Metrics
2025-08-30 10:21:01 - pico-train - INFO - ├── Loss: 4.9848
2025-08-30 10:21:01 - pico-train - INFO - ├── Learning Rate: 1.62e-04
2025-08-30 10:21:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:21:53 - pico-train - INFO - Step 30200 -- 🔄 Training Metrics
2025-08-30 10:21:53 - pico-train - INFO - ├── Loss: 4.9702
2025-08-30 10:21:53 - pico-train - INFO - ├── Learning Rate: 1.62e-04
2025-08-30 10:21:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:22:45 - pico-train - INFO - Step 30300 -- 🔄 Training Metrics
2025-08-30 10:22:45 - pico-train - INFO - ├── Loss: 4.9776
2025-08-30 10:22:45 - pico-train - INFO - ├── Learning Rate: 1.62e-04
2025-08-30 10:22:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:23:37 - pico-train - INFO - Step 30400 -- 🔄 Training Metrics
2025-08-30 10:23:37 - pico-train - INFO - ├── Loss: 4.9791
2025-08-30 10:23:37 - pico-train - INFO - ├── Learning Rate: 1.61e-04
2025-08-30 10:23:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:24:31 - pico-train - INFO - Step 30500 -- 🔄 Training Metrics
2025-08-30 10:24:31 - pico-train - INFO - ├── Loss: 4.9795
2025-08-30 10:24:31 - pico-train - INFO - ├── Learning Rate: 1.61e-04
2025-08-30 10:24:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:25:23 - pico-train - INFO - Step 30600 -- 🔄 Training Metrics
2025-08-30 10:25:23 - pico-train - INFO - ├── Loss: 4.9955
2025-08-30 10:25:23 - pico-train - INFO - ├── Learning Rate: 1.61e-04
2025-08-30 10:25:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:26:15 - pico-train - INFO - Step 30700 -- 🔄 Training Metrics
2025-08-30 10:26:15 - pico-train - INFO - ├── Loss: 4.9795
2025-08-30 10:26:15 - pico-train - INFO - ├── Learning Rate: 1.61e-04
2025-08-30 10:26:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:27:08 - pico-train - INFO - Step 30800 -- 🔄 Training Metrics
2025-08-30 10:27:08 - pico-train - INFO - ├── Loss: 4.9634
2025-08-30 10:27:08 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 10:27:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:28:01 - pico-train - INFO - Step 30900 -- 🔄 Training Metrics
2025-08-30 10:28:01 - pico-train - INFO - ├── Loss: 4.9775
2025-08-30 10:28:01 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 10:28:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:28:54 - pico-train - INFO - Step 31000 -- 🔄 Training Metrics
2025-08-30 10:28:54 - pico-train - INFO - ├── Loss: 4.9883
2025-08-30 10:28:54 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 10:28:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:29:46 - pico-train - INFO - Step 31100 -- 🔄 Training Metrics
2025-08-30 10:29:46 - pico-train - INFO - ├── Loss: 4.9492
2025-08-30 10:29:46 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 10:29:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:30:38 - pico-train - INFO - Step 31200 -- 🔄 Training Metrics
2025-08-30 10:30:38 - pico-train - INFO - ├── Loss: 4.9783
2025-08-30 10:30:38 - pico-train - INFO - ├── Learning Rate: 1.59e-04
2025-08-30 10:30:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:31:30 - pico-train - INFO - Step 31300 -- 🔄 Training Metrics
2025-08-30 10:31:30 - pico-train - INFO - ├── Loss: 4.9732
2025-08-30 10:31:30 - pico-train - INFO - ├── Learning Rate: 1.59e-04
2025-08-30 10:31:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:32:23 - pico-train - INFO - Step 31400 -- 🔄 Training Metrics
2025-08-30 10:32:23 - pico-train - INFO - ├── Loss: 4.9639
2025-08-30 10:32:23 - pico-train - INFO - ├── Learning Rate: 1.59e-04
2025-08-30 10:32:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:33:16 - pico-train - INFO - Step 31500 -- 🔄 Training Metrics
2025-08-30 10:33:16 - pico-train - INFO - ├── Loss: 4.9706
2025-08-30 10:33:16 - pico-train - INFO - ├── Learning Rate: 1.59e-04
2025-08-30 10:33:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:34:09 - pico-train - INFO - Step 31600 -- 🔄 Training Metrics
2025-08-30 10:34:09 - pico-train - INFO - ├── Loss: 4.9916
2025-08-30 10:34:09 - pico-train - INFO - ├── Learning Rate: 1.58e-04
2025-08-30 10:34:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:35:00 - pico-train - INFO - Step 31700 -- 🔄 Training Metrics
2025-08-30 10:35:00 - pico-train - INFO - ├── Loss: 4.9708
2025-08-30 10:35:00 - pico-train - INFO - ├── Learning Rate: 1.58e-04
2025-08-30 10:35:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:35:53 - pico-train - INFO - Step 31800 -- 🔄 Training Metrics
2025-08-30 10:35:53 - pico-train - INFO - ├── Loss: 4.9447
2025-08-30 10:35:53 - pico-train - INFO - ├── Learning Rate: 1.58e-04
2025-08-30 10:35:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:36:46 - pico-train - INFO - Step 31900 -- 🔄 Training Metrics
2025-08-30 10:36:46 - pico-train - INFO - ├── Loss: 4.9892
2025-08-30 10:36:46 - pico-train - INFO - ├── Learning Rate: 1.57e-04
2025-08-30 10:36:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:37:37 - pico-train - INFO - Step 32000 -- 💾 Saving Checkpoint
2025-08-30 10:39:53 - pico-train - INFO - Step 32000 -- 📊 Evaluation Results
2025-08-30 10:39:53 - pico-train - INFO - └── paloma: inf
2025-08-30 10:39:55 - pico-train - INFO - Step 32000 -- 🔄 Training Metrics
2025-08-30 10:39:55 - pico-train - INFO - ├── Loss: 4.9585
2025-08-30 10:39:55 - pico-train - INFO - ├── Learning Rate: 1.57e-04
2025-08-30 10:39:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:39:55 - pico-train - INFO - Step 32000 -- 📈 Saving Learning Dynamics
2025-08-30 10:40:52 - pico-train - INFO - Step 32100 -- 🔄 Training Metrics
2025-08-30 10:40:52 - pico-train - INFO - ├── Loss: 4.9840
2025-08-30 10:40:52 - pico-train - INFO - ├── Learning Rate: 1.57e-04
2025-08-30 10:40:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:41:44 - pico-train - INFO - Step 32200 -- 🔄 Training Metrics
2025-08-30 10:41:44 - pico-train - INFO - ├── Loss: 4.9498
2025-08-30 10:41:44 - pico-train - INFO - ├── Learning Rate: 1.57e-04
2025-08-30 10:41:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:42:37 - pico-train - INFO - Step 32300 -- 🔄 Training Metrics
2025-08-30 10:42:37 - pico-train - INFO - ├── Loss: 4.9513
2025-08-30 10:42:37 - pico-train - INFO - ├── Learning Rate: 1.56e-04
2025-08-30 10:42:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:43:29 - pico-train - INFO - Step 32400 -- 🔄 Training Metrics
2025-08-30 10:43:29 - pico-train - INFO - ├── Loss: 4.9563
2025-08-30 10:43:29 - pico-train - INFO - ├── Learning Rate: 1.56e-04
2025-08-30 10:43:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:44:21 - pico-train - INFO - Step 32500 -- 🔄 Training Metrics
2025-08-30 10:44:21 - pico-train - INFO - ├── Loss: 4.9439
2025-08-30 10:44:21 - pico-train - INFO - ├── Learning Rate: 1.56e-04
2025-08-30 10:44:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:45:13 - pico-train - INFO - Step 32600 -- 🔄 Training Metrics
2025-08-30 10:45:13 - pico-train - INFO - ├── Loss: 4.9768
2025-08-30 10:45:13 - pico-train - INFO - ├── Learning Rate: 1.56e-04
2025-08-30 10:45:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:46:06 - pico-train - INFO - Step 32700 -- 🔄 Training Metrics
2025-08-30 10:46:06 - pico-train - INFO - ├── Loss: 4.9640
2025-08-30 10:46:06 - pico-train - INFO - ├── Learning Rate: 1.55e-04
2025-08-30 10:46:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:46:59 - pico-train - INFO - Step 32800 -- 🔄 Training Metrics
2025-08-30 10:46:59 - pico-train - INFO - ├── Loss: 4.9570
2025-08-30 10:46:59 - pico-train - INFO - ├── Learning Rate: 1.55e-04
2025-08-30 10:46:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:47:51 - pico-train - INFO - Step 32900 -- 🔄 Training Metrics
2025-08-30 10:47:51 - pico-train - INFO - ├── Loss: 4.9885
2025-08-30 10:47:51 - pico-train - INFO - ├── Learning Rate: 1.55e-04
2025-08-30 10:47:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:48:44 - pico-train - INFO - Step 33000 -- 🔄 Training Metrics
2025-08-30 10:48:44 - pico-train - INFO - ├── Loss: 4.9528
2025-08-30 10:48:44 - pico-train - INFO - ├── Learning Rate: 1.55e-04
2025-08-30 10:48:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:49:35 - pico-train - INFO - Step 33100 -- 🔄 Training Metrics
2025-08-30 10:49:35 - pico-train - INFO - ├── Loss: 4.9572
2025-08-30 10:49:35 - pico-train - INFO - ├── Learning Rate: 1.54e-04
2025-08-30 10:49:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:50:28 - pico-train - INFO - Step 33200 -- 🔄 Training Metrics
2025-08-30 10:50:28 - pico-train - INFO - ├── Loss: 4.9850
2025-08-30 10:50:28 - pico-train - INFO - ├── Learning Rate: 1.54e-04
2025-08-30 10:50:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:51:21 - pico-train - INFO - Step 33300 -- 🔄 Training Metrics
2025-08-30 10:51:21 - pico-train - INFO - ├── Loss: 4.9591
2025-08-30 10:51:21 - pico-train - INFO - ├── Learning Rate: 1.54e-04
2025-08-30 10:51:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:52:14 - pico-train - INFO - Step 33400 -- 🔄 Training Metrics
2025-08-30 10:52:14 - pico-train - INFO - ├── Loss: 4.9522
2025-08-30 10:52:14 - pico-train - INFO - ├── Learning Rate: 1.53e-04
2025-08-30 10:52:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:53:08 - pico-train - INFO - Step 33500 -- 🔄 Training Metrics
2025-08-30 10:53:08 - pico-train - INFO - ├── Loss: 4.9655
2025-08-30 10:53:08 - pico-train - INFO - ├── Learning Rate: 1.53e-04
2025-08-30 10:53:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:54:01 - pico-train - INFO - Step 33600 -- 🔄 Training Metrics
2025-08-30 10:54:01 - pico-train - INFO - ├── Loss: 4.9481
2025-08-30 10:54:01 - pico-train - INFO - ├── Learning Rate: 1.53e-04
2025-08-30 10:54:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:54:54 - pico-train - INFO - Step 33700 -- 🔄 Training Metrics
2025-08-30 10:54:54 - pico-train - INFO - ├── Loss: 4.9614
2025-08-30 10:54:54 - pico-train - INFO - ├── Learning Rate: 1.53e-04
2025-08-30 10:54:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:55:48 - pico-train - INFO - Step 33800 -- 🔄 Training Metrics
2025-08-30 10:55:48 - pico-train - INFO - ├── Loss: 4.9347
2025-08-30 10:55:48 - pico-train - INFO - ├── Learning Rate: 1.52e-04
2025-08-30 10:55:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:56:40 - pico-train - INFO - Step 33900 -- 🔄 Training Metrics
2025-08-30 10:56:40 - pico-train - INFO - ├── Loss: 4.9518
2025-08-30 10:56:40 - pico-train - INFO - ├── Learning Rate: 1.52e-04
2025-08-30 10:56:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:57:34 - pico-train - INFO - Step 34000 -- 💾 Saving Checkpoint
2025-08-30 10:59:40 - pico-train - INFO - Step 34000 -- 📊 Evaluation Results
2025-08-30 10:59:40 - pico-train - INFO - └── paloma: inf
2025-08-30 10:59:42 - pico-train - INFO - Step 34000 -- 🔄 Training Metrics
2025-08-30 10:59:42 - pico-train - INFO - ├── Loss: 4.9519
2025-08-30 10:59:42 - pico-train - INFO - ├── Learning Rate: 1.52e-04
2025-08-30 10:59:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:59:42 - pico-train - INFO - Step 34000 -- 📈 Saving Learning Dynamics
2025-08-30 11:00:45 - pico-train - INFO - Step 34100 -- 🔄 Training Metrics
2025-08-30 11:00:45 - pico-train - INFO - ├── Loss: 4.9583
2025-08-30 11:00:45 - pico-train - INFO - ├── Learning Rate: 1.52e-04
2025-08-30 11:00:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:01:38 - pico-train - INFO - Step 34200 -- 🔄 Training Metrics
2025-08-30 11:01:38 - pico-train - INFO - ├── Loss: 4.9762
2025-08-30 11:01:38 - pico-train - INFO - ├── Learning Rate: 1.51e-04
2025-08-30 11:01:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:02:32 - pico-train - INFO - Step 34300 -- 🔄 Training Metrics
2025-08-30 11:02:32 - pico-train - INFO - ├── Loss: 4.9671
2025-08-30 11:02:32 - pico-train - INFO - ├── Learning Rate: 1.51e-04
2025-08-30 11:02:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:03:24 - pico-train - INFO - Step 34400 -- 🔄 Training Metrics
2025-08-30 11:03:24 - pico-train - INFO - ├── Loss: 4.9657
2025-08-30 11:03:24 - pico-train - INFO - ├── Learning Rate: 1.51e-04
2025-08-30 11:03:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:04:18 - pico-train - INFO - Step 34500 -- 🔄 Training Metrics
2025-08-30 11:04:18 - pico-train - INFO - ├── Loss: 4.9669
2025-08-30 11:04:18 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-30 11:04:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:05:13 - pico-train - INFO - Step 34600 -- 🔄 Training Metrics
2025-08-30 11:05:13 - pico-train - INFO - ├── Loss: 4.9462
2025-08-30 11:05:13 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-30 11:05:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:06:05 - pico-train - INFO - Step 34700 -- 🔄 Training Metrics
2025-08-30 11:06:05 - pico-train - INFO - ├── Loss: 4.9412
2025-08-30 11:06:05 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-30 11:06:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:06:58 - pico-train - INFO - Step 34800 -- 🔄 Training Metrics
2025-08-30 11:06:58 - pico-train - INFO - ├── Loss: 4.9356
2025-08-30 11:06:58 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-30 11:06:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:07:52 - pico-train - INFO - Step 34900 -- 🔄 Training Metrics
2025-08-30 11:07:52 - pico-train - INFO - ├── Loss: 4.9345
2025-08-30 11:07:52 - pico-train - INFO - ├── Learning Rate: 1.49e-04
2025-08-30 11:07:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:08:45 - pico-train - INFO - Step 35000 -- 🔄 Training Metrics
2025-08-30 11:08:45 - pico-train - INFO - ├── Loss: 4.9402
2025-08-30 11:08:45 - pico-train - INFO - ├── Learning Rate: 1.49e-04
2025-08-30 11:08:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:09:39 - pico-train - INFO - Step 35100 -- 🔄 Training Metrics
2025-08-30 11:09:39 - pico-train - INFO - ├── Loss: 4.9421
2025-08-30 11:09:39 - pico-train - INFO - ├── Learning Rate: 1.49e-04
2025-08-30 11:09:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:10:33 - pico-train - INFO - Step 35200 -- 🔄 Training Metrics
2025-08-30 11:10:33 - pico-train - INFO - ├── Loss: 4.9274
2025-08-30 11:10:33 - pico-train - INFO - ├── Learning Rate: 1.49e-04
2025-08-30 11:10:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:11:25 - pico-train - INFO - Step 35300 -- 🔄 Training Metrics
2025-08-30 11:11:25 - pico-train - INFO - ├── Loss: 4.9680
2025-08-30 11:11:25 - pico-train - INFO - ├── Learning Rate: 1.48e-04
2025-08-30 11:11:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:12:20 - pico-train - INFO - Step 35400 -- 🔄 Training Metrics
2025-08-30 11:12:20 - pico-train - INFO - ├── Loss: 4.9663
2025-08-30 11:12:20 - pico-train - INFO - ├── Learning Rate: 1.48e-04
2025-08-30 11:12:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:13:13 - pico-train - INFO - Step 35500 -- 🔄 Training Metrics
2025-08-30 11:13:13 - pico-train - INFO - ├── Loss: 4.9410
2025-08-30 11:13:13 - pico-train - INFO - ├── Learning Rate: 1.48e-04
2025-08-30 11:13:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:14:07 - pico-train - INFO - Step 35600 -- 🔄 Training Metrics
2025-08-30 11:14:07 - pico-train - INFO - ├── Loss: 4.9500
2025-08-30 11:14:07 - pico-train - INFO - ├── Learning Rate: 1.47e-04
2025-08-30 11:14:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:15:01 - pico-train - INFO - Step 35700 -- 🔄 Training Metrics
2025-08-30 11:15:01 - pico-train - INFO - ├── Loss: 4.9276
2025-08-30 11:15:01 - pico-train - INFO - ├── Learning Rate: 1.47e-04
2025-08-30 11:15:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:15:54 - pico-train - INFO - Step 35800 -- 🔄 Training Metrics
2025-08-30 11:15:54 - pico-train - INFO - ├── Loss: 4.9673
2025-08-30 11:15:54 - pico-train - INFO - ├── Learning Rate: 1.47e-04
2025-08-30 11:15:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:16:48 - pico-train - INFO - Step 35900 -- 🔄 Training Metrics
2025-08-30 11:16:48 - pico-train - INFO - ├── Loss: 4.9396
2025-08-30 11:16:48 - pico-train - INFO - ├── Learning Rate: 1.47e-04
2025-08-30 11:16:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:17:41 - pico-train - INFO - Step 36000 -- 💾 Saving Checkpoint
2025-08-30 11:19:56 - pico-train - INFO - Step 36000 -- 📊 Evaluation Results
2025-08-30 11:19:56 - pico-train - INFO - └── paloma: inf
2025-08-30 11:19:58 - pico-train - INFO - Step 36000 -- 🔄 Training Metrics
2025-08-30 11:19:58 - pico-train - INFO - ├── Loss: 4.9613
2025-08-30 11:19:58 - pico-train - INFO - ├── Learning Rate: 1.46e-04
2025-08-30 11:19:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:19:58 - pico-train - INFO - Step 36000 -- 📈 Saving Learning Dynamics
2025-08-30 11:20:56 - pico-train - INFO - Step 36100 -- 🔄 Training Metrics
2025-08-30 11:20:56 - pico-train - INFO - ├── Loss: 4.9352
2025-08-30 11:20:56 - pico-train - INFO - ├── Learning Rate: 1.46e-04
2025-08-30 11:20:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:21:47 - pico-train - INFO - Step 36200 -- 🔄 Training Metrics
2025-08-30 11:21:47 - pico-train - INFO - ├── Loss: 4.9431
2025-08-30 11:21:47 - pico-train - INFO - ├── Learning Rate: 1.46e-04
2025-08-30 11:21:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:22:40 - pico-train - INFO - Step 36300 -- 🔄 Training Metrics
2025-08-30 11:22:40 - pico-train - INFO - ├── Loss: 4.9406
2025-08-30 11:22:40 - pico-train - INFO - ├── Learning Rate: 1.45e-04
2025-08-30 11:22:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:23:33 - pico-train - INFO - Step 36400 -- 🔄 Training Metrics
2025-08-30 11:23:33 - pico-train - INFO - ├── Loss: 4.9587
2025-08-30 11:23:33 - pico-train - INFO - ├── Learning Rate: 1.45e-04
2025-08-30 11:23:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:24:26 - pico-train - INFO - Step 36500 -- 🔄 Training Metrics
2025-08-30 11:24:26 - pico-train - INFO - ├── Loss: 4.9296
2025-08-30 11:24:26 - pico-train - INFO - ├── Learning Rate: 1.45e-04
2025-08-30 11:24:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:25:19 - pico-train - INFO - Step 36600 -- 🔄 Training Metrics
2025-08-30 11:25:19 - pico-train - INFO - ├── Loss: 4.9252
2025-08-30 11:25:19 - pico-train - INFO - ├── Learning Rate: 1.45e-04
2025-08-30 11:25:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:26:10 - pico-train - INFO - Step 36700 -- 🔄 Training Metrics
2025-08-30 11:26:10 - pico-train - INFO - ├── Loss: 4.9333
2025-08-30 11:26:10 - pico-train - INFO - ├── Learning Rate: 1.44e-04
2025-08-30 11:26:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:27:03 - pico-train - INFO - Step 36800 -- 🔄 Training Metrics
2025-08-30 11:27:03 - pico-train - INFO - ├── Loss: 4.9394
2025-08-30 11:27:03 - pico-train - INFO - ├── Learning Rate: 1.44e-04
2025-08-30 11:27:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:27:55 - pico-train - INFO - Step 36900 -- 🔄 Training Metrics
2025-08-30 11:27:55 - pico-train - INFO - ├── Loss: 4.9517
2025-08-30 11:27:55 - pico-train - INFO - ├── Learning Rate: 1.44e-04
2025-08-30 11:27:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:28:47 - pico-train - INFO - Step 37000 -- 🔄 Training Metrics
2025-08-30 11:28:47 - pico-train - INFO - ├── Loss: 4.9360
2025-08-30 11:28:47 - pico-train - INFO - ├── Learning Rate: 1.43e-04
2025-08-30 11:28:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:29:39 - pico-train - INFO - Step 37100 -- 🔄 Training Metrics
2025-08-30 11:29:39 - pico-train - INFO - ├── Loss: 4.9356
2025-08-30 11:29:39 - pico-train - INFO - ├── Learning Rate: 1.43e-04
2025-08-30 11:29:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:30:32 - pico-train - INFO - Step 37200 -- 🔄 Training Metrics
2025-08-30 11:30:32 - pico-train - INFO - ├── Loss: 4.9186
2025-08-30 11:30:32 - pico-train - INFO - ├── Learning Rate: 1.43e-04
2025-08-30 11:30:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:31:25 - pico-train - INFO - Step 37300 -- 🔄 Training Metrics
2025-08-30 11:31:25 - pico-train - INFO - ├── Loss: 4.9428
2025-08-30 11:31:25 - pico-train - INFO - ├── Learning Rate: 1.43e-04
2025-08-30 11:31:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:32:17 - pico-train - INFO - Step 37400 -- 🔄 Training Metrics
2025-08-30 11:32:17 - pico-train - INFO - ├── Loss: 4.9358
2025-08-30 11:32:17 - pico-train - INFO - ├── Learning Rate: 1.42e-04
2025-08-30 11:32:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:33:09 - pico-train - INFO - Step 37500 -- 🔄 Training Metrics
2025-08-30 11:33:09 - pico-train - INFO - ├── Loss: 4.9206
2025-08-30 11:33:09 - pico-train - INFO - ├── Learning Rate: 1.42e-04
2025-08-30 11:33:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:34:01 - pico-train - INFO - Step 37600 -- 🔄 Training Metrics
2025-08-30 11:34:01 - pico-train - INFO - ├── Loss: 4.9373
2025-08-30 11:34:01 - pico-train - INFO - ├── Learning Rate: 1.42e-04
2025-08-30 11:34:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:34:53 - pico-train - INFO - Step 37700 -- 🔄 Training Metrics
2025-08-30 11:34:53 - pico-train - INFO - ├── Loss: 4.9388
2025-08-30 11:34:53 - pico-train - INFO - ├── Learning Rate: 1.41e-04
2025-08-30 11:34:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:35:46 - pico-train - INFO - Step 37800 -- 🔄 Training Metrics
2025-08-30 11:35:46 - pico-train - INFO - ├── Loss: 4.9301
2025-08-30 11:35:46 - pico-train - INFO - ├── Learning Rate: 1.41e-04
2025-08-30 11:35:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:36:38 - pico-train - INFO - Step 37900 -- 🔄 Training Metrics
2025-08-30 11:36:38 - pico-train - INFO - ├── Loss: 4.9262
2025-08-30 11:36:38 - pico-train - INFO - ├── Learning Rate: 1.41e-04
2025-08-30 11:36:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:37:30 - pico-train - INFO - Step 38000 -- 💾 Saving Checkpoint
2025-08-30 11:39:29 - pico-train - INFO - Step 38000 -- 📊 Evaluation Results
2025-08-30 11:39:29 - pico-train - INFO - └── paloma: inf
2025-08-30 11:39:30 - pico-train - INFO - Step 38000 -- 🔄 Training Metrics
2025-08-30 11:39:30 - pico-train - INFO - ├── Loss: 4.9291
2025-08-30 11:39:30 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-30 11:39:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:39:30 - pico-train - INFO - Step 38000 -- 📈 Saving Learning Dynamics
2025-08-30 11:40:26 - pico-train - INFO - Step 38100 -- 🔄 Training Metrics
2025-08-30 11:40:26 - pico-train - INFO - ├── Loss: 4.9438
2025-08-30 11:40:26 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-30 11:40:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:41:19 - pico-train - INFO - Step 38200 -- 🔄 Training Metrics
2025-08-30 11:41:19 - pico-train - INFO - ├── Loss: 4.9386
2025-08-30 11:41:19 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-30 11:41:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:42:11 - pico-train - INFO - Step 38300 -- 🔄 Training Metrics
2025-08-30 11:42:11 - pico-train - INFO - ├── Loss: 4.9439
2025-08-30 11:42:11 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-30 11:42:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:43:04 - pico-train - INFO - Step 38400 -- 🔄 Training Metrics
2025-08-30 11:43:04 - pico-train - INFO - ├── Loss: 4.9467
2025-08-30 11:43:04 - pico-train - INFO - ├── Learning Rate: 1.39e-04
2025-08-30 11:43:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:43:56 - pico-train - INFO - Step 38500 -- 🔄 Training Metrics
2025-08-30 11:43:56 - pico-train - INFO - ├── Loss: 4.9535
2025-08-30 11:43:56 - pico-train - INFO - ├── Learning Rate: 1.39e-04
2025-08-30 11:43:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:44:49 - pico-train - INFO - Step 38600 -- 🔄 Training Metrics
2025-08-30 11:44:49 - pico-train - INFO - ├── Loss: 4.9062
2025-08-30 11:44:49 - pico-train - INFO - ├── Learning Rate: 1.39e-04
2025-08-30 11:44:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:45:42 - pico-train - INFO - Step 38700 -- 🔄 Training Metrics
2025-08-30 11:45:42 - pico-train - INFO - ├── Loss: 4.9308
2025-08-30 11:45:42 - pico-train - INFO - ├── Learning Rate: 1.38e-04
2025-08-30 11:45:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:46:35 - pico-train - INFO - Step 38800 -- 🔄 Training Metrics
2025-08-30 11:46:35 - pico-train - INFO - ├── Loss: 4.9026
2025-08-30 11:46:35 - pico-train - INFO - ├── Learning Rate: 1.38e-04
2025-08-30 11:46:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:47:27 - pico-train - INFO - Step 38900 -- 🔄 Training Metrics
2025-08-30 11:47:27 - pico-train - INFO - ├── Loss: 4.9223
2025-08-30 11:47:27 - pico-train - INFO - ├── Learning Rate: 1.38e-04
2025-08-30 11:47:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:48:19 - pico-train - INFO - Step 39000 -- 🔄 Training Metrics
2025-08-30 11:48:19 - pico-train - INFO - ├── Loss: 4.9212
2025-08-30 11:48:19 - pico-train - INFO - ├── Learning Rate: 1.38e-04
2025-08-30 11:48:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:49:11 - pico-train - INFO - Step 39100 -- 🔄 Training Metrics
2025-08-30 11:49:11 - pico-train - INFO - ├── Loss: 4.9162
2025-08-30 11:49:11 - pico-train - INFO - ├── Learning Rate: 1.37e-04
2025-08-30 11:49:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:50:03 - pico-train - INFO - Step 39200 -- 🔄 Training Metrics
2025-08-30 11:50:03 - pico-train - INFO - ├── Loss: 4.9188
2025-08-30 11:50:03 - pico-train - INFO - ├── Learning Rate: 1.37e-04
2025-08-30 11:50:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:50:55 - pico-train - INFO - Step 39300 -- 🔄 Training Metrics
2025-08-30 11:50:55 - pico-train - INFO - ├── Loss: 4.9239
2025-08-30 11:50:55 - pico-train - INFO - ├── Learning Rate: 1.37e-04
2025-08-30 11:50:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:51:48 - pico-train - INFO - Step 39400 -- 🔄 Training Metrics
2025-08-30 11:51:48 - pico-train - INFO - ├── Loss: 4.9182
2025-08-30 11:51:48 - pico-train - INFO - ├── Learning Rate: 1.36e-04
2025-08-30 11:51:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:52:40 - pico-train - INFO - Step 39500 -- 🔄 Training Metrics
2025-08-30 11:52:40 - pico-train - INFO - ├── Loss: 4.9126
2025-08-30 11:52:40 - pico-train - INFO - ├── Learning Rate: 1.36e-04
2025-08-30 11:52:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:53:32 - pico-train - INFO - Step 39600 -- 🔄 Training Metrics
2025-08-30 11:53:32 - pico-train - INFO - ├── Loss: 4.9245
2025-08-30 11:53:32 - pico-train - INFO - ├── Learning Rate: 1.36e-04
2025-08-30 11:53:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:54:25 - pico-train - INFO - Step 39700 -- 🔄 Training Metrics
2025-08-30 11:54:25 - pico-train - INFO - ├── Loss: 4.9458
2025-08-30 11:54:25 - pico-train - INFO - ├── Learning Rate: 1.35e-04
2025-08-30 11:54:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:55:17 - pico-train - INFO - Step 39800 -- 🔄 Training Metrics
2025-08-30 11:55:17 - pico-train - INFO - ├── Loss: 4.9233
2025-08-30 11:55:17 - pico-train - INFO - ├── Learning Rate: 1.35e-04
2025-08-30 11:55:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:56:09 - pico-train - INFO - Step 39900 -- 🔄 Training Metrics
2025-08-30 11:56:09 - pico-train - INFO - ├── Loss: 4.9178
2025-08-30 11:56:09 - pico-train - INFO - ├── Learning Rate: 1.35e-04
2025-08-30 11:56:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:57:01 - pico-train - INFO - Step 40000 -- 💾 Saving Checkpoint
2025-08-30 11:59:02 - pico-train - INFO - Step 40000 -- 📊 Evaluation Results
2025-08-30 11:59:02 - pico-train - INFO - └── paloma: inf
2025-08-30 11:59:03 - pico-train - INFO - Step 40000 -- 🔄 Training Metrics
2025-08-30 11:59:03 - pico-train - INFO - ├── Loss: 4.9145
2025-08-30 11:59:03 - pico-train - INFO - ├── Learning Rate: 1.35e-04
2025-08-30 11:59:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:59:03 - pico-train - INFO - Step 40000 -- 📈 Saving Learning Dynamics
2025-08-30 12:00:02 - pico-train - INFO - Step 40100 -- 🔄 Training Metrics
2025-08-30 12:00:02 - pico-train - INFO - ├── Loss: 4.9239
2025-08-30 12:00:02 - pico-train - INFO - ├── Learning Rate: 1.34e-04
2025-08-30 12:00:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:00:55 - pico-train - INFO - Step 40200 -- 🔄 Training Metrics
2025-08-30 12:00:55 - pico-train - INFO - ├── Loss: 4.9170
2025-08-30 12:00:55 - pico-train - INFO - ├── Learning Rate: 1.34e-04
2025-08-30 12:00:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:01:48 - pico-train - INFO - Step 40300 -- 🔄 Training Metrics
2025-08-30 12:01:48 - pico-train - INFO - ├── Loss: 4.9209
2025-08-30 12:01:48 - pico-train - INFO - ├── Learning Rate: 1.34e-04
2025-08-30 12:01:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:02:42 - pico-train - INFO - Step 40400 -- 🔄 Training Metrics
2025-08-30 12:02:42 - pico-train - INFO - ├── Loss: 4.9120
2025-08-30 12:02:42 - pico-train - INFO - ├── Learning Rate: 1.33e-04
2025-08-30 12:02:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:03:35 - pico-train - INFO - Step 40500 -- 🔄 Training Metrics
2025-08-30 12:03:35 - pico-train - INFO - ├── Loss: 4.9072
2025-08-30 12:03:35 - pico-train - INFO - ├── Learning Rate: 1.33e-04
2025-08-30 12:03:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:04:29 - pico-train - INFO - Step 40600 -- 🔄 Training Metrics
2025-08-30 12:04:29 - pico-train - INFO - ├── Loss: 4.9122
2025-08-30 12:04:29 - pico-train - INFO - ├── Learning Rate: 1.33e-04
2025-08-30 12:04:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:05:23 - pico-train - INFO - Step 40700 -- 🔄 Training Metrics
2025-08-30 12:05:23 - pico-train - INFO - ├── Loss: 4.9219
2025-08-30 12:05:23 - pico-train - INFO - ├── Learning Rate: 1.32e-04
2025-08-30 12:05:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:06:15 - pico-train - INFO - Step 40800 -- 🔄 Training Metrics
2025-08-30 12:06:15 - pico-train - INFO - ├── Loss: 4.8843
2025-08-30 12:06:15 - pico-train - INFO - ├── Learning Rate: 1.32e-04
2025-08-30 12:06:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:07:09 - pico-train - INFO - Step 40900 -- 🔄 Training Metrics
2025-08-30 12:07:09 - pico-train - INFO - ├── Loss: 4.9325
2025-08-30 12:07:09 - pico-train - INFO - ├── Learning Rate: 1.32e-04
2025-08-30 12:07:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:08:04 - pico-train - INFO - Step 41000 -- 🔄 Training Metrics
2025-08-30 12:08:04 - pico-train - INFO - ├── Loss: 4.8745
2025-08-30 12:08:04 - pico-train - INFO - ├── Learning Rate: 1.32e-04
2025-08-30 12:08:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:08:57 - pico-train - INFO - Step 41100 -- 🔄 Training Metrics
2025-08-30 12:08:57 - pico-train - INFO - ├── Loss: 4.9077
2025-08-30 12:08:57 - pico-train - INFO - ├── Learning Rate: 1.31e-04
2025-08-30 12:08:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:09:50 - pico-train - INFO - Step 41200 -- 🔄 Training Metrics
2025-08-30 12:09:50 - pico-train - INFO - ├── Loss: 4.9190
2025-08-30 12:09:50 - pico-train - INFO - ├── Learning Rate: 1.31e-04
2025-08-30 12:09:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:10:43 - pico-train - INFO - Step 41300 -- 🔄 Training Metrics
2025-08-30 12:10:43 - pico-train - INFO - ├── Loss: 4.9054
2025-08-30 12:10:43 - pico-train - INFO - ├── Learning Rate: 1.31e-04
2025-08-30 12:10:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:11:36 - pico-train - INFO - Step 41400 -- 🔄 Training Metrics
2025-08-30 12:11:36 - pico-train - INFO - ├── Loss: 4.8996
2025-08-30 12:11:36 - pico-train - INFO - ├── Learning Rate: 1.30e-04
2025-08-30 12:11:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:12:30 - pico-train - INFO - Step 41500 -- 🔄 Training Metrics
2025-08-30 12:12:30 - pico-train - INFO - ├── Loss: 4.9230
2025-08-30 12:12:30 - pico-train - INFO - ├── Learning Rate: 1.30e-04
2025-08-30 12:12:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:13:25 - pico-train - INFO - Step 41600 -- 🔄 Training Metrics
2025-08-30 12:13:25 - pico-train - INFO - ├── Loss: 4.9031
2025-08-30 12:13:25 - pico-train - INFO - ├── Learning Rate: 1.30e-04
2025-08-30 12:13:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:14:17 - pico-train - INFO - Step 41700 -- 🔄 Training Metrics
2025-08-30 12:14:17 - pico-train - INFO - ├── Loss: 4.9136
2025-08-30 12:14:17 - pico-train - INFO - ├── Learning Rate: 1.29e-04
2025-08-30 12:14:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:15:11 - pico-train - INFO - Step 41800 -- 🔄 Training Metrics
2025-08-30 12:15:11 - pico-train - INFO - ├── Loss: 4.9200
2025-08-30 12:15:11 - pico-train - INFO - ├── Learning Rate: 1.29e-04
2025-08-30 12:15:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:16:04 - pico-train - INFO - Step 41900 -- 🔄 Training Metrics
2025-08-30 12:16:04 - pico-train - INFO - ├── Loss: 4.8982
2025-08-30 12:16:04 - pico-train - INFO - ├── Learning Rate: 1.29e-04
2025-08-30 12:16:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:16:58 - pico-train - INFO - Step 42000 -- 💾 Saving Checkpoint
2025-08-30 12:19:10 - pico-train - INFO - Step 42000 -- 📊 Evaluation Results
2025-08-30 12:19:10 - pico-train - INFO - └── paloma: inf
2025-08-30 12:19:12 - pico-train - INFO - Step 42000 -- 🔄 Training Metrics
2025-08-30 12:19:12 - pico-train - INFO - ├── Loss: 4.8824
2025-08-30 12:19:12 - pico-train - INFO - ├── Learning Rate: 1.28e-04
2025-08-30 12:19:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:19:12 - pico-train - INFO - Step 42000 -- 📈 Saving Learning Dynamics
2025-08-30 12:20:09 - pico-train - INFO - Step 42100 -- 🔄 Training Metrics
2025-08-30 12:20:09 - pico-train - INFO - ├── Loss: 4.8966
2025-08-30 12:20:09 - pico-train - INFO - ├── Learning Rate: 1.28e-04
2025-08-30 12:20:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:21:01 - pico-train - INFO - Step 42200 -- 🔄 Training Metrics
2025-08-30 12:21:01 - pico-train - INFO - ├── Loss: 4.8971
2025-08-30 12:21:01 - pico-train - INFO - ├── Learning Rate: 1.28e-04
2025-08-30 12:21:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:21:55 - pico-train - INFO - Step 42300 -- 🔄 Training Metrics
2025-08-30 12:21:55 - pico-train - INFO - ├── Loss: 4.9271
2025-08-30 12:21:55 - pico-train - INFO - ├── Learning Rate: 1.28e-04
2025-08-30 12:21:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:22:47 - pico-train - INFO - Step 42400 -- 🔄 Training Metrics
2025-08-30 12:22:47 - pico-train - INFO - ├── Loss: 4.9046
2025-08-30 12:22:47 - pico-train - INFO - ├── Learning Rate: 1.27e-04
2025-08-30 12:22:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:23:40 - pico-train - INFO - Step 42500 -- 🔄 Training Metrics
2025-08-30 12:23:40 - pico-train - INFO - ├── Loss: 4.9010
2025-08-30 12:23:40 - pico-train - INFO - ├── Learning Rate: 1.27e-04
2025-08-30 12:23:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:24:42 - pico-train - INFO - Step 42600 -- 🔄 Training Metrics
2025-08-30 12:24:42 - pico-train - INFO - ├── Loss: 4.9314
2025-08-30 12:24:42 - pico-train - INFO - ├── Learning Rate: 1.27e-04
2025-08-30 12:24:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:25:54 - pico-train - INFO - Step 42700 -- 🔄 Training Metrics
2025-08-30 12:25:54 - pico-train - INFO - ├── Loss: 4.8923
2025-08-30 12:25:54 - pico-train - INFO - ├── Learning Rate: 1.26e-04
2025-08-30 12:25:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:26:58 - pico-train - INFO - Step 42800 -- 🔄 Training Metrics
2025-08-30 12:26:58 - pico-train - INFO - ├── Loss: 4.9007
2025-08-30 12:26:58 - pico-train - INFO - ├── Learning Rate: 1.26e-04
2025-08-30 12:26:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:27:52 - pico-train - INFO - Step 42900 -- 🔄 Training Metrics
2025-08-30 12:27:52 - pico-train - INFO - ├── Loss: 4.8903
2025-08-30 12:27:52 - pico-train - INFO - ├── Learning Rate: 1.26e-04
2025-08-30 12:27:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:28:46 - pico-train - INFO - Step 43000 -- 🔄 Training Metrics
2025-08-30 12:28:46 - pico-train - INFO - ├── Loss: 4.9252
2025-08-30 12:28:46 - pico-train - INFO - ├── Learning Rate: 1.25e-04
2025-08-30 12:28:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:29:38 - pico-train - INFO - Step 43100 -- 🔄 Training Metrics
2025-08-30 12:29:38 - pico-train - INFO - ├── Loss: 4.8924
2025-08-30 12:29:38 - pico-train - INFO - ├── Learning Rate: 1.25e-04
2025-08-30 12:29:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:30:32 - pico-train - INFO - Step 43200 -- 🔄 Training Metrics
2025-08-30 12:30:32 - pico-train - INFO - ├── Loss: 4.8955
2025-08-30 12:30:32 - pico-train - INFO - ├── Learning Rate: 1.25e-04
2025-08-30 12:30:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:31:26 - pico-train - INFO - Step 43300 -- 🔄 Training Metrics
2025-08-30 12:31:26 - pico-train - INFO - ├── Loss: 4.8651
2025-08-30 12:31:26 - pico-train - INFO - ├── Learning Rate: 1.24e-04
2025-08-30 12:31:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:32:20 - pico-train - INFO - Step 43400 -- 🔄 Training Metrics
2025-08-30 12:32:20 - pico-train - INFO - ├── Loss: 4.9018
2025-08-30 12:32:20 - pico-train - INFO - ├── Learning Rate: 1.24e-04
2025-08-30 12:32:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:33:13 - pico-train - INFO - Step 43500 -- 🔄 Training Metrics
2025-08-30 12:33:13 - pico-train - INFO - ├── Loss: 4.9001
2025-08-30 12:33:13 - pico-train - INFO - ├── Learning Rate: 1.24e-04
2025-08-30 12:33:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:34:07 - pico-train - INFO - Step 43600 -- 🔄 Training Metrics
2025-08-30 12:34:07 - pico-train - INFO - ├── Loss: 4.8980
2025-08-30 12:34:07 - pico-train - INFO - ├── Learning Rate: 1.24e-04
2025-08-30 12:34:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:35:01 - pico-train - INFO - Step 43700 -- 🔄 Training Metrics
2025-08-30 12:35:01 - pico-train - INFO - ├── Loss: 4.9205
2025-08-30 12:35:01 - pico-train - INFO - ├── Learning Rate: 1.23e-04
2025-08-30 12:35:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:35:55 - pico-train - INFO - Step 43800 -- 🔄 Training Metrics
2025-08-30 12:35:55 - pico-train - INFO - ├── Loss: 4.8935
2025-08-30 12:35:55 - pico-train - INFO - ├── Learning Rate: 1.23e-04
2025-08-30 12:35:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:36:48 - pico-train - INFO - Step 43900 -- 🔄 Training Metrics
2025-08-30 12:36:48 - pico-train - INFO - ├── Loss: 4.8905
2025-08-30 12:36:48 - pico-train - INFO - ├── Learning Rate: 1.23e-04
2025-08-30 12:36:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:37:41 - pico-train - INFO - Step 44000 -- 💾 Saving Checkpoint
2025-08-30 12:39:45 - pico-train - INFO - Step 44000 -- 📊 Evaluation Results
2025-08-30 12:39:45 - pico-train - INFO - └── paloma: inf
2025-08-30 12:39:47 - pico-train - INFO - Step 44000 -- 🔄 Training Metrics
2025-08-30 12:39:47 - pico-train - INFO - ├── Loss: 4.9013
2025-08-30 12:39:47 - pico-train - INFO - ├── Learning Rate: 1.22e-04
2025-08-30 12:39:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:39:47 - pico-train - INFO - Step 44000 -- 📈 Saving Learning Dynamics
2025-08-30 12:40:45 - pico-train - INFO - Step 44100 -- 🔄 Training Metrics
2025-08-30 12:40:45 - pico-train - INFO - ├── Loss: 4.8915
2025-08-30 12:40:45 - pico-train - INFO - ├── Learning Rate: 1.22e-04
2025-08-30 12:40:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:41:39 - pico-train - INFO - Step 44200 -- 🔄 Training Metrics
2025-08-30 12:41:39 - pico-train - INFO - ├── Loss: 4.8805
2025-08-30 12:41:39 - pico-train - INFO - ├── Learning Rate: 1.22e-04
2025-08-30 12:41:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:42:32 - pico-train - INFO - Step 44300 -- 🔄 Training Metrics
2025-08-30 12:42:32 - pico-train - INFO - ├── Loss: 4.8928
2025-08-30 12:42:32 - pico-train - INFO - ├── Learning Rate: 1.21e-04
2025-08-30 12:42:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:43:26 - pico-train - INFO - Step 44400 -- 🔄 Training Metrics
2025-08-30 12:43:26 - pico-train - INFO - ├── Loss: 4.8799
2025-08-30 12:43:26 - pico-train - INFO - ├── Learning Rate: 1.21e-04
2025-08-30 12:43:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:44:19 - pico-train - INFO - Step 44500 -- 🔄 Training Metrics
2025-08-30 12:44:19 - pico-train - INFO - ├── Loss: 4.9167
2025-08-30 12:44:19 - pico-train - INFO - ├── Learning Rate: 1.21e-04
2025-08-30 12:44:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:45:12 - pico-train - INFO - Step 44600 -- 🔄 Training Metrics
2025-08-30 12:45:12 - pico-train - INFO - ├── Loss: 4.8424
2025-08-30 12:45:12 - pico-train - INFO - ├── Learning Rate: 1.20e-04
2025-08-30 12:45:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:46:05 - pico-train - INFO - Step 44700 -- 🔄 Training Metrics
2025-08-30 12:46:05 - pico-train - INFO - ├── Loss: 4.8779
2025-08-30 12:46:05 - pico-train - INFO - ├── Learning Rate: 1.20e-04
2025-08-30 12:46:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:47:00 - pico-train - INFO - Step 44800 -- 🔄 Training Metrics
2025-08-30 12:47:00 - pico-train - INFO - ├── Loss: 4.9088
2025-08-30 12:47:00 - pico-train - INFO - ├── Learning Rate: 1.20e-04
2025-08-30 12:47:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:47:52 - pico-train - INFO - Step 44900 -- 🔄 Training Metrics
2025-08-30 12:47:52 - pico-train - INFO - ├── Loss: 4.9030
2025-08-30 12:47:52 - pico-train - INFO - ├── Learning Rate: 1.19e-04
2025-08-30 12:47:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:48:44 - pico-train - INFO - Step 45000 -- 🔄 Training Metrics
2025-08-30 12:48:44 - pico-train - INFO - ├── Loss: 4.8993
2025-08-30 12:48:44 - pico-train - INFO - ├── Learning Rate: 1.19e-04
2025-08-30 12:48:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:49:37 - pico-train - INFO - Step 45100 -- 🔄 Training Metrics
2025-08-30 12:49:37 - pico-train - INFO - ├── Loss: 4.8970
2025-08-30 12:49:37 - pico-train - INFO - ├── Learning Rate: 1.19e-04
2025-08-30 12:49:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:50:30 - pico-train - INFO - Step 45200 -- 🔄 Training Metrics
2025-08-30 12:50:30 - pico-train - INFO - ├── Loss: 4.8831
2025-08-30 12:50:30 - pico-train - INFO - ├── Learning Rate: 1.18e-04
2025-08-30 12:50:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:51:23 - pico-train - INFO - Step 45300 -- 🔄 Training Metrics
2025-08-30 12:51:23 - pico-train - INFO - ├── Loss: 4.8729
2025-08-30 12:51:23 - pico-train - INFO - ├── Learning Rate: 1.18e-04
2025-08-30 12:51:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:52:16 - pico-train - INFO - Step 45400 -- 🔄 Training Metrics
2025-08-30 12:52:16 - pico-train - INFO - ├── Loss: 4.8859
2025-08-30 12:52:16 - pico-train - INFO - ├── Learning Rate: 1.18e-04
2025-08-30 12:52:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:53:09 - pico-train - INFO - Step 45500 -- 🔄 Training Metrics
2025-08-30 12:53:09 - pico-train - INFO - ├── Loss: 4.9129
2025-08-30 12:53:09 - pico-train - INFO - ├── Learning Rate: 1.18e-04
2025-08-30 12:53:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:54:03 - pico-train - INFO - Step 45600 -- 🔄 Training Metrics
2025-08-30 12:54:03 - pico-train - INFO - ├── Loss: 4.8550
2025-08-30 12:54:03 - pico-train - INFO - ├── Learning Rate: 1.17e-04
2025-08-30 12:54:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:54:56 - pico-train - INFO - Step 45700 -- 🔄 Training Metrics
2025-08-30 12:54:56 - pico-train - INFO - ├── Loss: 4.8901
2025-08-30 12:54:56 - pico-train - INFO - ├── Learning Rate: 1.17e-04
2025-08-30 12:54:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:55:50 - pico-train - INFO - Step 45800 -- 🔄 Training Metrics
2025-08-30 12:55:50 - pico-train - INFO - ├── Loss: 4.8900
2025-08-30 12:55:50 - pico-train - INFO - ├── Learning Rate: 1.17e-04
2025-08-30 12:55:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:56:41 - pico-train - INFO - Step 45900 -- 🔄 Training Metrics
2025-08-30 12:56:41 - pico-train - INFO - ├── Loss: 4.8725
2025-08-30 12:56:41 - pico-train - INFO - ├── Learning Rate: 1.16e-04
2025-08-30 12:56:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:57:33 - pico-train - INFO - Step 46000 -- 💾 Saving Checkpoint
2025-08-30 12:59:41 - pico-train - INFO - Step 46000 -- 📊 Evaluation Results
2025-08-30 12:59:41 - pico-train - INFO - └── paloma: inf
2025-08-30 12:59:43 - pico-train - INFO - Step 46000 -- 🔄 Training Metrics
2025-08-30 12:59:43 - pico-train - INFO - ├── Loss: 4.8772
2025-08-30 12:59:43 - pico-train - INFO - ├── Learning Rate: 1.16e-04
2025-08-30 12:59:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:59:43 - pico-train - INFO - Step 46000 -- 📈 Saving Learning Dynamics
2025-08-30 13:00:43 - pico-train - INFO - Step 46100 -- 🔄 Training Metrics
2025-08-30 13:00:43 - pico-train - INFO - ├── Loss: 4.8649
2025-08-30 13:00:43 - pico-train - INFO - ├── Learning Rate: 1.16e-04
2025-08-30 13:00:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:01:35 - pico-train - INFO - Step 46200 -- 🔄 Training Metrics
2025-08-30 13:01:35 - pico-train - INFO - ├── Loss: 4.8980
2025-08-30 13:01:35 - pico-train - INFO - ├── Learning Rate: 1.15e-04
2025-08-30 13:01:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:02:29 - pico-train - INFO - Step 46300 -- 🔄 Training Metrics
2025-08-30 13:02:29 - pico-train - INFO - ├── Loss: 4.8867
2025-08-30 13:02:29 - pico-train - INFO - ├── Learning Rate: 1.15e-04
2025-08-30 13:02:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:03:22 - pico-train - INFO - Step 46400 -- 🔄 Training Metrics
2025-08-30 13:03:22 - pico-train - INFO - ├── Loss: 4.8807
2025-08-30 13:03:22 - pico-train - INFO - ├── Learning Rate: 1.15e-04
2025-08-30 13:03:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:04:15 - pico-train - INFO - Step 46500 -- 🔄 Training Metrics
2025-08-30 13:04:15 - pico-train - INFO - ├── Loss: 4.8779
2025-08-30 13:04:15 - pico-train - INFO - ├── Learning Rate: 1.14e-04
2025-08-30 13:04:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:05:09 - pico-train - INFO - Step 46600 -- 🔄 Training Metrics
2025-08-30 13:05:09 - pico-train - INFO - ├── Loss: 4.8908
2025-08-30 13:05:09 - pico-train - INFO - ├── Learning Rate: 1.14e-04
2025-08-30 13:05:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:06:01 - pico-train - INFO - Step 46700 -- 🔄 Training Metrics
2025-08-30 13:06:01 - pico-train - INFO - ├── Loss: 4.8882
2025-08-30 13:06:01 - pico-train - INFO - ├── Learning Rate: 1.14e-04
2025-08-30 13:06:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:06:55 - pico-train - INFO - Step 46800 -- 🔄 Training Metrics
2025-08-30 13:06:55 - pico-train - INFO - ├── Loss: 4.8877
2025-08-30 13:06:55 - pico-train - INFO - ├── Learning Rate: 1.13e-04
2025-08-30 13:06:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:07:49 - pico-train - INFO - Step 46900 -- 🔄 Training Metrics
2025-08-30 13:07:49 - pico-train - INFO - ├── Loss: 4.8686
2025-08-30 13:07:49 - pico-train - INFO - ├── Learning Rate: 1.13e-04
2025-08-30 13:07:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:08:42 - pico-train - INFO - Step 47000 -- 🔄 Training Metrics
2025-08-30 13:08:42 - pico-train - INFO - ├── Loss: 4.8701
2025-08-30 13:08:42 - pico-train - INFO - ├── Learning Rate: 1.13e-04
2025-08-30 13:08:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:09:35 - pico-train - INFO - Step 47100 -- 🔄 Training Metrics
2025-08-30 13:09:35 - pico-train - INFO - ├── Loss: 4.8670
2025-08-30 13:09:35 - pico-train - INFO - ├── Learning Rate: 1.12e-04
2025-08-30 13:09:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:10:28 - pico-train - INFO - Step 47200 -- 🔄 Training Metrics
2025-08-30 13:10:28 - pico-train - INFO - ├── Loss: 4.8849
2025-08-30 13:10:28 - pico-train - INFO - ├── Learning Rate: 1.12e-04
2025-08-30 13:10:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:11:22 - pico-train - INFO - Step 47300 -- 🔄 Training Metrics
2025-08-30 13:11:22 - pico-train - INFO - ├── Loss: 4.8665
2025-08-30 13:11:22 - pico-train - INFO - ├── Learning Rate: 1.12e-04
2025-08-30 13:11:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:12:16 - pico-train - INFO - Step 47400 -- 🔄 Training Metrics
2025-08-30 13:12:16 - pico-train - INFO - ├── Loss: 4.8595
2025-08-30 13:12:16 - pico-train - INFO - ├── Learning Rate: 1.12e-04
2025-08-30 13:12:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:13:08 - pico-train - INFO - Step 47500 -- 🔄 Training Metrics
2025-08-30 13:13:08 - pico-train - INFO - ├── Loss: 4.8680
2025-08-30 13:13:08 - pico-train - INFO - ├── Learning Rate: 1.11e-04
2025-08-30 13:13:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:14:00 - pico-train - INFO - Step 47600 -- 🔄 Training Metrics
2025-08-30 13:14:00 - pico-train - INFO - ├── Loss: 4.8867
2025-08-30 13:14:00 - pico-train - INFO - ├── Learning Rate: 1.11e-04
2025-08-30 13:14:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:14:52 - pico-train - INFO - Step 47700 -- 🔄 Training Metrics
2025-08-30 13:14:52 - pico-train - INFO - ├── Loss: 4.8761
2025-08-30 13:14:52 - pico-train - INFO - ├── Learning Rate: 1.11e-04
2025-08-30 13:14:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:15:44 - pico-train - INFO - Step 47800 -- 🔄 Training Metrics
2025-08-30 13:15:44 - pico-train - INFO - ├── Loss: 4.8965
2025-08-30 13:15:44 - pico-train - INFO - ├── Learning Rate: 1.10e-04
2025-08-30 13:15:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:16:36 - pico-train - INFO - Step 47900 -- 🔄 Training Metrics
2025-08-30 13:16:36 - pico-train - INFO - ├── Loss: 4.8890
2025-08-30 13:16:36 - pico-train - INFO - ├── Learning Rate: 1.10e-04
2025-08-30 13:16:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:17:28 - pico-train - INFO - Step 48000 -- 💾 Saving Checkpoint
2025-08-30 13:19:28 - pico-train - INFO - Step 48000 -- 📊 Evaluation Results
2025-08-30 13:19:28 - pico-train - INFO - └── paloma: inf
2025-08-30 13:19:29 - pico-train - INFO - Step 48000 -- 🔄 Training Metrics
2025-08-30 13:19:29 - pico-train - INFO - ├── Loss: 4.8801
2025-08-30 13:19:29 - pico-train - INFO - ├── Learning Rate: 1.10e-04
2025-08-30 13:19:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:19:29 - pico-train - INFO - Step 48000 -- 📈 Saving Learning Dynamics
2025-08-30 13:20:26 - pico-train - INFO - Step 48100 -- 🔄 Training Metrics
2025-08-30 13:20:26 - pico-train - INFO - ├── Loss: 4.8751
2025-08-30 13:20:26 - pico-train - INFO - ├── Learning Rate: 1.09e-04
2025-08-30 13:20:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:21:18 - pico-train - INFO - Step 48200 -- 🔄 Training Metrics
2025-08-30 13:21:18 - pico-train - INFO - ├── Loss: 4.8703
2025-08-30 13:21:18 - pico-train - INFO - ├── Learning Rate: 1.09e-04
2025-08-30 13:21:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:22:10 - pico-train - INFO - Step 48300 -- 🔄 Training Metrics
2025-08-30 13:22:10 - pico-train - INFO - ├── Loss: 4.8749
2025-08-30 13:22:10 - pico-train - INFO - ├── Learning Rate: 1.09e-04
2025-08-30 13:22:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:23:03 - pico-train - INFO - Step 48400 -- 🔄 Training Metrics
2025-08-30 13:23:03 - pico-train - INFO - ├── Loss: 4.8782
2025-08-30 13:23:03 - pico-train - INFO - ├── Learning Rate: 1.08e-04
2025-08-30 13:23:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:23:55 - pico-train - INFO - Step 48500 -- 🔄 Training Metrics
2025-08-30 13:23:55 - pico-train - INFO - ├── Loss: 4.8676
2025-08-30 13:23:55 - pico-train - INFO - ├── Learning Rate: 1.08e-04
2025-08-30 13:23:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:24:46 - pico-train - INFO - Step 48600 -- 🔄 Training Metrics
2025-08-30 13:24:46 - pico-train - INFO - ├── Loss: 4.8610
2025-08-30 13:24:46 - pico-train - INFO - ├── Learning Rate: 1.08e-04
2025-08-30 13:24:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:25:40 - pico-train - INFO - Step 48700 -- 🔄 Training Metrics
2025-08-30 13:25:40 - pico-train - INFO - ├── Loss: 4.8672
2025-08-30 13:25:40 - pico-train - INFO - ├── Learning Rate: 1.07e-04
2025-08-30 13:25:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:26:32 - pico-train - INFO - Step 48800 -- 🔄 Training Metrics
2025-08-30 13:26:32 - pico-train - INFO - ├── Loss: 4.8771
2025-08-30 13:26:32 - pico-train - INFO - ├── Learning Rate: 1.07e-04
2025-08-30 13:26:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:27:24 - pico-train - INFO - Step 48900 -- 🔄 Training Metrics
2025-08-30 13:27:24 - pico-train - INFO - ├── Loss: 4.8739
2025-08-30 13:27:24 - pico-train - INFO - ├── Learning Rate: 1.07e-04
2025-08-30 13:27:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:28:16 - pico-train - INFO - Step 49000 -- 🔄 Training Metrics
2025-08-30 13:28:16 - pico-train - INFO - ├── Loss: 4.8754
2025-08-30 13:28:16 - pico-train - INFO - ├── Learning Rate: 1.06e-04
2025-08-30 13:28:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:29:08 - pico-train - INFO - Step 49100 -- 🔄 Training Metrics
2025-08-30 13:29:08 - pico-train - INFO - ├── Loss: 4.8521
2025-08-30 13:29:08 - pico-train - INFO - ├── Learning Rate: 1.06e-04
2025-08-30 13:29:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:30:01 - pico-train - INFO - Step 49200 -- 🔄 Training Metrics
2025-08-30 13:30:01 - pico-train - INFO - ├── Loss: 4.8698
2025-08-30 13:30:01 - pico-train - INFO - ├── Learning Rate: 1.06e-04
2025-08-30 13:30:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:30:53 - pico-train - INFO - Step 49300 -- 🔄 Training Metrics
2025-08-30 13:30:53 - pico-train - INFO - ├── Loss: 4.8818
2025-08-30 13:30:53 - pico-train - INFO - ├── Learning Rate: 1.05e-04
2025-08-30 13:30:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:31:45 - pico-train - INFO - Step 49400 -- 🔄 Training Metrics
2025-08-30 13:31:45 - pico-train - INFO - ├── Loss: 4.8477
2025-08-30 13:31:45 - pico-train - INFO - ├── Learning Rate: 1.05e-04
2025-08-30 13:31:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:32:37 - pico-train - INFO - Step 49500 -- 🔄 Training Metrics
2025-08-30 13:32:37 - pico-train - INFO - ├── Loss: 4.8804
2025-08-30 13:32:37 - pico-train - INFO - ├── Learning Rate: 1.05e-04
2025-08-30 13:32:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:33:29 - pico-train - INFO - Step 49600 -- 🔄 Training Metrics
2025-08-30 13:33:29 - pico-train - INFO - ├── Loss: 4.8608
2025-08-30 13:33:29 - pico-train - INFO - ├── Learning Rate: 1.04e-04
2025-08-30 13:33:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:34:21 - pico-train - INFO - Step 49700 -- 🔄 Training Metrics
2025-08-30 13:34:21 - pico-train - INFO - ├── Loss: 4.8419
2025-08-30 13:34:21 - pico-train - INFO - ├── Learning Rate: 1.04e-04
2025-08-30 13:34:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:35:14 - pico-train - INFO - Step 49800 -- 🔄 Training Metrics
2025-08-30 13:35:14 - pico-train - INFO - ├── Loss: 4.8767
2025-08-30 13:35:14 - pico-train - INFO - ├── Learning Rate: 1.04e-04
2025-08-30 13:35:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:36:07 - pico-train - INFO - Step 49900 -- 🔄 Training Metrics
2025-08-30 13:36:07 - pico-train - INFO - ├── Loss: 4.8593
2025-08-30 13:36:07 - pico-train - INFO - ├── Learning Rate: 1.04e-04
2025-08-30 13:36:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:36:59 - pico-train - INFO - Step 50000 -- 💾 Saving Checkpoint
2025-08-30 13:39:13 - pico-train - INFO - Step 50000 -- 📊 Evaluation Results
2025-08-30 13:39:13 - pico-train - INFO - └── paloma: inf
2025-08-30 13:39:14 - pico-train - INFO - Step 50000 -- 🔄 Training Metrics
2025-08-30 13:39:14 - pico-train - INFO - ├── Loss: 4.8786
2025-08-30 13:39:14 - pico-train - INFO - ├── Learning Rate: 1.03e-04
2025-08-30 13:39:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:39:14 - pico-train - INFO - Step 50000 -- 📈 Saving Learning Dynamics
2025-08-30 13:40:12 - pico-train - INFO - Step 50100 -- 🔄 Training Metrics
2025-08-30 13:40:12 - pico-train - INFO - ├── Loss: 4.8819
2025-08-30 13:40:12 - pico-train - INFO - ├── Learning Rate: 1.03e-04
2025-08-30 13:40:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:41:06 - pico-train - INFO - Step 50200 -- 🔄 Training Metrics
2025-08-30 13:41:06 - pico-train - INFO - ├── Loss: 4.8675
2025-08-30 13:41:06 - pico-train - INFO - ├── Learning Rate: 1.03e-04
2025-08-30 13:41:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:41:59 - pico-train - INFO - Step 50300 -- 🔄 Training Metrics
2025-08-30 13:41:59 - pico-train - INFO - ├── Loss: 4.8522
2025-08-30 13:41:59 - pico-train - INFO - ├── Learning Rate: 1.02e-04
2025-08-30 13:41:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:42:52 - pico-train - INFO - Step 50400 -- 🔄 Training Metrics
2025-08-30 13:42:52 - pico-train - INFO - ├── Loss: 4.8488
2025-08-30 13:42:52 - pico-train - INFO - ├── Learning Rate: 1.02e-04
2025-08-30 13:42:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:43:46 - pico-train - INFO - Step 50500 -- 🔄 Training Metrics
2025-08-30 13:43:46 - pico-train - INFO - ├── Loss: 4.8619
2025-08-30 13:43:46 - pico-train - INFO - ├── Learning Rate: 1.02e-04
2025-08-30 13:43:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:44:39 - pico-train - INFO - Step 50600 -- 🔄 Training Metrics
2025-08-30 13:44:39 - pico-train - INFO - ├── Loss: 4.8661
2025-08-30 13:44:39 - pico-train - INFO - ├── Learning Rate: 1.01e-04
2025-08-30 13:44:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:45:33 - pico-train - INFO - Step 50700 -- 🔄 Training Metrics
2025-08-30 13:45:33 - pico-train - INFO - ├── Loss: 4.8595
2025-08-30 13:45:33 - pico-train - INFO - ├── Learning Rate: 1.01e-04
2025-08-30 13:45:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:46:27 - pico-train - INFO - Step 50800 -- 🔄 Training Metrics
2025-08-30 13:46:27 - pico-train - INFO - ├── Loss: 4.8543
2025-08-30 13:46:27 - pico-train - INFO - ├── Learning Rate: 1.01e-04
2025-08-30 13:46:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:47:19 - pico-train - INFO - Step 50900 -- 🔄 Training Metrics
2025-08-30 13:47:19 - pico-train - INFO - ├── Loss: 4.8711
2025-08-30 13:47:19 - pico-train - INFO - ├── Learning Rate: 1.00e-04
2025-08-30 13:47:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:48:13 - pico-train - INFO - Step 51000 -- 🔄 Training Metrics
2025-08-30 13:48:13 - pico-train - INFO - ├── Loss: 4.8639
2025-08-30 13:48:13 - pico-train - INFO - ├── Learning Rate: 1.00e-04
2025-08-30 13:48:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:49:06 - pico-train - INFO - Step 51100 -- 🔄 Training Metrics
2025-08-30 13:49:06 - pico-train - INFO - ├── Loss: 4.8527
2025-08-30 13:49:06 - pico-train - INFO - ├── Learning Rate: 9.97e-05
2025-08-30 13:49:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:50:01 - pico-train - INFO - Step 51200 -- 🔄 Training Metrics
2025-08-30 13:50:01 - pico-train - INFO - ├── Loss: 4.8604
2025-08-30 13:50:01 - pico-train - INFO - ├── Learning Rate: 9.94e-05
2025-08-30 13:50:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:50:54 - pico-train - INFO - Step 51300 -- 🔄 Training Metrics
2025-08-30 13:50:54 - pico-train - INFO - ├── Loss: 4.8577
2025-08-30 13:50:54 - pico-train - INFO - ├── Learning Rate: 9.90e-05
2025-08-30 13:50:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:51:47 - pico-train - INFO - Step 51400 -- 🔄 Training Metrics
2025-08-30 13:51:47 - pico-train - INFO - ├── Loss: 4.8589
2025-08-30 13:51:47 - pico-train - INFO - ├── Learning Rate: 9.87e-05
2025-08-30 13:51:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:52:40 - pico-train - INFO - Step 51500 -- 🔄 Training Metrics
2025-08-30 13:52:40 - pico-train - INFO - ├── Loss: 4.8714
2025-08-30 13:52:40 - pico-train - INFO - ├── Learning Rate: 9.84e-05
2025-08-30 13:52:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:53:34 - pico-train - INFO - Step 51600 -- 🔄 Training Metrics
2025-08-30 13:53:34 - pico-train - INFO - ├── Loss: 4.8822
2025-08-30 13:53:34 - pico-train - INFO - ├── Learning Rate: 9.81e-05
2025-08-30 13:53:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:54:27 - pico-train - INFO - Step 51700 -- 🔄 Training Metrics
2025-08-30 13:54:27 - pico-train - INFO - ├── Loss: 4.8642
2025-08-30 13:54:27 - pico-train - INFO - ├── Learning Rate: 9.78e-05
2025-08-30 13:54:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:55:20 - pico-train - INFO - Step 51800 -- 🔄 Training Metrics
2025-08-30 13:55:20 - pico-train - INFO - ├── Loss: 4.8631
2025-08-30 13:55:20 - pico-train - INFO - ├── Learning Rate: 9.74e-05
2025-08-30 13:55:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:56:13 - pico-train - INFO - Step 51900 -- 🔄 Training Metrics
2025-08-30 13:56:13 - pico-train - INFO - ├── Loss: 4.8383
2025-08-30 13:56:13 - pico-train - INFO - ├── Learning Rate: 9.71e-05
2025-08-30 13:56:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:57:06 - pico-train - INFO - Step 52000 -- 💾 Saving Checkpoint
2025-08-30 13:59:17 - pico-train - INFO - Step 52000 -- 📊 Evaluation Results
2025-08-30 13:59:17 - pico-train - INFO - └── paloma: inf
2025-08-30 13:59:18 - pico-train - INFO - Step 52000 -- 🔄 Training Metrics
2025-08-30 13:59:18 - pico-train - INFO - ├── Loss: 4.8496
2025-08-30 13:59:18 - pico-train - INFO - ├── Learning Rate: 9.68e-05
2025-08-30 13:59:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 13:59:18 - pico-train - INFO - Step 52000 -- 📈 Saving Learning Dynamics
2025-08-30 14:00:16 - pico-train - INFO - Step 52100 -- 🔄 Training Metrics
2025-08-30 14:00:16 - pico-train - INFO - ├── Loss: 4.8515
2025-08-30 14:00:16 - pico-train - INFO - ├── Learning Rate: 9.65e-05
2025-08-30 14:00:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:01:10 - pico-train - INFO - Step 52200 -- 🔄 Training Metrics
2025-08-30 14:01:10 - pico-train - INFO - ├── Loss: 4.8598
2025-08-30 14:01:10 - pico-train - INFO - ├── Learning Rate: 9.62e-05
2025-08-30 14:01:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:02:03 - pico-train - INFO - Step 52300 -- 🔄 Training Metrics
2025-08-30 14:02:03 - pico-train - INFO - ├── Loss: 4.8696
2025-08-30 14:02:03 - pico-train - INFO - ├── Learning Rate: 9.58e-05
2025-08-30 14:02:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:02:56 - pico-train - INFO - Step 52400 -- 🔄 Training Metrics
2025-08-30 14:02:56 - pico-train - INFO - ├── Loss: 4.8359
2025-08-30 14:02:56 - pico-train - INFO - ├── Learning Rate: 9.55e-05
2025-08-30 14:02:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:03:51 - pico-train - INFO - Step 52500 -- 🔄 Training Metrics
2025-08-30 14:03:51 - pico-train - INFO - ├── Loss: 4.8444
2025-08-30 14:03:51 - pico-train - INFO - ├── Learning Rate: 9.52e-05
2025-08-30 14:03:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:04:44 - pico-train - INFO - Step 52600 -- 🔄 Training Metrics
2025-08-30 14:04:44 - pico-train - INFO - ├── Loss: 4.8626
2025-08-30 14:04:44 - pico-train - INFO - ├── Learning Rate: 9.49e-05
2025-08-30 14:04:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:05:37 - pico-train - INFO - Step 52700 -- 🔄 Training Metrics
2025-08-30 14:05:37 - pico-train - INFO - ├── Loss: 4.8555
2025-08-30 14:05:37 - pico-train - INFO - ├── Learning Rate: 9.46e-05
2025-08-30 14:05:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:06:31 - pico-train - INFO - Step 52800 -- 🔄 Training Metrics
2025-08-30 14:06:31 - pico-train - INFO - ├── Loss: 4.8361
2025-08-30 14:06:31 - pico-train - INFO - ├── Learning Rate: 9.42e-05
2025-08-30 14:06:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:07:24 - pico-train - INFO - Step 52900 -- 🔄 Training Metrics
2025-08-30 14:07:24 - pico-train - INFO - ├── Loss: 4.8518
2025-08-30 14:07:24 - pico-train - INFO - ├── Learning Rate: 9.39e-05
2025-08-30 14:07:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:08:17 - pico-train - INFO - Step 53000 -- 🔄 Training Metrics
2025-08-30 14:08:17 - pico-train - INFO - ├── Loss: 4.8508
2025-08-30 14:08:17 - pico-train - INFO - ├── Learning Rate: 9.36e-05
2025-08-30 14:08:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:09:10 - pico-train - INFO - Step 53100 -- 🔄 Training Metrics
2025-08-30 14:09:10 - pico-train - INFO - ├── Loss: 4.8585
2025-08-30 14:09:10 - pico-train - INFO - ├── Learning Rate: 9.33e-05
2025-08-30 14:09:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:10:03 - pico-train - INFO - Step 53200 -- 🔄 Training Metrics
2025-08-30 14:10:03 - pico-train - INFO - ├── Loss: 4.8520
2025-08-30 14:10:03 - pico-train - INFO - ├── Learning Rate: 9.30e-05
2025-08-30 14:10:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:10:57 - pico-train - INFO - Step 53300 -- 🔄 Training Metrics
2025-08-30 14:10:57 - pico-train - INFO - ├── Loss: 4.8462
2025-08-30 14:10:57 - pico-train - INFO - ├── Learning Rate: 9.26e-05
2025-08-30 14:10:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:11:50 - pico-train - INFO - Step 53400 -- 🔄 Training Metrics
2025-08-30 14:11:50 - pico-train - INFO - ├── Loss: 4.8443
2025-08-30 14:11:50 - pico-train - INFO - ├── Learning Rate: 9.23e-05
2025-08-30 14:11:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:12:44 - pico-train - INFO - Step 53500 -- 🔄 Training Metrics
2025-08-30 14:12:44 - pico-train - INFO - ├── Loss: 4.8567
2025-08-30 14:12:44 - pico-train - INFO - ├── Learning Rate: 9.20e-05
2025-08-30 14:12:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:13:36 - pico-train - INFO - Step 53600 -- 🔄 Training Metrics
2025-08-30 14:13:36 - pico-train - INFO - ├── Loss: 4.8256
2025-08-30 14:13:36 - pico-train - INFO - ├── Learning Rate: 9.17e-05
2025-08-30 14:13:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:14:30 - pico-train - INFO - Step 53700 -- 🔄 Training Metrics
2025-08-30 14:14:30 - pico-train - INFO - ├── Loss: 4.8237
2025-08-30 14:14:30 - pico-train - INFO - ├── Learning Rate: 9.14e-05
2025-08-30 14:14:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:15:24 - pico-train - INFO - Step 53800 -- 🔄 Training Metrics
2025-08-30 14:15:24 - pico-train - INFO - ├── Loss: 4.8332
2025-08-30 14:15:24 - pico-train - INFO - ├── Learning Rate: 9.10e-05
2025-08-30 14:15:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:16:17 - pico-train - INFO - Step 53900 -- 🔄 Training Metrics
2025-08-30 14:16:17 - pico-train - INFO - ├── Loss: 4.8663
2025-08-30 14:16:17 - pico-train - INFO - ├── Learning Rate: 9.07e-05
2025-08-30 14:16:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:17:08 - pico-train - INFO - Step 54000 -- 💾 Saving Checkpoint
2025-08-30 14:19:06 - pico-train - INFO - Step 54000 -- 📊 Evaluation Results
2025-08-30 14:19:06 - pico-train - INFO - └── paloma: inf
2025-08-30 14:19:08 - pico-train - INFO - Step 54000 -- 🔄 Training Metrics
2025-08-30 14:19:08 - pico-train - INFO - ├── Loss: 4.8621
2025-08-30 14:19:08 - pico-train - INFO - ├── Learning Rate: 9.04e-05
2025-08-30 14:19:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:19:08 - pico-train - INFO - Step 54000 -- 📈 Saving Learning Dynamics
2025-08-30 14:20:05 - pico-train - INFO - Step 54100 -- 🔄 Training Metrics
2025-08-30 14:20:05 - pico-train - INFO - ├── Loss: 4.8374
2025-08-30 14:20:05 - pico-train - INFO - ├── Learning Rate: 9.01e-05
2025-08-30 14:20:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:20:59 - pico-train - INFO - Step 54200 -- 🔄 Training Metrics
2025-08-30 14:20:59 - pico-train - INFO - ├── Loss: 4.8494
2025-08-30 14:20:59 - pico-train - INFO - ├── Learning Rate: 8.98e-05
2025-08-30 14:20:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:21:52 - pico-train - INFO - Step 54300 -- 🔄 Training Metrics
2025-08-30 14:21:52 - pico-train - INFO - ├── Loss: 4.8265
2025-08-30 14:21:52 - pico-train - INFO - ├── Learning Rate: 8.94e-05
2025-08-30 14:21:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:22:45 - pico-train - INFO - Step 54400 -- 🔄 Training Metrics
2025-08-30 14:22:45 - pico-train - INFO - ├── Loss: 4.8626
2025-08-30 14:22:45 - pico-train - INFO - ├── Learning Rate: 8.91e-05
2025-08-30 14:22:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:23:38 - pico-train - INFO - Step 54500 -- 🔄 Training Metrics
2025-08-30 14:23:38 - pico-train - INFO - ├── Loss: 4.8459
2025-08-30 14:23:38 - pico-train - INFO - ├── Learning Rate: 8.88e-05
2025-08-30 14:23:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:24:32 - pico-train - INFO - Step 54600 -- 🔄 Training Metrics
2025-08-30 14:24:32 - pico-train - INFO - ├── Loss: 4.8332
2025-08-30 14:24:32 - pico-train - INFO - ├── Learning Rate: 8.85e-05
2025-08-30 14:24:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:25:25 - pico-train - INFO - Step 54700 -- 🔄 Training Metrics
2025-08-30 14:25:25 - pico-train - INFO - ├── Loss: 4.8507
2025-08-30 14:25:25 - pico-train - INFO - ├── Learning Rate: 8.82e-05
2025-08-30 14:25:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:26:19 - pico-train - INFO - Step 54800 -- 🔄 Training Metrics
2025-08-30 14:26:19 - pico-train - INFO - ├── Loss: 4.8588
2025-08-30 14:26:19 - pico-train - INFO - ├── Learning Rate: 8.78e-05
2025-08-30 14:26:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:27:12 - pico-train - INFO - Step 54900 -- 🔄 Training Metrics
2025-08-30 14:27:12 - pico-train - INFO - ├── Loss: 4.8781
2025-08-30 14:27:12 - pico-train - INFO - ├── Learning Rate: 8.75e-05
2025-08-30 14:27:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:28:05 - pico-train - INFO - Step 55000 -- 🔄 Training Metrics
2025-08-30 14:28:05 - pico-train - INFO - ├── Loss: 4.8365
2025-08-30 14:28:05 - pico-train - INFO - ├── Learning Rate: 8.72e-05
2025-08-30 14:28:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:29:00 - pico-train - INFO - Step 55100 -- 🔄 Training Metrics
2025-08-30 14:29:00 - pico-train - INFO - ├── Loss: 4.8395
2025-08-30 14:29:00 - pico-train - INFO - ├── Learning Rate: 8.69e-05
2025-08-30 14:29:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:29:53 - pico-train - INFO - Step 55200 -- 🔄 Training Metrics
2025-08-30 14:29:53 - pico-train - INFO - ├── Loss: 4.8306
2025-08-30 14:29:53 - pico-train - INFO - ├── Learning Rate: 8.66e-05
2025-08-30 14:29:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:30:47 - pico-train - INFO - Step 55300 -- 🔄 Training Metrics
2025-08-30 14:30:47 - pico-train - INFO - ├── Loss: 4.8421
2025-08-30 14:30:47 - pico-train - INFO - ├── Learning Rate: 8.63e-05
2025-08-30 14:30:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:31:40 - pico-train - INFO - Step 55400 -- 🔄 Training Metrics
2025-08-30 14:31:40 - pico-train - INFO - ├── Loss: 4.8538
2025-08-30 14:31:40 - pico-train - INFO - ├── Learning Rate: 8.59e-05
2025-08-30 14:31:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:32:33 - pico-train - INFO - Step 55500 -- 🔄 Training Metrics
2025-08-30 14:32:33 - pico-train - INFO - ├── Loss: 4.8317
2025-08-30 14:32:33 - pico-train - INFO - ├── Learning Rate: 8.56e-05
2025-08-30 14:32:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:33:27 - pico-train - INFO - Step 55600 -- 🔄 Training Metrics
2025-08-30 14:33:27 - pico-train - INFO - ├── Loss: 4.8330
2025-08-30 14:33:27 - pico-train - INFO - ├── Learning Rate: 8.53e-05
2025-08-30 14:33:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:34:21 - pico-train - INFO - Step 55700 -- 🔄 Training Metrics
2025-08-30 14:34:21 - pico-train - INFO - ├── Loss: 4.8138
2025-08-30 14:34:21 - pico-train - INFO - ├── Learning Rate: 8.50e-05
2025-08-30 14:34:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:35:14 - pico-train - INFO - Step 55800 -- 🔄 Training Metrics
2025-08-30 14:35:14 - pico-train - INFO - ├── Loss: 4.8473
2025-08-30 14:35:14 - pico-train - INFO - ├── Learning Rate: 8.47e-05
2025-08-30 14:35:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:36:05 - pico-train - INFO - Step 55900 -- 🔄 Training Metrics
2025-08-30 14:36:05 - pico-train - INFO - ├── Loss: 4.8469
2025-08-30 14:36:05 - pico-train - INFO - ├── Learning Rate: 8.44e-05
2025-08-30 14:36:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:36:57 - pico-train - INFO - Step 56000 -- 💾 Saving Checkpoint
2025-08-30 14:39:13 - pico-train - INFO - Step 56000 -- 📊 Evaluation Results
2025-08-30 14:39:13 - pico-train - INFO - └── paloma: inf
2025-08-30 14:39:15 - pico-train - INFO - Step 56000 -- 🔄 Training Metrics
2025-08-30 14:39:15 - pico-train - INFO - ├── Loss: 4.8561
2025-08-30 14:39:15 - pico-train - INFO - ├── Learning Rate: 8.40e-05
2025-08-30 14:39:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:39:15 - pico-train - INFO - Step 56000 -- 📈 Saving Learning Dynamics
2025-08-30 14:40:12 - pico-train - INFO - Step 56100 -- 🔄 Training Metrics
2025-08-30 14:40:12 - pico-train - INFO - ├── Loss: 4.8491
2025-08-30 14:40:12 - pico-train - INFO - ├── Learning Rate: 8.37e-05
2025-08-30 14:40:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:41:05 - pico-train - INFO - Step 56200 -- 🔄 Training Metrics
2025-08-30 14:41:05 - pico-train - INFO - ├── Loss: 4.8459
2025-08-30 14:41:05 - pico-train - INFO - ├── Learning Rate: 8.34e-05
2025-08-30 14:41:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:41:57 - pico-train - INFO - Step 56300 -- 🔄 Training Metrics
2025-08-30 14:41:57 - pico-train - INFO - ├── Loss: 4.8345
2025-08-30 14:41:57 - pico-train - INFO - ├── Learning Rate: 8.31e-05
2025-08-30 14:41:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:42:50 - pico-train - INFO - Step 56400 -- 🔄 Training Metrics
2025-08-30 14:42:50 - pico-train - INFO - ├── Loss: 4.8510
2025-08-30 14:42:50 - pico-train - INFO - ├── Learning Rate: 8.28e-05
2025-08-30 14:42:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:43:43 - pico-train - INFO - Step 56500 -- 🔄 Training Metrics
2025-08-30 14:43:43 - pico-train - INFO - ├── Loss: 4.8172
2025-08-30 14:43:43 - pico-train - INFO - ├── Learning Rate: 8.25e-05
2025-08-30 14:43:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:44:35 - pico-train - INFO - Step 56600 -- 🔄 Training Metrics
2025-08-30 14:44:35 - pico-train - INFO - ├── Loss: 4.8297
2025-08-30 14:44:35 - pico-train - INFO - ├── Learning Rate: 8.21e-05
2025-08-30 14:44:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:45:27 - pico-train - INFO - Step 56700 -- 🔄 Training Metrics
2025-08-30 14:45:27 - pico-train - INFO - ├── Loss: 4.8399
2025-08-30 14:45:27 - pico-train - INFO - ├── Learning Rate: 8.18e-05
2025-08-30 14:45:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:46:19 - pico-train - INFO - Step 56800 -- 🔄 Training Metrics
2025-08-30 14:46:19 - pico-train - INFO - ├── Loss: 4.8370
2025-08-30 14:46:19 - pico-train - INFO - ├── Learning Rate: 8.15e-05
2025-08-30 14:46:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:47:11 - pico-train - INFO - Step 56900 -- 🔄 Training Metrics
2025-08-30 14:47:11 - pico-train - INFO - ├── Loss: 4.8458
2025-08-30 14:47:11 - pico-train - INFO - ├── Learning Rate: 8.12e-05
2025-08-30 14:47:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:48:04 - pico-train - INFO - Step 57000 -- 🔄 Training Metrics
2025-08-30 14:48:04 - pico-train - INFO - ├── Loss: 4.8466
2025-08-30 14:48:04 - pico-train - INFO - ├── Learning Rate: 8.09e-05
2025-08-30 14:48:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:48:57 - pico-train - INFO - Step 57100 -- 🔄 Training Metrics
2025-08-30 14:48:57 - pico-train - INFO - ├── Loss: 4.8173
2025-08-30 14:48:57 - pico-train - INFO - ├── Learning Rate: 8.06e-05
2025-08-30 14:48:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:49:49 - pico-train - INFO - Step 57200 -- 🔄 Training Metrics
2025-08-30 14:49:49 - pico-train - INFO - ├── Loss: 4.8302
2025-08-30 14:49:49 - pico-train - INFO - ├── Learning Rate: 8.03e-05
2025-08-30 14:49:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:50:41 - pico-train - INFO - Step 57300 -- 🔄 Training Metrics
2025-08-30 14:50:41 - pico-train - INFO - ├── Loss: 4.8262
2025-08-30 14:50:41 - pico-train - INFO - ├── Learning Rate: 7.99e-05
2025-08-30 14:50:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:51:33 - pico-train - INFO - Step 57400 -- 🔄 Training Metrics
2025-08-30 14:51:33 - pico-train - INFO - ├── Loss: 4.8268
2025-08-30 14:51:33 - pico-train - INFO - ├── Learning Rate: 7.96e-05
2025-08-30 14:51:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:52:25 - pico-train - INFO - Step 57500 -- 🔄 Training Metrics
2025-08-30 14:52:25 - pico-train - INFO - ├── Loss: 4.8415
2025-08-30 14:52:25 - pico-train - INFO - ├── Learning Rate: 7.93e-05
2025-08-30 14:52:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:53:19 - pico-train - INFO - Step 57600 -- 🔄 Training Metrics
2025-08-30 14:53:19 - pico-train - INFO - ├── Loss: 4.8350
2025-08-30 14:53:19 - pico-train - INFO - ├── Learning Rate: 7.90e-05
2025-08-30 14:53:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:54:11 - pico-train - INFO - Step 57700 -- 🔄 Training Metrics
2025-08-30 14:54:11 - pico-train - INFO - ├── Loss: 4.8597
2025-08-30 14:54:11 - pico-train - INFO - ├── Learning Rate: 7.87e-05
2025-08-30 14:54:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:55:03 - pico-train - INFO - Step 57800 -- 🔄 Training Metrics
2025-08-30 14:55:03 - pico-train - INFO - ├── Loss: 4.8310
2025-08-30 14:55:03 - pico-train - INFO - ├── Learning Rate: 7.84e-05
2025-08-30 14:55:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:55:55 - pico-train - INFO - Step 57900 -- 🔄 Training Metrics
2025-08-30 14:55:55 - pico-train - INFO - ├── Loss: 4.8333
2025-08-30 14:55:55 - pico-train - INFO - ├── Learning Rate: 7.81e-05
2025-08-30 14:55:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:56:47 - pico-train - INFO - Step 58000 -- 💾 Saving Checkpoint
2025-08-30 14:58:46 - pico-train - INFO - Step 58000 -- 📊 Evaluation Results
2025-08-30 14:58:46 - pico-train - INFO - └── paloma: inf
2025-08-30 14:58:48 - pico-train - INFO - Step 58000 -- 🔄 Training Metrics
2025-08-30 14:58:48 - pico-train - INFO - ├── Loss: 4.8290
2025-08-30 14:58:48 - pico-train - INFO - ├── Learning Rate: 7.77e-05
2025-08-30 14:58:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 14:58:48 - pico-train - INFO - Step 58000 -- 📈 Saving Learning Dynamics
2025-08-30 14:59:46 - pico-train - INFO - Step 58100 -- 🔄 Training Metrics
2025-08-30 14:59:46 - pico-train - INFO - ├── Loss: 4.8301
2025-08-30 14:59:46 - pico-train - INFO - ├── Learning Rate: 7.74e-05
2025-08-30 14:59:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:00:39 - pico-train - INFO - Step 58200 -- 🔄 Training Metrics
2025-08-30 15:00:39 - pico-train - INFO - ├── Loss: 4.8193
2025-08-30 15:00:39 - pico-train - INFO - ├── Learning Rate: 7.71e-05
2025-08-30 15:00:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:01:32 - pico-train - INFO - Step 58300 -- 🔄 Training Metrics
2025-08-30 15:01:32 - pico-train - INFO - ├── Loss: 4.8361
2025-08-30 15:01:32 - pico-train - INFO - ├── Learning Rate: 7.68e-05
2025-08-30 15:01:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:02:25 - pico-train - INFO - Step 58400 -- 🔄 Training Metrics
2025-08-30 15:02:25 - pico-train - INFO - ├── Loss: 4.8375
2025-08-30 15:02:25 - pico-train - INFO - ├── Learning Rate: 7.65e-05
2025-08-30 15:02:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:03:19 - pico-train - INFO - Step 58500 -- 🔄 Training Metrics
2025-08-30 15:03:19 - pico-train - INFO - ├── Loss: 4.8183
2025-08-30 15:03:19 - pico-train - INFO - ├── Learning Rate: 7.62e-05
2025-08-30 15:03:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:04:13 - pico-train - INFO - Step 58600 -- 🔄 Training Metrics
2025-08-30 15:04:13 - pico-train - INFO - ├── Loss: 4.8259
2025-08-30 15:04:13 - pico-train - INFO - ├── Learning Rate: 7.59e-05
2025-08-30 15:04:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:05:05 - pico-train - INFO - Step 58700 -- 🔄 Training Metrics
2025-08-30 15:05:05 - pico-train - INFO - ├── Loss: 4.8395
2025-08-30 15:05:05 - pico-train - INFO - ├── Learning Rate: 7.56e-05
2025-08-30 15:05:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:05:59 - pico-train - INFO - Step 58800 -- 🔄 Training Metrics
2025-08-30 15:05:59 - pico-train - INFO - ├── Loss: 4.8104
2025-08-30 15:05:59 - pico-train - INFO - ├── Learning Rate: 7.53e-05
2025-08-30 15:05:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:06:53 - pico-train - INFO - Step 58900 -- 🔄 Training Metrics
2025-08-30 15:06:53 - pico-train - INFO - ├── Loss: 4.8455
2025-08-30 15:06:53 - pico-train - INFO - ├── Learning Rate: 7.49e-05
2025-08-30 15:06:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:07:46 - pico-train - INFO - Step 59000 -- 🔄 Training Metrics
2025-08-30 15:07:46 - pico-train - INFO - ├── Loss: 4.8379
2025-08-30 15:07:46 - pico-train - INFO - ├── Learning Rate: 7.46e-05
2025-08-30 15:07:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:08:39 - pico-train - INFO - Step 59100 -- 🔄 Training Metrics
2025-08-30 15:08:39 - pico-train - INFO - ├── Loss: 4.8267
2025-08-30 15:08:39 - pico-train - INFO - ├── Learning Rate: 7.43e-05
2025-08-30 15:08:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:09:32 - pico-train - INFO - Step 59200 -- 🔄 Training Metrics
2025-08-30 15:09:32 - pico-train - INFO - ├── Loss: 4.8410
2025-08-30 15:09:32 - pico-train - INFO - ├── Learning Rate: 7.40e-05
2025-08-30 15:09:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:10:26 - pico-train - INFO - Step 59300 -- 🔄 Training Metrics
2025-08-30 15:10:26 - pico-train - INFO - ├── Loss: 4.8488
2025-08-30 15:10:26 - pico-train - INFO - ├── Learning Rate: 7.37e-05
2025-08-30 15:10:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:11:19 - pico-train - INFO - Step 59400 -- 🔄 Training Metrics
2025-08-30 15:11:19 - pico-train - INFO - ├── Loss: 4.8227
2025-08-30 15:11:19 - pico-train - INFO - ├── Learning Rate: 7.34e-05
2025-08-30 15:11:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:12:12 - pico-train - INFO - Step 59500 -- 🔄 Training Metrics
2025-08-30 15:12:12 - pico-train - INFO - ├── Loss: 4.8042
2025-08-30 15:12:12 - pico-train - INFO - ├── Learning Rate: 7.31e-05
2025-08-30 15:12:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:13:05 - pico-train - INFO - Step 59600 -- 🔄 Training Metrics
2025-08-30 15:13:05 - pico-train - INFO - ├── Loss: 4.8441
2025-08-30 15:13:05 - pico-train - INFO - ├── Learning Rate: 7.28e-05
2025-08-30 15:13:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:13:58 - pico-train - INFO - Step 59700 -- 🔄 Training Metrics
2025-08-30 15:13:58 - pico-train - INFO - ├── Loss: 4.8590
2025-08-30 15:13:58 - pico-train - INFO - ├── Learning Rate: 7.25e-05
2025-08-30 15:13:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:14:52 - pico-train - INFO - Step 59800 -- 🔄 Training Metrics
2025-08-30 15:14:52 - pico-train - INFO - ├── Loss: 4.8339
2025-08-30 15:14:52 - pico-train - INFO - ├── Learning Rate: 7.22e-05
2025-08-30 15:14:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:15:45 - pico-train - INFO - Step 59900 -- 🔄 Training Metrics
2025-08-30 15:15:45 - pico-train - INFO - ├── Loss: 4.8216
2025-08-30 15:15:45 - pico-train - INFO - ├── Learning Rate: 7.19e-05
2025-08-30 15:15:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:16:38 - pico-train - INFO - Step 60000 -- 💾 Saving Checkpoint
2025-08-30 15:18:54 - pico-train - INFO - Step 60000 -- 📊 Evaluation Results
2025-08-30 15:18:54 - pico-train - INFO - └── paloma: inf
2025-08-30 15:18:56 - pico-train - INFO - Step 60000 -- 🔄 Training Metrics
2025-08-30 15:18:56 - pico-train - INFO - ├── Loss: 4.8345
2025-08-30 15:18:56 - pico-train - INFO - ├── Learning Rate: 7.15e-05
2025-08-30 15:18:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:18:56 - pico-train - INFO - Step 60000 -- 📈 Saving Learning Dynamics
2025-08-30 15:19:55 - pico-train - INFO - Step 60100 -- 🔄 Training Metrics
2025-08-30 15:19:55 - pico-train - INFO - ├── Loss: 4.8207
2025-08-30 15:19:55 - pico-train - INFO - ├── Learning Rate: 7.12e-05
2025-08-30 15:19:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:20:49 - pico-train - INFO - Step 60200 -- 🔄 Training Metrics
2025-08-30 15:20:49 - pico-train - INFO - ├── Loss: 4.8181
2025-08-30 15:20:49 - pico-train - INFO - ├── Learning Rate: 7.09e-05
2025-08-30 15:20:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:21:41 - pico-train - INFO - Step 60300 -- 🔄 Training Metrics
2025-08-30 15:21:41 - pico-train - INFO - ├── Loss: 4.8059
2025-08-30 15:21:41 - pico-train - INFO - ├── Learning Rate: 7.06e-05
2025-08-30 15:21:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:22:33 - pico-train - INFO - Step 60400 -- 🔄 Training Metrics
2025-08-30 15:22:33 - pico-train - INFO - ├── Loss: 4.8367
2025-08-30 15:22:33 - pico-train - INFO - ├── Learning Rate: 7.03e-05
2025-08-30 15:22:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:23:25 - pico-train - INFO - Step 60500 -- 🔄 Training Metrics
2025-08-30 15:23:25 - pico-train - INFO - ├── Loss: 4.8237
2025-08-30 15:23:25 - pico-train - INFO - ├── Learning Rate: 7.00e-05
2025-08-30 15:23:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:24:18 - pico-train - INFO - Step 60600 -- 🔄 Training Metrics
2025-08-30 15:24:18 - pico-train - INFO - ├── Loss: 4.8291
2025-08-30 15:24:18 - pico-train - INFO - ├── Learning Rate: 6.97e-05
2025-08-30 15:24:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:25:10 - pico-train - INFO - Step 60700 -- 🔄 Training Metrics
2025-08-30 15:25:10 - pico-train - INFO - ├── Loss: 4.8317
2025-08-30 15:25:10 - pico-train - INFO - ├── Learning Rate: 6.94e-05
2025-08-30 15:25:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:26:03 - pico-train - INFO - Step 60800 -- 🔄 Training Metrics
2025-08-30 15:26:03 - pico-train - INFO - ├── Loss: 4.8204
2025-08-30 15:26:03 - pico-train - INFO - ├── Learning Rate: 6.91e-05
2025-08-30 15:26:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:26:54 - pico-train - INFO - Step 60900 -- 🔄 Training Metrics
2025-08-30 15:26:54 - pico-train - INFO - ├── Loss: 4.8455
2025-08-30 15:26:54 - pico-train - INFO - ├── Learning Rate: 6.88e-05
2025-08-30 15:26:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 15:27:47 - pico-train - INFO - Step 61000 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:27:47 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8133
2025-08-30 15:27:47 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.85e-05
2025-08-30 15:27:47 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:28:39 - pico-train - INFO - Step 61100 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:28:39 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8155
2025-08-30 15:28:39 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.82e-05
2025-08-30 15:28:39 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:29:32 - pico-train - INFO - Step 61200 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:29:32 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8151
2025-08-30 15:29:32 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.79e-05
2025-08-30 15:29:32 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:30:24 - pico-train - INFO - Step 61300 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:30:24 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8111
2025-08-30 15:30:24 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.76e-05
2025-08-30 15:30:24 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:31:16 - pico-train - INFO - Step 61400 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:31:16 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8221
2025-08-30 15:31:16 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.73e-05
2025-08-30 15:31:16 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:32:09 - pico-train - INFO - Step 61500 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:32:09 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8183
2025-08-30 15:32:09 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.70e-05
2025-08-30 15:32:09 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:33:02 - pico-train - INFO - Step 61600 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:33:02 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8133
2025-08-30 15:33:02 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.67e-05
2025-08-30 15:33:02 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:33:54 - pico-train - INFO - Step 61700 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:33:54 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8242
2025-08-30 15:33:54 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.64e-05
2025-08-30 15:33:54 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:34:46 - pico-train - INFO - Step 61800 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:34:46 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8117
2025-08-30 15:34:46 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.61e-05
2025-08-30 15:34:46 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:35:38 - pico-train - INFO - Step 61900 -- ๐Ÿ”„ Training Metrics
2025-08-30 15:35:38 - pico-train - INFO - โ”œโ”€โ”€ Loss: 4.8329
2025-08-30 15:35:38 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 6.58e-05
2025-08-30 15:35:38 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 15:36:30 - pico-train - INFO - Step 62000 -- ๐Ÿ’พ Saving Checkpoint