
# Training Report - Ensemble

Generated: 2025-09-08 18:12:05

## Overview
- **Command**: `ensemble`
- **Training Duration**: 6144.87 seconds (102.4 minutes)
- **Output Directory**: `output/ensemble_20250908_162940`

## Dataset Information
- **Total Records**: 25,512
- **Training Steps per Epoch**: 637
- **Validation Steps per Epoch**: 159
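
The step counts above are consistent with an 80/20 train/validation split at batch size 32, with any trailing partial batch dropped. The report does not state the split or the batching policy, so both are assumptions in this minimal sketch; only the record count and batch size come from the report.

```python
total_records = 25_512   # from the report
batch_size = 32          # from the report

# Assumption: 80% of records go to training, the remainder to validation.
train_records = total_records * 4 // 5
val_records = total_records - train_records

# Assumption: a trailing partial batch is dropped (floor division).
train_steps = train_records // batch_size
val_steps = val_records // batch_size

print(train_steps, val_steps)  # 637 159
```

Rounding the step counts up instead of down would give 638 and 160, which is why the sketch assumes floor division.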

### Vocabulary Sizes
- **Stations**: 6 unique stations
- **Routes**: 13 unique routes  
- **Tracks**: 13 unique tracks (prediction targets)

## Training Configuration
- **Num Models**: 6
- **Epochs**: 1000
- **Batch Size**: 32
- **Base Learning Rate**: 0.001
- **Dataset Size**: 25,512
- **Bagging Fraction**: 1.0
- **Seed Base**: 42

## Final Performance Metrics
- **Average Validation Loss**: 0.9233
- **Average Validation Accuracy**: 0.7460
- **Best Individual Accuracy**: 0.7720
- **Worst Individual Accuracy**: 0.7256
- **Ensemble Std Accuracy**: 0.0181

## Additional Information
- **Individual Model Metrics**:

| model_index | validation_loss | validation_accuracy | learning_rate | parameters |
|---|---|---|---|---|
| 0 | 0.8818949460983276 | 0.7720125913619995 | 0.0011677725896206085 | 53384 |
| 1 | 0.9238271713256836 | 0.7682783007621765 | 0.0010442193817105285 | 156552 |
| 2 | 0.9241461753845215 | 0.7399764060974121 | 0.00096484374873539 | 14856 |
| 3 | 0.9481979608535767 | 0.7256289124488831 | 0.0009162122256532111 | 14856 |
| 4 | 0.9160543084144592 | 0.7421383857727051 | 0.0008199605784692232 | 14856 |
| 5 | 0.9454122185707092 | 0.7279874086380005 | 0.0009672535936195217 | 14856 |

- **Ensemble Strategy**: Diverse architectures (deep, wide, standard)
- **Learning Rate Variation**: 0.8x to 1.2x base rate with random variation
- **Total Parameters**: 269,360
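
The summary figures in this report can be recomputed from the per-model metrics. The sketch below copies the reported values and uses the population standard deviation; that choice is an assumption, made because it matches the reported 0.0181 (the sample standard deviation would round to 0.0198).

```python
import statistics

# Per-model values copied from the Individual Model Metrics entry.
val_acc = [
    0.7720125913619995, 0.7682783007621765, 0.7399764060974121,
    0.7256289124488831, 0.7421383857727051, 0.7279874086380005,
]
val_loss = [
    0.8818949460983276, 0.9238271713256836, 0.9241461753845215,
    0.9481979608535767, 0.9160543084144592, 0.9454122185707092,
]
params = [53384, 156552, 14856, 14856, 14856, 14856]

mean_loss = statistics.fmean(val_loss)
mean_acc = statistics.fmean(val_acc)
std_acc = statistics.pstdev(val_acc)   # population std (assumption, see above)
total_params = sum(params)

print(f"{mean_loss:.4f} {mean_acc:.4f} {std_acc:.4f} {total_params}")
# 0.9233 0.7460 0.0181 269360
```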

## Temperature Scaling
- **Temperature**: 1.5000
- **Uncalibrated NLL**: 1.9108
- **Calibrated NLL**: 1.8276
- **Uncalibrated ECE**: 0.0939
- **Calibrated ECE**: 0.0302
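
For readers unfamiliar with these metrics, here is a minimal NumPy sketch of temperature scaling: logits are divided by a scalar temperature before the softmax, and calibration is measured with negative log-likelihood (NLL) and expected calibration error (ECE). The logits and labels are random placeholders, and the helper names, the 10-bin ECE, and applying the temperature directly to logits are all assumptions; the report does not describe how the pipeline computes these values.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(probs, labels):
    # Mean negative log-probability assigned to the true class.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def ece(probs, labels, n_bins=10):
    # Weighted mean gap between accuracy and confidence over confidence bins.
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            total += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(total)

# Placeholder data (not from the report): 256 samples, 13 track classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(256, 13)) * 3.0
labels = rng.integers(0, 13, size=256)

temperature = 1.5   # value from the report
uncalibrated = softmax(logits)
calibrated = softmax(logits / temperature)

print("NLL:", nll(uncalibrated, labels), "->", nll(calibrated, labels))
print("ECE:", ece(uncalibrated, labels), "->", ece(calibrated, labels))
```

A temperature above 1 flattens the predicted distribution, so the top-class confidence drops while the predicted class stays the same; accuracy is therefore unaffected by calibration.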

## Dataset Schema

The model was trained on MBTA track assignment data with the following features:

- **Categorical Features**: `station_id`, `route_id`, `direction_id`
- **Temporal Features**: `hour`, `minute`, `day_of_week` (cyclically encoded)
- **Target**: `track_number` (classification with 13 classes)

## Model Architecture

- Embedding layers for categorical features
- Cyclical time encoding (sin/cos) for temporal patterns
- Dense layers with dropout regularization
- Softmax output for multi-class track prediction
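
The cyclical encoding listed above maps each temporal value to a point on the unit circle, so that values at the wrap-around (for example hour 23 and hour 0) end up close together. The sketch below illustrates the idea only; the function names and the periods (24 for `hour`, 60 for `minute`, 7 for `day_of_week`) are assumptions, not taken from the pipeline's code.

```python
import numpy as np

def cyclical_encode(value, period):
    """Return (sin, cos) of the angle that `value` occupies within `period`."""
    angle = 2.0 * np.pi * np.asarray(value, dtype=float) / period
    return np.sin(angle), np.cos(angle)

def encoded_distance(a, b, period):
    """Euclidean distance between two values after cyclical encoding."""
    sa, ca = cyclical_encode(a, period)
    sb, cb = cyclical_encode(b, period)
    return float(np.hypot(sa - sb, ca - cb))

# Assumed periods: 24 hours, 60 minutes, 7 days.
hour_sin, hour_cos = cyclical_encode([0, 6, 12, 18, 23], 24)
minute_sin, minute_cos = cyclical_encode(30, 60)
dow_sin, dow_cos = cyclical_encode(6, 7)

# Hour 23 is as close to hour 0 as hour 1 is, unlike with raw integers.
wrap_gap = encoded_distance(23, 0, 24)
step_gap = encoded_distance(0, 1, 24)
print(wrap_gap, step_gap)
```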

## Usage

To load and use this model:

```python
import keras
# Load for inference (optimizer not saved):
model = keras.models.load_model('track_prediction_ensemble_final.keras', compile=False)
```

Report generated by imt-ml training pipeline