---
library_name: keras
license: apache-2.0
pipeline_tag: tabular-classification
tags:
- tensorflow
- keras
- tabular
- classification
- ensemble
- transportation
model-index:
- name: imt-ml-track-ensemble-20250906-124755
  results:
  - task:
      type: tabular-classification
      name: Track classification
    dataset:
      name: MBTA Track Assignment (custom)
      type: custom
      split: validation
    metrics:
    - type: accuracy
      name: Average individual accuracy
      value: 0.5957
    - type: accuracy
      name: Best individual accuracy
      value: 0.6049
    - type: loss
      name: Average individual loss
      value: 1.2251
---

# imt-ml Track Prediction — Ensemble (2025-09-06 12:47:55)

Predicts which MBTA commuter rail track/platform a train will use, using a small tabular neural-network ensemble trained on historical assignments. This card documents the artifacts in `output/ensemble_20250906_124755`.

## Model Summary

- Task: Tabular multi-class classification (13 track classes)
- Library: Keras (TensorFlow backend)
- Architecture: 6-model ensemble (diverse dense nets with embeddings + cyclical time features); softmax outputs averaged at inference
- Inputs (preprocessed):
  - Categorical: `station_id` (int index), `route_id` (int index), `direction_id` (0/1)
  - Time (cyclical): `hour_sin`, `hour_cos`, `minute_sin`, `minute_cos`, `day_sin`, `day_cos` (see the encoding sketch below)
  - Continuous: `scheduled_timestamp` (float seconds since epoch; normalized in-model)
- Outputs: Probability over 13 track labels (softmax)
- License: Apache-2.0
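
The cyclical features encode hour, minute, and day-of-week as points on the unit circle, so that wrap-around values (e.g. 23:00 and 00:00) stay close together. A minimal encoding sketch, assuming periods of 24, 60, and 7; the authoritative implementation is the training pipeline's feature engineering:

```python
import math
from datetime import datetime

def encode_time_features(dt: datetime) -> dict:
    """Map a datetime onto the sin/cos pairs the models expect.

    Periods assumed here: 24 (hour), 60 (minute), 7 (weekday). Verify
    against `imt_ml.dataset.create_feature_engineering_fn` before relying
    on this for real inference.
    """
    two_pi = 2 * math.pi
    return {
        "hour_sin": math.sin(two_pi * dt.hour / 24),
        "hour_cos": math.cos(two_pi * dt.hour / 24),
        "minute_sin": math.sin(two_pi * dt.minute / 60),
        "minute_cos": math.cos(two_pi * dt.minute / 60),
        "day_sin": math.sin(two_pi * dt.weekday() / 7),
        "day_cos": math.cos(two_pi * dt.weekday() / 7),
        "scheduled_timestamp": dt.timestamp(),
    }

print(encode_time_features(datetime(2025, 9, 6, 12, 47)))
```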

## Files in This Repo

- `track_prediction_ensemble_model_0_final.keras` through `track_prediction_ensemble_model_5_final.keras` — individual ensemble members
- `track_prediction_ensemble_model_*_best.keras` — best checkpoints during training (may match `final`)
- `training_report.md` — training configuration and metrics

Note: Ensemble training currently does not emit a `*_vocab.json`. See “Preprocessing & Vocab” below.

## Preprocessing & Vocab

The models expect integer indices for `station_id` and `route_id`, and a raw 0/1 `direction_id`. During training, these indices are produced by lookup tables built from the dataset vocabularies, so to reproduce inference exactly you must use the same vocabularies (station/route/track) that existed at training time, or otherwise guarantee an identical string-to-index mapping.

What to use:
- The training pipeline’s dataset loader (`imt_ml.dataset.create_feature_engineering_fn`) defines the exact feature mapping. If you need the vocab files, re-run a training or export step to generate them for your data snapshot, or save the vocab mapping alongside the model (see the sketch below).
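
As a stopgap, a vocab mapping can be persisted next to the model and rebuilt at inference. A minimal sketch, assuming each vocabulary is an ordered list of label strings whose position defines the integer index; the station/route values below are made up, and OOV handling in the real pipeline may differ:

```python
import json

# Hypothetical ordered vocab lists captured at training time;
# list position defines the integer index the models expect.
vocabs = {
    "station": ["place-north", "place-sstat", "place-bbsta"],  # example values
    "route": ["CR-Providence", "CR-Worcester", "CR-Fitchburg"],
    "track": ["1", "2", "3"],  # the real model has 13 entries
}
with open("ensemble_vocab.json", "w") as f:
    json.dump(vocabs, f)

# At inference time, rebuild the identical string -> index mapping.
with open("ensemble_vocab.json") as f:
    vocabs = json.load(f)
station_index = {s: i for i, s in enumerate(vocabs["station"])}
route_index = {r: i for i, r in enumerate(vocabs["route"])}

print(station_index["place-sstat"])  # -> 1
```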

## Metrics (validation)

From `training_report.md`:
- Average validation loss: 1.2251
- Average validation accuracy: 0.5957
- Best individual accuracy: 0.6049
- Worst individual accuracy: 0.5812
- Ensemble accuracy stdev: 0.0087
- Dataset size: 24,832 records (310 train steps/epoch, 77 val steps/epoch)

These metrics reflect individual model performance; at inference time, average the softmax probabilities across all 6 models to produce ensemble predictions.

## Example Usage (local Python)

This snippet loads all six Keras models and averages their softmax outputs. Replace the feature values with your preprocessed tensors/arrays, ensuring they match the training feature schema and index mappings.

```python
import numpy as np
import keras

# Load ensemble members
paths = [
    "track_prediction_ensemble_model_0_final.keras",
    "track_prediction_ensemble_model_1_final.keras",
    "track_prediction_ensemble_model_2_final.keras",
    "track_prediction_ensemble_model_3_final.keras",
    "track_prediction_ensemble_model_4_final.keras",
    "track_prediction_ensemble_model_5_final.keras",
]
models = [keras.models.load_model(p, compile=False) for p in paths]

# Prepare one example (batch size 1) — values shown are placeholders.
# You must convert raw strings to indices using the same vocab mapping used in training.
features = {
    "station_id": np.array([12], dtype=np.int64),     # int index
    "route_id": np.array([3], dtype=np.int64),        # int index
    "direction_id": np.array([1], dtype=np.int64),    # 0 or 1
    "hour_sin": np.array([0.707], dtype=np.float32),
    "hour_cos": np.array([0.707], dtype=np.float32),
    "minute_sin": np.array([0.0], dtype=np.float32),
    "minute_cos": np.array([1.0], dtype=np.float32),
    "day_sin": np.array([0.433], dtype=np.float32),
    "day_cos": np.array([0.901], dtype=np.float32),
    "scheduled_timestamp": np.array([1.7260e9], dtype=np.float32),
}

# Predict per model and average probabilities
probs = [m.predict(features, verbose=0) for m in models]
avg_prob = np.mean(probs, axis=0)   # shape: (batch, num_tracks)
pred_class = int(np.argmax(avg_prob, axis=-1)[0])
print({"predicted_track_index": pred_class, "probabilities": avg_prob[0].tolist()})
```

Tip: If you have the track vocabulary used at training time, you can map `pred_class` back to its track label string by indexing into that `track_vocab` list.
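
For example, with a saved vocab file like the hypothetical `ensemble_vocab.json` sketched above:

```python
import json

# Continues from the prediction snippet; "track" is assumed to be an
# ordered list of the 13 track label strings saved at training time.
with open("ensemble_vocab.json") as f:
    track_vocab = json.load(f)["track"]

predicted_track = track_vocab[pred_class]
print(predicted_track)
```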

## Training Data

- Source: Historical MBTA track assignments exported from Redis to TFRecord
- Features:
  - Categorical: `station_id`, `route_id`, `direction_id`
  - Temporal: hour, minute, day_of_week (encoded as sin/cos pairs)
  - Target: `track_number` (13 classes)
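
For reference, a hedged sketch of reading such records with `tf.io`; the field names and dtypes below are inferred from the feature list above, not taken from the actual export schema:

```python
import tensorflow as tf

# Assumed schema; the authoritative spec lives in the training pipeline's
# dataset loader, which may use different names or dtypes.
feature_spec = {
    "station_id": tf.io.FixedLenFeature([], tf.string),
    "route_id": tf.io.FixedLenFeature([], tf.string),
    "direction_id": tf.io.FixedLenFeature([], tf.int64),
    "scheduled_timestamp": tf.io.FixedLenFeature([], tf.float32),
    "track_number": tf.io.FixedLenFeature([], tf.string),
}

def parse_example(serialized):
    return tf.io.parse_single_example(serialized, feature_spec)

dataset = tf.data.TFRecordDataset("track_assignments.tfrecord").map(parse_example)
```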

## Training Procedure

- Command: `ensemble`
- Num models: 6 (architectural diversity: deep, wide, standard)
- Epochs: 150
- Batch size: 64
- Base learning rate: 0.001 (varied 0.8x–1.2x per model)
- Regularization: L1/L2, Dropout, BatchNorm; cosine LR scheduling and early stopping when enabled
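
The per-model learning-rate variation can be reproduced by jittering the base rate; a sketch, where the exact sampling scheme used in training is an assumption:

```python
import numpy as np

BASE_LR = 1e-3
NUM_MODELS = 6

# Jitter each ensemble member's LR within the 0.8x-1.2x band noted above.
rng = np.random.default_rng(seed=0)
learning_rates = BASE_LR * rng.uniform(0.8, 1.2, size=NUM_MODELS)
print(learning_rates.round(6))
```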

## Intended Use & Limitations

- Intended to assist with real-time track/platform assignment prediction for MBTA commuter rail.
- Not a safety system; always defer to official dispatch/operations.
- Sensitive to concept drift (schedule/operational changes) and to unseen stations/routes.
- Requires consistent categorical index mapping between training and inference.