library_name: keras
license: apache-2.0
pipeline_tag: tabular-classification
tags:
- tensorflow
- keras
- tabular
- classification
- ensemble
- transportation
model-index:
- name: imt-ml-track-ensemble-20250906-124755
  results:
  - task:
      type: tabular-classification
      name: Track classification
    dataset:
      name: MBTA Track Assignment (custom)
      type: custom
      split: validation
    metrics:
    - type: accuracy
      name: Average individual accuracy
      value: 0.5957
    - type: accuracy
      name: Best individual accuracy
      value: 0.6049
    - type: loss
      name: Average individual loss
      value: 1.2251
imt-ml Track Prediction — Ensemble (2025-09-06 12:47:55)
Predicts which MBTA commuter rail track/platform a train will use. The model is a small ensemble of tabular neural networks trained on historical assignments. This card documents the artifacts in output/ensemble_20250906_124755.
Model Summary
- Task: Tabular multi-class classification (13 track classes)
- Library: Keras (TensorFlow backend)
- Architecture: 6-model ensemble (diverse dense nets with embeddings + cyclical time features); softmax outputs averaged at inference
- Inputs (preprocessed):
  - Categorical: station_id (int index), route_id (int index), direction_id (0/1)
  - Time (cyclical): hour_sin, hour_cos, minute_sin, minute_cos, day_sin, day_cos
  - Continuous: scheduled_timestamp (float seconds since epoch; normalized in-model)
- Outputs: Probability over 13 track labels (softmax)
- License: MIT
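The cyclical time inputs listed above can be derived from a scheduled departure time roughly as follows. This is a minimal sketch, not the pipeline's code: the helper names cyclical_pair and time_features are hypothetical, and the periods (24 for hour, 60 for minute, 7 for day of week with Monday as 0) and the use of UTC are assumptions. The training pipeline's own feature mapping remains the authoritative definition.

```python
import math
from datetime import datetime, timezone


def cyclical_pair(value, period):
    """Return (sin, cos) of `value` placed on a circle of length `period`."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(ts):
    """Build the six cyclical features plus the raw timestamp for one datetime.

    Assumed periods: 24 (hour), 60 (minute), 7 (day of week, Monday = 0).
    """
    hour_sin, hour_cos = cyclical_pair(ts.hour, 24)
    minute_sin, minute_cos = cyclical_pair(ts.minute, 60)
    day_sin, day_cos = cyclical_pair(ts.weekday(), 7)
    return {
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
        "minute_sin": minute_sin,
        "minute_cos": minute_cos,
        "day_sin": day_sin,
        "day_cos": day_cos,
        "scheduled_timestamp": ts.timestamp(),  # float seconds since epoch
    }


# 03:00 UTC on the hour: the hour angle is pi/4 and the minute angle is 0.
example = time_features(datetime(2025, 9, 6, 3, 0, tzinfo=timezone.utc))
```

Encoding each time component as a sin/cos pair keeps values that are adjacent on the clock (for example 23:00 and 00:00) close together in feature space.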
Files in This Repo
- track_prediction_ensemble_model_0_final.keras … track_prediction_ensemble_model_5_final.keras: individual ensemble members
- track_prediction_ensemble_model_*_best.keras: best checkpoints during training (may match final)
- training_report.md: training configuration and metrics
Note: Ensemble training currently does not emit a *_vocab.json. See “Preprocessing & Vocab” below.
Preprocessing & Vocab
Models expect integer indices for station_id and route_id, and the raw direction_id value (0 or 1). In training, indices are produced by lookup tables built from the dataset vocabularies. To reproduce training-time behavior at inference, use the same station, route, and track vocabularies that were present at training time, or otherwise guarantee an identical mapping.
What to use:
- The training pipeline’s dataset loader (imt_ml.dataset.create_feature_engineering_fn) defines the exact feature mapping. If you need the vocab files, re-run a training or export step to generate them for your data snapshot, or save the vocab mapping alongside the model.
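One way to save the vocab mapping alongside the model is a single JSON file, sketched below. Everything here is illustrative: build_vocab, save_vocabs, load_vocabs, and the file name ensemble_vocab.json are hypothetical, and the sorted, zero-based index assignment is an assumption that may not match the lookup tables used in training. The station and route strings are sample values only.

```python
import json
import os
import tempfile


def build_vocab(values):
    """Map each distinct string to an integer index, in sorted order."""
    return {value: index for index, value in enumerate(sorted(set(values)))}


def save_vocabs(path, vocabs):
    """Write all vocab mappings to one JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(vocabs, handle, indent=2, sort_keys=True)


def load_vocabs(path):
    """Read the vocab mappings back as plain dicts of string -> int."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Sample raw values; a real snapshot would come from the training data.
vocabs = {
    "station_id": build_vocab(["place-sstat", "place-north", "place-bbsta"]),
    "route_id": build_vocab(["CR-Worcester", "CR-Providence"]),
    "track": build_vocab(["1", "2", "10"]),
}

# A temporary directory stands in for the model output directory.
vocab_path = os.path.join(tempfile.mkdtemp(), "ensemble_vocab.json")
save_vocabs(vocab_path, vocabs)

restored = load_vocabs(vocab_path)
station_index = restored["station_id"]["place-sstat"]
```

Storing the mapping as explicit key/index pairs, rather than relying on list order, makes it harder to shift indices by accident when the data snapshot changes.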
Metrics (validation)
From training_report.md:
- Average validation loss: 1.2251
- Average validation accuracy: 0.5957
- Best individual accuracy: 0.6049
- Worst individual accuracy: 0.5812
- Standard deviation of accuracy across ensemble members: 0.0087
- Dataset size: 24,832 records (310 train steps/epoch, 77 val steps/epoch)
These metrics reflect individual model performance; at inference time, average the softmax probabilities across all 6 models to produce ensemble predictions.
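To compare individual and ensemble accuracy on a labeled validation set, something like the following sketch could be used. The function name ensemble_accuracy is hypothetical, and the probabilities and labels are toy values chosen for illustration, not outputs of this model.

```python
import numpy as np


def ensemble_accuracy(per_model_probs, labels):
    """Return (per-model accuracies, accuracy of the averaged probabilities).

    per_model_probs has shape (n_models, n_examples, n_classes).
    """
    probs = np.asarray(per_model_probs, dtype=np.float64)
    labels = np.asarray(labels)
    # Accuracy of each member on its own argmax predictions.
    individual = (probs.argmax(axis=-1) == labels).mean(axis=1)
    # Average the softmax outputs across members, then take the argmax.
    averaged = probs.mean(axis=0)
    ensemble = float((averaged.argmax(axis=-1) == labels).mean())
    return individual, ensemble


# Toy data: 2 models, 3 examples, 3 classes.
toy_labels = [0, 1, 2]
toy_probs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.5, 0.1, 0.4]],  # wrong on example 2
    [[0.4, 0.5, 0.1], [0.1, 0.7, 0.2], [0.1, 0.2, 0.7]],  # wrong on example 0
]
individual_acc, ensemble_acc = ensemble_accuracy(toy_probs, toy_labels)
```

In this toy case each member makes a different mistake, so averaging corrects both. Whether the real ensemble improves on the individual figures above would have to be measured on the validation split.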
Example Usage (local Python)
This snippet loads all six Keras models and averages their softmax outputs. Replace the feature values with your preprocessed tensors/arrays, ensuring they match the training feature schema and index mappings.
import numpy as np
import keras

# Load ensemble members
paths = [
    "track_prediction_ensemble_model_0_final.keras",
    "track_prediction_ensemble_model_1_final.keras",
    "track_prediction_ensemble_model_2_final.keras",
    "track_prediction_ensemble_model_3_final.keras",
    "track_prediction_ensemble_model_4_final.keras",
    "track_prediction_ensemble_model_5_final.keras",
]
models = [keras.models.load_model(p, compile=False) for p in paths]

# Prepare one example (batch size 1); values shown are placeholders.
# You must convert raw strings to indices using the same vocab mapping used in training.
features = {
    "station_id": np.array([12], dtype=np.int64),  # int index
    "route_id": np.array([3], dtype=np.int64),  # int index
    "direction_id": np.array([1], dtype=np.int64),  # 0 or 1
    "hour_sin": np.array([0.707], dtype=np.float32),
    "hour_cos": np.array([0.707], dtype=np.float32),
    "minute_sin": np.array([0.0], dtype=np.float32),
    "minute_cos": np.array([1.0], dtype=np.float32),
    "day_sin": np.array([0.433], dtype=np.float32),
    "day_cos": np.array([0.901], dtype=np.float32),
    "scheduled_timestamp": np.array([1.7260e9], dtype=np.float32),
}

# Predict per model and average probabilities
probs = [m.predict(features, verbose=0) for m in models]
avg_prob = np.mean(probs, axis=0)  # shape: (batch, num_tracks)
pred_class = int(np.argmax(avg_prob, axis=-1)[0])
print({"predicted_track_index": pred_class, "probabilities": avg_prob[0].tolist()})
Tip: If you have the track vocabulary used at training time, you can map pred_class back to its track label string by indexing into that track_vocab list.
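The mapping back to track labels might look like the sketch below, which also returns the top few candidates rather than a single class. The helper top_k_tracks is hypothetical, and the track_vocab contents and probabilities are made-up sample values; the real list must come from the training-time vocabulary, in the same order.

```python
def top_k_tracks(avg_prob_row, track_vocab, k=3):
    """Return the k most likely (track_label, probability) pairs."""
    if len(avg_prob_row) != len(track_vocab):
        raise ValueError("track_vocab length must match the number of classes")
    # Class indices ordered from most to least probable.
    ranked = sorted(
        range(len(avg_prob_row)),
        key=lambda index: avg_prob_row[index],
        reverse=True,
    )
    return [(track_vocab[index], avg_prob_row[index]) for index in ranked[:k]]


# Sample values only: four classes instead of the model's 13.
track_vocab = ["1", "2", "3", "4"]
sample_probs = [0.10, 0.60, 0.05, 0.25]
best_two = top_k_tracks(sample_probs, track_vocab, k=2)
```

With the model's output, avg_prob[0].tolist() would take the place of sample_probs.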
Training Data
- Source: Historical MBTA track assignments exported from Redis to TFRecord
- Features:
  - Categorical: station_id, route_id, direction_id
  - Temporal: hour, minute, day_of_week (encoded as sin/cos pairs)
- Target: track_number (13 classes)
Training Procedure
- Command: ensemble
- Num models: 6 (architectural diversity: deep, wide, standard)
- Epochs: 150
- Batch size: 64
- Base learning rate: 0.001 (varied 0.8x–1.2x per model)
- Regularization: L1/L2, Dropout, BatchNorm; cosine LR scheduling and early stopping when enabled
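The learning-rate settings above can be illustrated with a short sketch. It makes two assumptions that the training report does not confirm: the 0.8x to 1.2x multipliers are spread evenly across the six models, and the cosine schedule is the standard half-cosine decay from the initial rate to a minimum of zero over the full epoch budget. Both function names are hypothetical.

```python
import math


def per_model_learning_rates(base_lr, num_models, low=0.8, high=1.2):
    """Spread multipliers evenly over [low, high] (assumed spacing)."""
    if num_models == 1:
        return [base_lr]
    step = (high - low) / (num_models - 1)
    return [base_lr * (low + i * step) for i in range(num_models)]


def cosine_decay(initial_lr, epoch, total_epochs, min_lr=0.0):
    """Standard half-cosine decay from initial_lr down to min_lr."""
    progress = min(epoch, total_epochs) / total_epochs
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (initial_lr - min_lr) * cosine


# Values from the list above: base rate 0.001, 6 models, 150 epochs.
rates = per_model_learning_rates(0.001, 6)
midpoint_lr = cosine_decay(rates[0], epoch=75, total_epochs=150)
```

Under these assumptions the slowest model starts at 0.0008 and the fastest at 0.0012, and each rate has halved by epoch 75.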
Intended Use & Limitations
- Intended to assist with real-time track/platform predictions for MBTA commuter rail.
- Not a safety system; always defer to official dispatch/operations.
- Sensitive to concept drift (schedule/operational changes) and to unseen stations/routes.
- Requires consistent categorical index mapping between training and inference.
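Because unseen stations or routes and inconsistent index mappings fail silently, a lightweight check before inference can help. The sketch below is one possible guard, not part of the imt-ml package: validate_features is a hypothetical helper, and the vocabulary sizes passed to it (40 stations, 12 routes) are placeholder values that would have to come from the training-time vocabularies.

```python
EXPECTED_KEYS = {
    "station_id", "route_id", "direction_id",
    "hour_sin", "hour_cos", "minute_sin", "minute_cos",
    "day_sin", "day_cos", "scheduled_timestamp",
}


def validate_features(features, num_stations, num_routes):
    """Return a list of human-readable problems; empty means the batch looks OK."""
    problems = []
    missing = EXPECTED_KEYS - set(features)
    if missing:
        problems.append(f"missing features: {sorted(missing)}")
    # Each categorical index must fall inside its training-time vocabulary.
    limits = (
        ("station_id", num_stations),
        ("route_id", num_routes),
        ("direction_id", 2),
    )
    for name, size in limits:
        for value in features.get(name, []):
            if not 0 <= int(value) < size:
                problems.append(f"{name}={int(value)} is outside [0, {size})")
    return problems


# Placeholder batch of one example; lists or NumPy arrays both work here.
batch = {key: [0.0] for key in EXPECTED_KEYS}
batch.update({"station_id": [12], "route_id": [3], "direction_id": [1]})
ok_report = validate_features(batch, num_stations=40, num_routes=12)

bad_batch = dict(batch, station_id=[99])
bad_report = validate_features(bad_batch, num_stations=40, num_routes=12)
```

An empty report only confirms that indices are in range; it cannot detect a vocabulary that has the right size but a different ordering.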