	---
	library_name: keras
	license: apache-2.0
	pipeline_tag: tabular-classification
	tags:
	- tensorflow
	- keras
	- tabular
	- classification
	- ensemble
	- transportation
	model-index:
	- name: imt-ml-track-ensemble-20250906-124755
	  results:
	  - task:
	      type: tabular-classification
	      name: Track classification
	    dataset:
	      name: MBTA Track Assignment (custom)
	      type: custom
	      split: validation
	    metrics:
	    - type: accuracy
	      name: Average individual accuracy
	      value: 0.5957
	    - type: accuracy
	      name: Best individual accuracy
	      value: 0.6049
	    - type: loss
	      name: Average individual loss
	      value: 1.2251
	---

	# imt-ml Track Prediction — Ensemble (2025-09-06 12:47:55)

	Predicts which MBTA commuter rail track/platform a train will use, using a small tabular neural-network ensemble trained on historical assignments. This card documents the artifacts in `output/ensemble_20250906_124755`.

	## Model Summary

	- Task: Tabular multi-class classification (13 track classes)
	- Library: Keras (TensorFlow backend)
	- Architecture: 6-model ensemble (diverse dense nets with embeddings + cyclical time features); softmax outputs averaged at inference
	- Inputs (preprocessed):
	  - Categorical: `station_id` (int index), `route_id` (int index), `direction_id` (0/1)
	  - Time (cyclical): `hour_sin`, `hour_cos`, `minute_sin`, `minute_cos`, `day_sin`, `day_cos`
	  - Continuous: `scheduled_timestamp` (float seconds since epoch; normalized in-model)
	- Outputs: Probability over 13 track labels (softmax)
	- License: Apache-2.0 (as declared in the card metadata)

	## Files in This Repo

	- `track_prediction_ensemble_model_0_final.keras` … `track_prediction_ensemble_model_5_final.keras` — individual ensemble members
	- `track_prediction_ensemble_model_*_best.keras` — best checkpoints during training (may match `final`)
	- `training_report.md` — training configuration and metrics

	Note: Ensemble training currently does not emit a `*_vocab.json`. See “Preprocessing & Vocab” below.

	## Preprocessing & Vocab

	Models expect integer indices for `station_id` and `route_id`, and `direction_id` as a raw 0/1 value. During training, the indices come from lookup tables built from the dataset vocabularies. To reproduce training-time behavior at inference, use the same station, route, and track vocabularies that were present at training time, or otherwise guarantee an identical string-to-index mapping. A mismatched mapping raises no error; it feeds the embeddings the wrong indices.

	What to use:
	- The training pipeline's dataset loader (`imt_ml.dataset.create_feature_engineering_fn`) defines the exact feature mapping. If you need the vocab files, re-run training or an export step to generate them for your data snapshot, or save the vocab mapping alongside the model.
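One way to keep the mapping consistent is to write it to JSON next to the model files and reload it at inference time. The sketch below is illustrative only: `build_vocab`, `encode`, the reserved `OOV_INDEX`, the sorted ordering, and the sample station IDs are all assumptions, not the behavior of `imt_ml`. It will not reproduce the indices used by this checkpoint unless the training pipeline happens to build its vocabularies the same way.

```python
import json

OOV_INDEX = 0  # assumption: index 0 is reserved for values unseen at training time


def build_vocab(values):
    """Map each distinct string to a stable integer index, starting at 1."""
    return {v: i for i, v in enumerate(sorted(set(values)), start=1)}


def encode(vocab, value):
    """Return the index for value, or OOV_INDEX if it is not in the vocab."""
    return vocab.get(value, OOV_INDEX)


# Hypothetical station IDs; real values come from your own data snapshot.
station_vocab = build_vocab(["place-sstat", "place-north", "place-bbsta"])

# Round-trip through JSON, as you would when shipping the vocab with the model.
serialized = json.dumps({"station_id": station_vocab}, indent=2)
restored = json.loads(serialized)["station_id"]

print(encode(restored, "place-north"))    # known station
print(encode(restored, "place-unknown"))  # unseen station falls back to OOV_INDEX
```

Whatever scheme you choose, the point is that the file written at training time is the only source of indices at inference time.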

	## Metrics (validation)

	From `training_report.md`:
	- Average validation loss: 1.2251
	- Average validation accuracy: 0.5957
	- Best individual accuracy: 0.6049
	- Worst individual accuracy: 0.5812
	- Accuracy standard deviation across ensemble members: 0.0087
	- Dataset size: 24,832 records (310 train steps/epoch, 77 val steps/epoch)

	These metrics describe the individual models, not the averaged ensemble. At inference time, average the softmax probabilities across all 6 models to produce the ensemble prediction.
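To measure the ensemble itself, you can compare per-model accuracy with the accuracy of the averaged probabilities on a held-out set. The following is a minimal sketch using random stand-in data rather than real model outputs; the array shapes and variable names are assumptions, and the numbers it prints say nothing about this checkpoint.

```python
import numpy as np

rng = np.random.default_rng(0)
num_models, num_examples, num_tracks = 6, 200, 13

# Stand-in for np.stack([m.predict(val_features) for m in models]):
# random logits pushed through a softmax.
logits = rng.normal(size=(num_models, num_examples, num_tracks))
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)
labels = rng.integers(0, num_tracks, size=num_examples)  # synthetic labels

# Accuracy of each member on its own, shape (num_models,).
individual_acc = (probs.argmax(axis=-1) == labels).mean(axis=-1)

# Accuracy of the ensemble: average probabilities first, then take the argmax.
avg_probs = probs.mean(axis=0)
ensemble_acc = float((avg_probs.argmax(axis=-1) == labels).mean())

print("individual:", individual_acc.round(4))
print("mean:", round(float(individual_acc.mean()), 4))
print("std:", round(float(individual_acc.std()), 4))  # population std (ddof=0)
print("ensemble:", round(ensemble_acc, 4))
```

Note that `np.std` defaults to the population standard deviation; whether `training_report.md` uses that or the sample version is not stated here.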

	## Example Usage (local Python)

	This snippet loads all six Keras models and averages their softmax outputs. Replace the feature values with your preprocessed tensors/arrays, ensuring they match the training feature schema and index mappings.

	```python
	import numpy as np
	import keras

	# Load ensemble members
	paths = [
	    "track_prediction_ensemble_model_0_final.keras",
	    "track_prediction_ensemble_model_1_final.keras",
	    "track_prediction_ensemble_model_2_final.keras",
	    "track_prediction_ensemble_model_3_final.keras",
	    "track_prediction_ensemble_model_4_final.keras",
	    "track_prediction_ensemble_model_5_final.keras",
	]
	models = [keras.models.load_model(p, compile=False) for p in paths]

	# Prepare one example (batch size 1). The values shown are placeholders.
	# You must convert raw strings to indices using the same vocab mapping used in training.
	features = {
	    "station_id": np.array([12], dtype=np.int64),  # int index
	    "route_id": np.array([3], dtype=np.int64),  # int index
	    "direction_id": np.array([1], dtype=np.int64),  # 0 or 1
	    "hour_sin": np.array([0.707], dtype=np.float32),
	    "hour_cos": np.array([0.707], dtype=np.float32),
	    "minute_sin": np.array([0.0], dtype=np.float32),
	    "minute_cos": np.array([1.0], dtype=np.float32),
	    "day_sin": np.array([0.433], dtype=np.float32),
	    "day_cos": np.array([0.901], dtype=np.float32),
	    # Epoch seconds. float32 only resolves about 128 s at this magnitude,
	    # so keep the dtype identical to what the training pipeline used.
	    "scheduled_timestamp": np.array([1.7260e9], dtype=np.float32),
	}

	# Predict per model and average probabilities
	probs = [m.predict(features, verbose=0) for m in models]
	avg_prob = np.mean(probs, axis=0) # shape: (batch, num_tracks)
	pred_class = int(np.argmax(avg_prob, axis=-1)[0])
	print({"predicted_track_index": pred_class, "probabilities": avg_prob[0].tolist()})
	```

	Tip: If you have the track vocabulary used at training time, you can map `pred_class` back to its track label string by indexing into that `track_vocab` list.
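As a rough illustration of that lookup, the sketch below ranks the averaged probabilities and returns the top few track labels. The `track_vocab` list and the `top_k_tracks` helper are hypothetical; the real label order must come from the training run.

```python
# Hypothetical track vocabulary (13 labels); replace with the training-time list.
track_vocab = [str(n) for n in range(1, 14)]


def top_k_tracks(probabilities, vocab, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    if len(probabilities) != len(vocab):
        raise ValueError("probability vector and vocab differ in length")
    ranked = sorted(
        range(len(probabilities)), key=lambda i: probabilities[i], reverse=True
    )
    return [(vocab[i], probabilities[i]) for i in ranked[:k]]


# Stand-in for avg_prob[0].tolist() from the snippet above.
example = [0.02] * 13
example[4] = 0.5
example[1] = 0.28

print(top_k_tracks(example, track_vocab, k=2))
```

Returning several candidates rather than a single index can be useful when the top probabilities are close together.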

	## Training Data

	- Source: Historical MBTA track assignments exported from Redis to TFRecord
	- Features:
	  - Categorical: `station_id`, `route_id`, `direction_id`
	  - Temporal: hour, minute, day_of_week (encoded as sin/cos pairs)
	  - Continuous: `scheduled_timestamp` (seconds since epoch)
	- Target: `track_number` (13 classes)
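The sin/cos encoding of the temporal features can be sketched as follows. This is a minimal illustration, not the pipeline's code: the periods (24, 60, 7), the Monday=0 weekday convention, and the use of UTC are assumptions, and the `cyclical` and `time_features` helpers are hypothetical. Check `imt_ml.dataset.create_feature_engineering_fn` for the actual definitions before relying on it.

```python
import math
from datetime import datetime, timezone


def cyclical(value, period):
    """Encode a periodic value as a (sin, cos) pair on the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(epoch_seconds):
    """Build the six cyclical features from a scheduled timestamp."""
    # Assumption: UTC. The training pipeline may use local (Boston) time instead.
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    hour_sin, hour_cos = cyclical(dt.hour, 24)
    minute_sin, minute_cos = cyclical(dt.minute, 60)
    day_sin, day_cos = cyclical(dt.weekday(), 7)  # assumption: Monday=0
    return {
        "hour_sin": hour_sin, "hour_cos": hour_cos,
        "minute_sin": minute_sin, "minute_cos": minute_cos,
        "day_sin": day_sin, "day_cos": day_cos,
    }


features = time_features(1_726_000_000.0)
print(features)
print(cyclical(3, 24))  # hour 3 gives roughly (0.707, 0.707)
```

The pair encoding keeps adjacent times close together across the wrap-around, so 23:00 and 00:00 end up near each other instead of 23 units apart.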

	## Training Procedure

	- Command: `ensemble`
	- Number of models: 6 (architectural diversity: deep, wide, standard)
	- Epochs: 150
	- Batch size: 64
	- Base learning rate: 0.001 (varied 0.8x–1.2x per model)
	- Regularization: L1/L2, Dropout, BatchNorm; cosine LR scheduling and early stopping when enabled
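To make the learning-rate settings concrete, here is a small sketch of per-model initial rates and a cosine decay curve. It is an assumption-laden illustration: the card only states the 0.8x to 1.2x range, so the even spacing of multipliers, the decay to zero over the full 150 epochs, and the `cosine_lr` helper are all hypothetical choices, not the trainer's actual schedule.

```python
import math

BASE_LR = 0.001
NUM_MODELS = 6
EPOCHS = 150

# Assumption: multipliers are spread evenly over [0.8, 1.2].
multipliers = [0.8 + 0.4 * i / (NUM_MODELS - 1) for i in range(NUM_MODELS)]
initial_lrs = [BASE_LR * m for m in multipliers]


def cosine_lr(initial_lr, epoch, total_epochs=EPOCHS):
    """Cosine decay from initial_lr at epoch 0 down to 0 at total_epochs."""
    progress = min(epoch, total_epochs) / total_epochs
    return 0.5 * initial_lr * (1.0 + math.cos(math.pi * progress))


for i, lr in enumerate(initial_lrs):
    midpoint = cosine_lr(lr, EPOCHS // 2)
    print(f"model {i}: initial={lr:.6f} mid-training={midpoint:.6f}")
```

With early stopping enabled, a run may end before the schedule reaches its final value.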

	## Intended Use & Limitations

	- Intended to assist with real-time track/platform predictions for MBTA commuter rail.
	- Not a safety system; always defer to official dispatch/operations.
	- Sensitive to concept drift (schedule/operational changes) and to unseen stations/routes.
	- Requires consistent categorical index mapping between training and inference.
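Because an out-of-range or unseen index produces a prediction without any warning, it can help to validate inputs before calling the models. The sketch below assumes you know the vocabulary sizes from training; `validate_row` and the sizes shown are hypothetical, and the check cannot detect a mapping that is in range but wrong.

```python
def validate_row(row, num_stations, num_routes):
    """Return a list of problems found in one preprocessed feature row."""
    problems = []
    if not 0 <= row["station_id"] < num_stations:
        problems.append(f"station_id {row['station_id']} outside [0, {num_stations})")
    if not 0 <= row["route_id"] < num_routes:
        problems.append(f"route_id {row['route_id']} outside [0, {num_routes})")
    if row["direction_id"] not in (0, 1):
        problems.append(f"direction_id {row['direction_id']} is not 0 or 1")
    return problems


# Hypothetical vocabulary sizes; use the sizes recorded at training time.
good = {"station_id": 12, "route_id": 3, "direction_id": 1}
bad = {"station_id": 500, "route_id": 3, "direction_id": 2}

print(validate_row(good, num_stations=140, num_routes=14))
print(validate_row(bad, num_stations=140, num_routes=14))
```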