---
license: apache-2.0
---

# MobiusNet

A vision architecture built on continuous topological principles, replacing traditional activations with wave-based interference gating.

## Overview

MobiusNet introduces a fundamentally different approach to neural network design:

- **MobiusLens**: Wave superposition as a gating mechanism, replacing standard activations (ReLU, GELU)
- **Thirds Mask**: Cantor-inspired fractal channel suppression for regularization
- **Continuous Topology**: Layers sample a continuous manifold via the `t` parameter, not discrete units
- **Twist Rotations**: Smooth rotation through representation space across network depth
- **Integrator**: An optional Conv → BN → GELU stage that collapses features for the task head, retaining one conventional GELU nonlinearity during experimentation

## Performance

| Model | Params | GFLOPs | Tiny ImageNet |
|-------|--------|--------|---------------|
| MobiusNet-Base | 33.7M | 2.69 | TBD |

## Installation

```bash
pip install torch torchvision safetensors huggingface_hub tensorboard tqdm
```

## Quick Start

### Training

```python
from mobius_trainer_full import train_tiny_imagenet

model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    lr=3e-4,
    batch_size=128,
    use_integrator=True,
    data_dir='./data/tiny-imagenet-200',
    output_dir='./outputs',
    hf_repo='AbstractPhil/mobiusnet',
    save_every_n_epochs=10,
    upload_every_n_epochs=10,
)
```

### Continue from Checkpoint

```python
# From a local directory
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="./outputs/checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)

# From HuggingFace (auto-downloads)
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)
```

### Inference

```python
import torch
from safetensors.torch import load_file
from mobius_trainer_full import MobiusNet, PRESETS

# Load model
config = PRESETS['mobius_base']
model = MobiusNet(num_classes=200, use_integrator=True, **config)
state_dict = load_file("best_model.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference (image_tensor: a preprocessed batch, e.g. (N, 3, 64, 64) for Tiny ImageNet)
with torch.no_grad():
    logits = model(image_tensor)
    pred = logits.argmax(1)
```

## Model Presets

| Preset | Channels | Depths | ~Params |
|--------|----------|--------|---------|
| `mobius_tiny_s` | (64, 128, 256) | (2, 2, 2) | 500K |
| `mobius_tiny_m` | (64, 128, 256, 512, 768) | (2, 2, 4, 2, 2) | 11M |
| `mobius_tiny_l` | (96, 192, 384, 768) | (3, 3, 3, 3) | 8M |
| `mobius_base` | (128, 256, 512, 768, 1024) | (2, 2, 2, 2, 2) | 33.7M |

## Architecture

```
Input
  │
  ▼
┌─────────────────────────────────┐
│ Stem (Conv → BN)                │
└─────────────────────────────────┘
  │
  ▼
┌─────────────────────────────────┐
│ Stage 1-N                       │
│ ┌─────────────────────────────┐ │
│ │ MobiusConvBlock (×depth)    │ │
│ │ ├─ Depthwise-Sep Conv       │ │
│ │ ├─ BatchNorm                │ │
│ │ ├─ MobiusLens (wave gate)   │ │
│ │ ├─ Thirds Mask              │ │
│ │ └─ Learned Residual         │ │
│ └─────────────────────────────┘ │
│ Downsample (stride-2 conv)      │
└─────────────────────────────────┘
  │
  ▼
┌─────────────────────────────────┐
│ Integrator (Conv → BN → GELU)   │ ← Task collapse
└─────────────────────────────────┘
  │
  ▼
┌─────────────────────────────────┐
│ Pool → Linear → Classes         │
└─────────────────────────────────┘
```

## Core Components

### MobiusLens

Wave-based gating mechanism with three interference paths:

```python
L = wave(phase_l, drift_l)  # Left path   (+1 drift)
M = wave(phase_m, drift_m)  # Middle path (0 drift, ghost)
R = wave(phase_r, drift_r)  # Right path  (-1 drift)

# Interference
xor_comp = abs(L + R - 2*L*R)  # Differentiable XOR
and_comp = L * R               # Differentiable AND

# Gating
gate = weighted_sum(L, M, R) * interference_blend
output = input * sigmoid(layernorm(gate))
```

The middle path (M) acts as a "ghost" — present but diminished — maintaining gradient continuity while biasing information flow toward the L/R edges (Cantor-like structure).

### Thirds Mask

Rotating channel suppression inspired by Cantor set construction:

```
Layer 0: suppress channels [0:C/3]
Layer 1: suppress channels [C/3:2C/3]
Layer 2: suppress channels [2C/3:C]
Layer 3: back to [0:C/3]
```

Forces redundancy and prevents co-adaptation across channel groups. Runnable sketches of the lens gate and this mask follow below.
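To make the two mechanisms above concrete, here is a minimal, self-contained PyTorch sketch of a MobiusLens-style gate. Everything in it is an illustrative assumption, not the repository's code: the name `ToyMobiusLens`, the sinusoidal wave form, the per-channel pooling, and the initial values (the actual implementation ships in `mobius_trainer_full`).

```python
import torch
import torch.nn as nn

class ToyMobiusLens(nn.Module):
    """Illustrative wave-interference gate (NOT the reference implementation).

    Three sinusoidal paths (L, M, R) interfere via differentiable XOR/AND
    terms; the blended gate modulates the input multiplicatively.
    """

    def __init__(self, channels: int, t: float = 0.0):
        super().__init__()
        # Per-channel phase and frequency for each path (assumed parameterization).
        self.phase = nn.Parameter(torch.zeros(3, channels))
        self.omega = nn.Parameter(torch.ones(3, channels))
        # Drift biases the three paths: +1 (left), 0 (middle/ghost), -1 (right).
        self.register_buffer("drift", torch.tensor([1.0, 0.0, -1.0]).view(3, 1))
        # Path mixing weights; the middle path starts diminished ("ghost").
        self.path_w = nn.Parameter(torch.tensor([1.0, 0.25, 1.0]))
        self.blend = nn.Parameter(torch.tensor(0.5))  # XOR-vs-AND interference mix
        self.norm = nn.LayerNorm(channels)
        self.t = t  # position on the continuous topology, 0 → 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W). Pooled per-channel statistics drive the wave argument.
        u = x.mean(dim=(2, 3))                                    # (B, C)
        arg = self.omega * u.unsqueeze(1) + self.phase + self.drift * self.t
        waves = 0.5 * (1.0 + torch.sin(arg))                      # (B, 3, C) in [0, 1]
        L, M, R = waves.unbind(dim=1)

        xor_comp = (L + R - 2 * L * R).abs()                      # differentiable XOR
        and_comp = L * R                                          # differentiable AND
        interference = self.blend * xor_comp + (1 - self.blend) * and_comp

        w = torch.softmax(self.path_w, dim=0)                     # weighted_sum(L, M, R)
        gate = (w[0] * L + w[1] * M + w[2] * R) * interference
        gate = torch.sigmoid(self.norm(gate))                     # (B, C)
        return x * gate.unsqueeze(-1).unsqueeze(-1)

# Usage: same-shape output
# y = ToyMobiusLens(channels=64, t=0.5)(torch.randn(2, 64, 8, 8))
```

Gating on pooled per-channel statistics keeps the sketch cheap; a per-position gate would drop the pooling and normalize over channels at each spatial location.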
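And a sketch of the rotating thirds mask, under the same caveat: the function name, the soft suppression factor (rather than a hard zero), and the remainder handling are assumptions, not taken from the repository.

```python
import torch

def thirds_mask(channels: int, layer_idx: int, suppress: float = 0.1) -> torch.Tensor:
    """Illustrative rotating channel mask (assumed form, not the reference code).

    Scales one third of the channels by `suppress`, rotating the suppressed
    band with depth: layer 0 -> [0, C/3), layer 1 -> [C/3, 2C/3),
    layer 2 -> [2C/3, C), then the cycle repeats.
    """
    mask = torch.ones(channels)
    third = channels // 3
    band = layer_idx % 3
    start = band * third
    end = channels if band == 2 else start + third  # last band absorbs any remainder
    mask[start:end] = suppress
    return mask

# Usage: multiply into a (B, C, H, W) activation
# x = x * thirds_mask(x.shape[1], layer_idx).view(1, -1, 1, 1)
```

A soft factor keeps some gradient flowing through the suppressed band, consistent with the gradient-continuity emphasis above; substitute 0.0 for hard suppression.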
### Continuous Topology

Each layer samples a continuous manifold:

```python
t = layer_idx / (total_layers - 1)  # 0 → 1
twist_in_angle = t * π
twist_out_angle = -t * π
scales = scale_range[0] + t * scale_span
```

Adding layers = finer sampling of the same underlying structure.

## Checkpoints

Saved to: `checkpoints/{variant}_{dataset}/{timestamp}/`

```
├── config.json
├── best_accuracy.json
├── final_accuracy.json
├── checkpoints/
│   ├── checkpoint_epoch_0010.pt
│   ├── checkpoint_epoch_0010.safetensors
│   ├── best_model.pt
│   ├── best_model.safetensors
│   ├── final_model.pt
│   └── final_model.safetensors
└── tensorboard/
```

## TensorBoard

Monitor training:

```bash
tensorboard --logdir ./outputs/checkpoints
```

Tracks:

- Loss, train/val accuracy
- Per-layer lens parameters (omega, alpha, twist angles, L/M/R weights)
- Residual weights
- Weight histograms

## Data Setup

### Tiny ImageNet

```bash
wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
unzip tiny-imagenet-200.zip -d ./data/
```

## License

Apache 2.0

## Citation

```bibtex
@misc{mobiusnet2026,
  title={MobiusNet: Wave-Based Topological Vision Architecture},
  author={AbstractPhil},
  year={2026},
  url={https://huggingface.co/AbstractPhil/mobiusnet}
}
```