Infinite-World
Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory
Ruiqi Wu1,2,3*, Xuanhua He4,2*, Meng Cheng2*, Tianyu Yang2, Yong Zhang2‡, Chunle Guo1,3†, Chongyi Li1,3, Ming-Ming Cheng1,3
1Nankai University 2Meituan 3NKIARI 4HKUST
*Equal Contribution †Corresponding Author ‡Project Leader
Highlights
Infinite-World is a robust interactive world model with:
- Real-World Training — Trained on real-world videos without requiring perfect pose annotations or synthetic data
- 1000+ Frame Memory — Maintains coherent visual memory over 1000+ frames via Hierarchical Pose-free Memory Compressor (HPMC)
- Robust Action Control — Uncertainty-aware action labeling ensures accurate action-response learning from noisy trajectories
Installation
Environment: Python 3.10, CUDA 12.4 recommended.
1. Create conda environment
conda create -n infworld python=3.10
conda activate infworld
2. Install PyTorch with CUDA 12.4
Install from the official PyTorch index (no local wheel file is needed):
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
3. Install Python dependencies
pip install -r requirements.txt
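After the three steps above, it can be useful to confirm that the pinned versions were actually installed. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_environment` are hypothetical, and the only facts taken from this README are the version pins for `torch` and `torchvision`.

```python
import sys
from importlib import metadata

# Version pins copied from the install command above.
EXPECTED = {"torch": "2.6.0", "torchvision": "0.21.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Compare installed package versions against the expected pins."""
    report = {"python": "%d.%d" % sys.version_info[:2]}
    for package, wanted in expected.items():
        found = installed_version(package)
        # CUDA wheels usually carry a local suffix such as "+cu124";
        # strip it before comparing against the pin.
        base = found.split("+")[0] if found else None
        report[package] = {"found": found, "ok": base == wanted}
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(name, status)
```

If both packages report `ok`, `torch.cuda.is_available()` can then be called in a Python shell to check that the GPU is visible.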
Checkpoint Configuration
All model paths are configured in configs/infworld_config.yaml. Paths are relative to the project root unless absolute.
Download checkpoints
Download from Wan-AI/Wan2.1-T2V-1.3B and place files under checkpoints/:
| File / directory | Config key | Description |
|---|---|---|
| models/Wan2.1_VAE.pth | vae_cfg.vae_pth | VAE weights |
| models/models_t5_umt5-xxl-enc-bf16.pth | text_encoder_cfg.checkpoint_path | T5 text encoder |
| models/google/umt5-xxl (folder) | text_encoder_cfg.tokenizer_path | T5 tokenizer |
| infinite_world_model.ckpt | checkpoint_path | DiT model weights |
- DiT checkpoint: Can be downloaded from TBD.
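The rule that paths are relative to the project root unless absolute can be sketched as below, together with a check for checkpoints that are still missing. This is an illustration under assumptions: `EXAMPLE_CONFIG` is a hypothetical in-memory stand-in for configs/infworld_config.yaml, and only the four key names come from the table. The nesting, the path values, and the helper names `lookup`, `resolve`, and `missing_checkpoints` are assumed.

```python
from pathlib import Path

# Hypothetical mirror of the YAML config; values and nesting are assumptions.
EXAMPLE_CONFIG = {
    "vae_cfg": {"vae_pth": "checkpoints/models/Wan2.1_VAE.pth"},
    "text_encoder_cfg": {
        "checkpoint_path": "checkpoints/models/models_t5_umt5-xxl-enc-bf16.pth",
        "tokenizer_path": "checkpoints/models/google/umt5-xxl",
    },
    "checkpoint_path": "checkpoints/infinite_world_model.ckpt",
}

# Config keys listed in the table above, in dotted form.
CHECKPOINT_KEYS = [
    "vae_cfg.vae_pth",
    "text_encoder_cfg.checkpoint_path",
    "text_encoder_cfg.tokenizer_path",
    "checkpoint_path",
]


def lookup(config, dotted_key):
    """Walk a nested dict using a dotted key such as 'vae_cfg.vae_pth'."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(path, project_root):
    """Keep absolute paths as-is; join relative ones onto the project root."""
    candidate = Path(path)
    return candidate if candidate.is_absolute() else Path(project_root) / candidate


def missing_checkpoints(config, project_root, keys=CHECKPOINT_KEYS):
    """Return (key, resolved_path) pairs whose target does not exist yet."""
    missing = []
    for key in keys:
        target = resolve(lookup(config, key), project_root)
        if not target.exists():
            missing.append((key, target))
    return missing
```

Calling `missing_checkpoints(EXAMPLE_CONFIG, ".")` from the project root before inference lists any file or folder that still needs to be downloaded.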
Upload to Hugging Face (including checkpoints)
To upload this repo to Hugging Face Hub (code + checkpoints/):
Login
pip install huggingface_hub
huggingface-cli login
Use a token from https://huggingface.co/settings/tokens (need write permission).
Upload
From the project root (infinite-world/):
python scripts/upload_to_hf.py YOUR_USERNAME/infinite-world
Or set the repo and run:
export HF_REPO_ID=YOUR_USERNAME/infinite-world
python scripts/upload_to_hf.py
The script uploads the whole directory (including checkpoints/) and skips __pycache__, outputs, .git, etc. Large checkpoint files are uploaded via the Hub API; the first run may take a while depending on size and network.
Create repo manually (optional)
You can create the model repo first at https://huggingface.co/new (type: Model), then run the script with that repo_id.
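For readers who want to see how such an upload could work, here is a minimal sketch built on `huggingface_hub`. It is not the contents of scripts/upload_to_hf.py, which this README does not show. The function names, the exact ignore patterns, and the order of precedence between the command-line argument and `HF_REPO_ID` are all assumptions.

```python
import os
import sys

# Folders named in the text above; the glob forms are an assumption.
IGNORE_PATTERNS = ["*__pycache__*", "outputs/*", ".git/*"]


def resolve_repo_id(argv, environ):
    """Pick the repo id from the first CLI argument, else from HF_REPO_ID."""
    if len(argv) > 1 and argv[1]:
        return argv[1]
    repo_id = environ.get("HF_REPO_ID")
    if not repo_id:
        raise SystemExit("usage: pass USER/REPO as an argument or set HF_REPO_ID")
    return repo_id


def upload(repo_id, folder="."):
    """Create the model repo if needed, then upload the folder contents."""
    # Imported lazily so resolve_repo_id works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(
        repo_id=repo_id,
        folder_path=folder,
        repo_type="model",
        ignore_patterns=IGNORE_PATTERNS,
    )


def main(argv=None, environ=None):
    """Entry point: resolve the target repo, then upload the current directory."""
    argv = sys.argv if argv is None else argv
    environ = os.environ if environ is None else environ
    upload(resolve_repo_id(argv, environ))
```

Passing `exist_ok=True` to `create_repo` lets the same code handle both a repo created manually beforehand and one that does not exist yet.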
Results
Quantitative Comparison
| Model | Mot. Smo.↑ | Dyn. Deg.↑ | Aes. Qual.↑ | Img. Qual.↑ | Avg. Score↑ | Memory↓ | Fidelity↓ | Action↓ | ELO Rating↑ |
|---|---|---|---|---|---|---|---|---|---|
| Hunyuan-GameCraft | 0.9855 | 0.9896 | 0.5380 | 0.6010 | 0.7785 | 2.67 | 2.49 | 2.56 | 1311 |
| Matrix-Game 2.0 | 0.9788 | 1.0000 | 0.5267 | 0.7215 | 0.8068 | 2.98 | 2.91 | 1.78 | 1432 |
| Yume 1.5 | 0.9861 | 0.9896 | 0.5840 | 0.6969 | 0.8141 | 2.43 | 1.91 | 2.47 | 1495 |
| HY-World-1.5 | 0.9905 | 1.0000 | 0.5280 | 0.6611 | 0.7949 | 2.59 | 2.78 | 1.50 | 1542 |
| Infinite-World | 0.9876 | 1.0000 | 0.5440 | 0.7159 | 0.8119 | 1.92 | 1.67 | 1.54 | 1719 |
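The table does not say how Avg. Score is computed. The reported values appear consistent with an unweighted mean of the four preceding columns (Mot. Smo., Dyn. Deg., Aes. Qual., Img. Qual.). The sketch below checks that reading against the table; the helper names are hypothetical and the interpretation is an inference from the numbers, not a statement from the authors.

```python
# Each row: (Mot. Smo., Dyn. Deg., Aes. Qual., Img. Qual., reported Avg. Score),
# copied from the table above.
ROWS = {
    "Hunyuan-GameCraft": (0.9855, 0.9896, 0.5380, 0.6010, 0.7785),
    "Matrix-Game 2.0": (0.9788, 1.0000, 0.5267, 0.7215, 0.8068),
    "Yume 1.5": (0.9861, 0.9896, 0.5840, 0.6969, 0.8141),
    "HY-World-1.5": (0.9905, 1.0000, 0.5280, 0.6611, 0.7949),
    "Infinite-World": (0.9876, 1.0000, 0.5440, 0.7159, 0.8119),
}


def mean_of_metrics(row):
    """Unweighted mean of the four quality metrics in a row."""
    *metrics, _reported = row
    return sum(metrics) / len(metrics)


def max_deviation(rows=ROWS):
    """Largest gap between the recomputed mean and the reported average."""
    return max(abs(mean_of_metrics(row) - row[-1]) for row in rows.values())


if __name__ == "__main__":
    for name, row in ROWS.items():
        print(f"{name}: recomputed {mean_of_metrics(row):.5f}, reported {row[-1]:.4f}")
```

For every row the recomputed mean stays within 1e-4 of the reported value, which is what rounding to four decimals would produce.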
Citation
If you find this work useful, please consider citing:
@article{wu2026infiniteworld,
title={Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory},
author={Wu, Ruiqi and He, Xuanhua and others},
journal={arXiv preprint arXiv:2602.02393},
year={2026}
}
License
This project is released under the MIT License.