DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
DiT360 is a framework for high-quality panoramic image generation, leveraging both perspective and panoramic data in a hybrid training scheme. It adopts a two-level strategy (image-level cross-domain guidance and token-level hybrid supervision) to enhance perceptual realism and geometric fidelity.
Abstract
In this work, we propose DiT360, a DiT-based framework that performs hybrid training on perspective and panoramic data for panoramic image generation. We attribute the difficulty of maintaining geometric fidelity and photorealism mainly to the lack of large-scale, high-quality, real-world panoramic data; this data-centric view differs from prior methods that focus on model design. DiT360 has several key modules for inter-domain transformation and intra-domain augmentation, applied at both the pre-VAE image level and the post-VAE token level. At the image level, we incorporate cross-domain knowledge through perspective image guidance and panoramic refinement, which enhance perceptual quality while regularizing diversity and photorealism. At the token level, hybrid supervision is applied across multiple modules, which include circular padding for boundary continuity, yaw loss for rotational robustness, and cube loss for distortion awareness. Extensive experiments on text-to-panorama, inpainting, and outpainting tasks demonstrate that our method achieves better boundary consistency and image fidelity across eleven quantitative metrics. Our code is available at: https://github.com/Insta360-Research-Team/DiT360.
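The abstract names circular padding, a yaw loss, and a cube loss without defining them here. The sketch below is not the DiT360 implementation; it only illustrates, in plain Python on a toy grid, the geometric operations that such components plausibly rely on for equirectangular panoramas: wrapping columns across the left/right seam, treating a yaw rotation as a circular column shift, and mapping a 3D viewing direction (as used when sampling cube faces) to panorama coordinates. All function names and the axis convention (+z forward, +x right, +y up) are assumptions made for illustration.

```python
import math


def circular_pad(grid, pad):
    """Pad each row by wrapping columns around the left/right seam.

    Assumes 0 <= pad <= row width.
    """
    if pad == 0:
        return [list(row) for row in grid]
    return [list(row[-pad:]) + list(row) + list(row[:pad]) for row in grid]


def yaw_rotate(grid, degrees):
    """Rotate a panorama about the vertical axis via a circular column shift."""
    width = len(grid[0])
    shift = round(degrees / 360.0 * width) % width
    return [list(row[shift:]) + list(row[:shift]) for row in grid]


def direction_to_equirect(x, y, z, width, height):
    """Map a 3D viewing direction to (column, row) in an equirectangular image.

    Assumed convention: +z forward, +x right, +y up, with longitude 0
    at the horizontal centre of the image.
    """
    lon = math.atan2(x, z)                  # longitude in [-pi, pi]
    lat = math.atan2(y, math.hypot(x, z))   # latitude in [-pi/2, pi/2]
    col = (lon / (2 * math.pi) + 0.5) * width
    row = (0.5 - lat / math.pi) * height
    return col, row


# Toy 2x8 "panorama"; each value stands in for a pixel or latent token.
pano = [
    [0, 1, 2, 3, 4, 5, 6, 7],
    [10, 11, 12, 13, 14, 15, 16, 17],
]

padded = circular_pad(pano, 2)    # columns 6,7 appear on the left, 0,1 on the right
rotated = yaw_rotate(pano, 90)    # a quarter turn shifts by width / 4 columns
centre = direction_to_equirect(0.0, 0.0, 1.0, width=8, height=4)
```

Because the left and right edges of an equirectangular image are adjacent on the sphere, wrapping rather than zero-padding gives a convolution or attention window the true neighbours at the seam. In a tensor framework the same operations would typically be expressed with a circular padding mode and a roll along the width axis.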
Installation
Clone the repo first:
git clone https://github.com/Insta360-Research-Team/DiT360.git
cd DiT360
(Optional) Create a fresh conda env:
conda create -n dit360 python=3.12
conda activate dit360
Install necessary packages (torch > 2):
# PyTorch (select the build matching your CUDA version; tested with torch==2.6.0 and torchvision==0.21.0)
pip install torch==2.6.0 torchvision==0.21.0
# other dependencies
pip install -r requirements.txt
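After installing, it can be useful to confirm that the environment matches the versions the authors report testing (torch 2.6.0, torchvision 0.21.0). The following is a minimal sketch using only the standard library; the helper names `parse_version`, `installed_version`, and `check` are hypothetical and not part of the DiT360 repository.

```python
from importlib import metadata


def parse_version(text):
    """Return the leading numeric components, e.g. '2.6.0+cu124' -> (2, 6, 0)."""
    core = text.split("+")[0]  # drop a local build tag such as '+cu124'
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check(package, tested):
    """Describe how the installed version compares with the tested one."""
    found = installed_version(package)
    if found is None:
        return f"{package}: not installed"
    same = parse_version(found)[:3] == parse_version(tested)[:3]
    status = "matches" if same else "differs from"
    return f"{package}: {found} ({status} tested version {tested})"


if __name__ == "__main__":
    # Versions taken from the installation notes above.
    for name, tested in (("torch", "2.6.0"), ("torchvision", "0.21.0")):
        print(check(name, tested))
```

A differing version is not necessarily a problem, since the notes only require torch newer than 2; the check simply makes a mismatch visible before running inference.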
Quick Start
python inference.py
Acknowledgement
We are grateful to the following open-source projects:
Citation
@misc{dit360,
title={DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training},
author={Haoran Feng and Dizhe Zhang and Xiangtai Li and Bo Du and Lu Qi},
year={2025},
eprint={2510.11712},
archivePrefix={arXiv},
}
More details and future updates can be found on our GitHub repository: https://github.com/Insta360-Research-Team/DiT360
Model: Insta360-Research/DiT360-Panorama-Image-Generation
Base model: black-forest-labs/FLUX.1-dev