---
title: HunyuanWorld Demo
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: other
models:
- black-forest-labs/FLUX.1-dev
- tencent/HunyuanWorld-1
hardware: nvidia-t4-small
---
# HunyuanWorld-1.0 Demo Space

This is a Gradio demo for [Tencent-Hunyuan/HunyuanWorld-1.0](https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0), a one-stop solution for text-driven 3D scene generation.
## How to Use

1. **Panorama Generation**:
   - **Text-to-Panorama**: Enter a text prompt and generate a 360° panorama image.
   - **Image-to-Panorama**: Upload an image and provide a prompt to extend it into a panorama.
2. **Scene Generation**:
   - After generating a panorama, click "Send to Scene Generation".
   - Provide labels for foreground objects to be separated into layers.
   - Click "Generate 3D Scene" to create a 3D mesh from the panorama.
## Technical Details

This space combines two core functionalities of the HunyuanWorld-1.0 model:
- **Panorama Generation**: Creates immersive 360° images from text or existing images.
- **3D Scene Reconstruction**: Decomposes a panorama into layers and reconstructs a 3D mesh.

This demo is running on an NVIDIA T4 GPU. Due to the size of the models, the initial startup may take a few minutes.
<p align="left"> | |
<img src="assets/arch.jpg"> | |
</p> | |
### Performance

We evaluated HunyuanWorld 1.0 against other open-source panorama generation and 3D world generation methods. The numerical results indicate that HunyuanWorld 1.0 surpasses the baselines in visual quality and geometric consistency.
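For reference, the CLIP-T and CLIP-I columns measure prompt-image (or image-image) alignment in CLIP embedding space, while BRISQUE, NIQE, and Q-Align are no-reference image quality scores. The sketch below shows one common way a CLIP-T style score can be computed with the `transformers` CLIP model; the exact variant and scaling used for the numbers in these tables are assumptions here.

```python
# Sketch of a CLIP-T style score: cosine similarity between prompt and panorama in CLIP space.
# The CLIP variant and any scaling used for the reported numbers are assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("test_results/case7/panorama.png")  # example path
prompt = "At the moment of glacier collapse, giant ice walls collapse and create waves"

inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

score = torch.cosine_similarity(out.text_embeds, out.image_embeds).item()
print(f"CLIP text-image similarity: {score:.3f}")
```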
<p align="center"> | |
Text-to-panorama generation | |
</p> | |
| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-T($\uparrow$) | | |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ | | |
| Diffusion360 | 69.5 | 7.5 | 1.8 | 20.9 | | |
| MVDiffusion | 47.9 | 7.1 | 2.4 | 21.5 | | |
| PanFusion | 56.6 | 7.6 | 2.2 | 21.0 | | |
| LayerPano3D | 49.6 | 6.5 | 3.7 | 21.5 | | |
| HunyuanWorld 1.0 | 40.8 | 5.8 | 4.4 | 24.3 | | |
<p align="center"> | |
Image-to-panorama generation | |
</p> | |
| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-I($\uparrow$) | | |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ | | |
| Diffusion360 | 71.4 | 7.8 | 1.9 | 73.9 | | |
| MVDiffusion | 47.7 | 7.0 | 2.7 | 80.8 | | |
| HunyuanWorld 1.0 | 45.2 | 5.8 | 4.3 | 85.1 | | |
<p align="center"> | |
Text-to-world generation | |
</p> | |
| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-T($\uparrow$) | | |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ | | |
| Director3D | 49.8 | 7.5 | 3.2 | 23.5 | | |
| LayerPano3D | 35.3 | 4.8 | 3.9 | 22.0 | | |
| HunyuanWorld 1.0 | 34.6 | 4.3 | 4.2 | 24.0 | | |
<p align="center"> | |
Image-to-world generation | |
</p> | |
| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-I($\uparrow$) | | |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ | | |
| WonderJourney | 51.8 | 7.3 | 3.2 | 81.5 | | |
| DimensionX | 45.2 | 6.3 | 3.5 | 83.3 | | |
| HunyuanWorld 1.0 | 36.2 | 4.6 | 3.9 | 84.5 | | |
#### 360° immersive and explorable 3D worlds generated by HunyuanWorld 1.0

<p align="left">
<img src="assets/panorama1.gif">
</p>
<p align="left">
<img src="assets/panorama2.gif">
</p>
<p align="left">
<img src="assets/roaming_world.gif">
</p>
## Models Zoo

The open-source version of HunyuanWorld 1.0 is based on FLUX, and the method can easily be adapted to other image generation models such as Hunyuan Image, Kontext, and Stable Diffusion.
| Model | Description | Date | Size | Huggingface |
|--------------------------------|-----------------------------|------------|-------|------------------------------------------------------------------------------------------------------|
| HunyuanWorld-PanoDiT-Text | Text to Panorama Model | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoDiT-Text) |
| HunyuanWorld-PanoDiT-Image | Image to Panorama Model | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoDiT-Image) |
| HunyuanWorld-PanoInpaint-Scene | PanoInpaint Model for scene | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoInpaint-Scene) |
| HunyuanWorld-PanoInpaint-Sky | PanoInpaint Model for sky | 2025-07-26 | 120MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoInpaint-Sky) |
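
If you want to fetch one of these checkpoints ahead of time (for example to warm a cache before launching the demo), `huggingface_hub` can download a single subfolder of the model repo. This is a minimal sketch; the demo scripts handle downloading themselves, so this step is optional, and a login token is needed if the repo is gated.

```python
# Optional sketch: pre-download one checkpoint subfolder from the table above.
# Uses huggingface_hub; run `huggingface-cli login` first (or pass token=...) if the repo requires it.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="tencent/HunyuanWorld-1",
    allow_patterns=["HunyuanWorld-PanoDiT-Text/*"],  # limit to the text-to-panorama weights
)
print("Checkpoint files cached at:", local_dir)
```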
## Get Started with HunyuanWorld 1.0

Follow the steps below to use HunyuanWorld 1.0:
### Environment Construction

We test our model with Python 3.10 and PyTorch 2.5.0+cu124.
```bash
git clone https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0.git
cd HunyuanWorld-1.0
conda env create -f docker/HunyuanWorld.yaml

# Install Real-ESRGAN
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install basicsr-fixed
pip install facexlib
pip install gfpgan
pip install -r requirements.txt
python setup.py develop

# Install ZIM (ZIM Anything) and download its checkpoints from the ZIM project page
cd ..
git clone https://github.com/naver-ai/ZIM.git
cd ZIM; pip install -e .
mkdir zim_vit_l_2092
cd zim_vit_l_2092
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/encoder.onnx
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/decoder.onnx

# To export meshes in Draco format, install Draco first
cd ../..
git clone https://github.com/google/draco.git
cd draco
mkdir build
cd build
cmake ..
make
sudo make install

# Log in to your Hugging Face account
cd ../..
huggingface-cli login --token $HUGGINGFACE_TOKEN
```
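
A quick way to confirm the environment is usable before running the demos is to check that PyTorch sees the GPU. This is just a sanity-check snippet, not part of the official setup.

```python
# Sanity check inside the HunyuanWorld conda environment (not part of the official setup).
import torch

print("PyTorch version:", torch.__version__)         # expected 2.5.0+cu124 per the setup above
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```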
### Code Usage

For Image to World generation, you can use the following commands:

```bash
# First, generate a panorama image from an input image.
python3 demo_panogen.py --prompt "" --image_path examples/case2/input.png --output_path test_results/case2
# Second, use this panorama image to create a world scene with HunyuanWorld 1.0.
# You can indicate the foreground object labels you want to separate into layers with the
# --labels_fg1 and --labels_fg2 arguments, e.g. --labels_fg1 sculptures flowers --labels_fg2 tree mountains
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/case2/panorama.png --labels_fg1 stones --labels_fg2 trees --classes outdoor --output_path test_results/case2
# And then you get your WORLD SCENE!
```
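
Once `demo_scenegen.py` finishes, the scene is written to the chosen `--output_path` as mesh files. As a purely illustrative way to inspect a generated layer outside the web viewer, you can load it with `trimesh`; the file name below is hypothetical, so check the output directory for the actual names (Draco-compressed files need to be decoded first).

```python
# Illustrative sketch: inspect a generated layer mesh with trimesh.
# The file name is hypothetical; use the actual files written to --output_path.
import trimesh

mesh = trimesh.load("test_results/case2/mesh_layer0.ply", force="mesh")  # hypothetical file name
print("vertices:", len(mesh.vertices), "faces:", len(mesh.faces))
mesh.show()  # opens an interactive preview window if a viewer backend is installed
```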
For Text to World generation, you can use the following commands:

```bash
# First, generate a panorama image from a prompt.
python3 demo_panogen.py --prompt "At the moment of glacier collapse, giant ice walls collapse and create waves, with no wildlife, captured in a disaster documentary" --output_path test_results/case7
# Second, use this panorama image to create a world scene with HunyuanWorld 1.0.
# You can indicate the foreground object labels you want to separate into layers with the
# --labels_fg1 and --labels_fg2 arguments, e.g. --labels_fg1 sculptures flowers --labels_fg2 tree mountains
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/case7/panorama.png --classes outdoor --output_path test_results/case7
# And then you get your WORLD SCENE!
```
### Quick Start

We provide more examples in ```examples```; you can simply run the following for a quick start:

```bash
bash scripts/test.sh
```
### 3D World Viewer

We provide a ModelViewer tool for quick visualization of your generated 3D world in the web browser.

Just open ```modelviewer.html``` in your browser, upload the generated 3D scene files, and enjoy the real-time viewing experience.

<p align="left">
<img src="assets/quick_look.gif">
</p>

Due to hardware limitations, certain scenes may fail to load.
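
If your browser restricts loading local files directly from disk, one workaround is to serve the repository over a local HTTP server and open the viewer from there. The snippet below is a generic sketch using Python's standard library, not part of the HunyuanWorld tooling.

```python
# Generic sketch (not part of HunyuanWorld): serve the repo locally so modelviewer.html
# can be opened at http://localhost:8000/modelviewer.html
import http.server
import socketserver

PORT = 8000  # any free port works
with socketserver.TCPServer(("", PORT), http.server.SimpleHTTPRequestHandler) as httpd:
    print(f"Serving at http://localhost:{PORT}/modelviewer.html")
    httpd.serve_forever()
```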
## Open-Source Plan

- [x] Inference Code
- [x] Model Checkpoints
- [x] Technical Report
- [ ] TensorRT Version
- [ ] RGBD Video Diffusion
## BibTeX

```
@misc{hunyuanworld2025tencent,
    title={HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels},
    author={Tencent Hunyuan3D Team},
    year={2025},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```
## Acknowledgements

We would like to thank the contributors to the [Stable Diffusion](https://github.com/Stability-AI/stablediffusion), [FLUX](https://github.com/black-forest-labs/flux), [diffusers](https://github.com/huggingface/diffusers), [HuggingFace](https://huggingface.co), [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN), [ZIM](https://github.com/naver-ai/ZIM), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), [MoGe](https://github.com/microsoft/moge), [Worldsheet](https://worldsheet.github.io/), and [WorldGen](https://github.com/ZiYang-xie/WorldGen) repositories for their open research.