---
title: HunyuanWorld Demo
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: other
models:
- black-forest-labs/FLUX.1-dev
- tencent/HunyuanWorld-1
hardware: nvidia-t4-small
---
# HunyuanWorld-1.0 Demo Space
This is a Gradio demo for [Tencent-Hunyuan/HunyuanWorld-1.0](https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0), a one-stop solution for text-driven 3D scene generation.
## How to Use
1. **Panorama Generation**:
- **Text-to-Panorama**: Enter a text prompt and generate a 360° panorama image.
- **Image-to-Panorama**: Upload an image and provide a prompt to extend it into a panorama.
2. **Scene Generation**:
- After generating a panorama, click "Send to Scene Generation".
- Provide labels for foreground objects to be separated into layers.
- Click "Generate 3D Scene" to create a 3D mesh from the panorama.
## Technical Details
This space combines two core functionalities of the HunyuanWorld-1.0 model:
- **Panorama Generation**: Creates immersive 360° images from text or existing images.
- **3D Scene Reconstruction**: Decomposes a panorama into layers and reconstructs a 3D mesh.
This demo is running on an NVIDIA T4 GPU. Due to the size of the models, the initial startup may take a few minutes.
<p align="left">
<img src="assets/arch.jpg">
</p>
### Performance
We evaluated HunyuanWorld 1.0 against other open-source panorama generation and 3D world generation methods. In the tables below, lower is better for BRISQUE and NIQE, and higher is better for Q-Align, CLIP-T, and CLIP-I. The numerical results indicate that HunyuanWorld 1.0 surpasses the baselines in visual quality and geometric consistency.
<p align="center">
Text-to-panorama generation
</p>
| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-T($\uparrow$) |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ |
| Diffusion360 | 69.5 | 7.5 | 1.8 | 20.9 |
| MVDiffusion | 47.9 | 7.1 | 2.4 | 21.5 |
| PanFusion | 56.6 | 7.6 | 2.2 | 21.0 |
| LayerPano3D | 49.6 | 6.5 | 3.7 | 21.5 |
| HunyuanWorld 1.0 | 40.8 | 5.8 | 4.4 | 24.3 |
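To make the comparison concrete, the following minimal Python sketch takes the text-to-panorama scores from the table above and computes, for each metric, the best baseline and the relative gain of HunyuanWorld 1.0 over it. The helper names (`best_baseline`, `relative_gain`) are illustrative only and are not part of the HunyuanWorld codebase.

```python
# Scores copied from the text-to-panorama table above.
# Column order: BRISQUE, NIQE, Q-Align, CLIP-T.
METRICS = ("BRISQUE", "NIQE", "Q-Align", "CLIP-T")
LOWER_IS_BETTER = {"BRISQUE": True, "NIQE": True, "Q-Align": False, "CLIP-T": False}

SCORES = {
    "Diffusion360": (69.5, 7.5, 1.8, 20.9),
    "MVDiffusion": (47.9, 7.1, 2.4, 21.5),
    "PanFusion": (56.6, 7.6, 2.2, 21.0),
    "LayerPano3D": (49.6, 6.5, 3.7, 21.5),
    "HunyuanWorld 1.0": (40.8, 5.8, 4.4, 24.3),
}


def best_baseline(metric, ours="HunyuanWorld 1.0"):
    """Return (name, score) of the strongest method other than `ours`."""
    idx = METRICS.index(metric)
    baselines = {name: row[idx] for name, row in SCORES.items() if name != ours}
    pick = min if LOWER_IS_BETTER[metric] else max
    name = pick(baselines, key=baselines.get)
    return name, baselines[name]


def relative_gain(metric, ours="HunyuanWorld 1.0"):
    """Improvement over the best baseline as a fraction of its score (positive = better)."""
    idx = METRICS.index(metric)
    _, base = best_baseline(metric, ours)
    value = SCORES[ours][idx]
    diff = base - value if LOWER_IS_BETTER[metric] else value - base
    return diff / base


for metric in METRICS:
    name, score = best_baseline(metric)
    print(f"{metric}: best baseline {name} ({score}), relative gain {relative_gain(metric):.1%}")
```

The same two helpers work for the other tables if `SCORES` and the last metric name are swapped for the corresponding rows.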
<p align="center">
Image-to-panorama generation
</p>
| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-I($\uparrow$) |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ |
| Diffusion360 | 71.4 | 7.8 | 1.9 | 73.9 |
| MVDiffusion | 47.7 | 7.0 | 2.7 | 80.8 |
| HunyuanWorld 1.0 | 45.2 | 5.8 | 4.3 | 85.1 |
<p align="center">
Text-to-world generation
</p>
| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-T($\uparrow$) |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ |
| Director3D | 49.8 | 7.5 | 3.2 | 23.5 |
| LayerPano3D | 35.3 | 4.8 | 3.9 | 22.0 |
| HunyuanWorld 1.0 | 34.6 | 4.3 | 4.2 | 24.0 |
<p align="center">
Image-to-world generation
</p>
| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-I($\uparrow$) |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ |
| WonderJourney | 51.8 | 7.3 | 3.2 | 81.5 |
| DimensionX | 45.2 | 6.3 | 3.5 | 83.3 |
| HunyuanWorld 1.0 | 36.2 | 4.6 | 3.9 | 84.5 |
#### 360° immersive and explorable 3D worlds generated by HunyuanWorld 1.0:
<p align="left">
<img src="assets/panorama1.gif">
</p>
<p align="left">
<img src="assets/panorama2.gif">
</p>
<p align="left">
<img src="assets/roaming_world.gif">
</p>
## Models Zoo
The open-source version of HunyuanWorld 1.0 is based on FLUX, and the method can be easily adapted to other image generation models such as Hunyuan Image, Kontext, and Stable Diffusion.
| Model | Description | Date | Size | Huggingface |
|--------------------------------|-----------------------------|------------|-------|----------------------------------------------------------------------------------------------------|
| HunyuanWorld-PanoDiT-Text | Text to Panorama Model | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoDiT-Text) |
| HunyuanWorld-PanoDiT-Image | Image to Panorama Model | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoDiT-Image) |
| HunyuanWorld-PanoInpaint-Scene | PanoInpaint Model for scene | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoInpaint-Scene) |
| HunyuanWorld-PanoInpaint-Sky | PanoInpaint Model for sky | 2025-07-26 | 120MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoInpaint-Sky) |
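If you want to fetch a single checkpoint ahead of time, one option is the `snapshot_download` function from the `huggingface_hub` package. The sketch below is an assumption about how you might do that, not the project's documented procedure (the demo scripts may fetch weights on their own). The folder names are taken from the download links in the table; the `checkpoints` target directory is a hypothetical choice.

```python
REPO_ID = "tencent/HunyuanWorld-1"  # repository linked in the table above

# Sub-folders of the repository, as they appear in the download links.
CHECKPOINTS = (
    "HunyuanWorld-PanoDiT-Text",
    "HunyuanWorld-PanoDiT-Image",
    "HunyuanWorld-PanoInpaint-Scene",
    "HunyuanWorld-PanoInpaint-Sky",
)


def allow_patterns_for(checkpoint):
    """Build the glob pattern that restricts a download to one checkpoint folder."""
    if checkpoint not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint: {checkpoint!r}")
    return [f"{checkpoint}/*"]


def download_checkpoint(checkpoint, local_dir="checkpoints"):
    """Download one checkpoint folder; `local_dir` is a hypothetical target path."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=allow_patterns_for(checkpoint),
        local_dir=local_dir,
    )


# Example (requires network access and the huggingface_hub package):
# download_checkpoint("HunyuanWorld-PanoDiT-Text")
```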
## Get Started with HunyuanWorld 1.0
Follow the steps below to set up and run HunyuanWorld 1.0:
### Environment Setup
We test our model with Python 3.10 and PyTorch 2.5.0+cu124.
```bash
git clone https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0.git
cd HunyuanWorld-1.0
conda env create -f docker/HunyuanWorld.yaml
# Install Real-ESRGAN
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install basicsr-fixed
pip install facexlib
pip install gfpgan
pip install -r requirements.txt
python setup.py develop
# Install ZIM Anything and download its checkpoint from the ZIM project page
cd ..
git clone https://github.com/naver-ai/ZIM.git
cd ZIM; pip install -e .
mkdir zim_vit_l_2092
cd zim_vit_l_2092
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/encoder.onnx
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/decoder.onnx
# To export in Draco format, install Draco first
cd ../..
git clone https://github.com/google/draco.git
cd draco
mkdir build
cd build
cmake ..
make
sudo make install
# Log in with your own Hugging Face account
cd ../..
huggingface-cli login --token $HUGGINGFACE_TOKEN
```
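After the environment is created, a quick sanity check can confirm that the interpreter and PyTorch roughly match the tested versions stated above (Python 3.10, PyTorch 2.5.0+cu124). This is a minimal sketch; the function name `environment_report` is illustrative, and a mismatch does not necessarily mean the code will fail.

```python
import sys

TESTED_PYTHON = (3, 10)        # tested Python version, from this README
TESTED_TORCH_PREFIX = "2.5.0"  # tested PyTorch version (2.5.0+cu124), from this README


def environment_report():
    """Collect the facts worth checking before running the demo scripts."""
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "python_matches_tested": sys.version_info[:2] == TESTED_PYTHON,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is missing: the conda environment is probably not active.
        report.update(torch=None, torch_matches_tested=False, cuda_available=False)
    else:
        report.update(
            torch=torch.__version__,
            torch_matches_tested=torch.__version__.startswith(TESTED_TORCH_PREFIX),
            cuda_available=torch.cuda.is_available(),
        )
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```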
### Code Usage
For Image to World generation, you can use the following code:
```bash
# First, generate a panorama from an input image.
python3 demo_panogen.py --prompt "" --image_path examples/case2/input.png --output_path test_results/case2
# Second, use this panorama to create a world scene with HunyuanWorld 1.0.
# Use --labels_fg1 and --labels_fg2 to specify the foreground object labels to separate into layers,
# e.g. --labels_fg1 sculptures flowers --labels_fg2 tree mountains
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/case2/panorama.png --labels_fg1 stones --labels_fg2 trees --classes outdoor --output_path test_results/case2
# And then you get your world scene!
```
For Text to World generation, you can use the following code:
```bash
# First, generate a panorama from a text prompt.
python3 demo_panogen.py --prompt "At the moment of glacier collapse, giant ice walls collapse and create waves, with no wildlife, captured in a disaster documentary" --output_path test_results/case7
# Second, use this panorama to create a world scene with HunyuanWorld 1.0.
# Use --labels_fg1 and --labels_fg2 to specify the foreground object labels to separate into layers,
# e.g. --labels_fg1 sculptures flowers --labels_fg2 tree mountains
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/case7/panorama.png --classes outdoor --output_path test_results/case7
# And then you get your world scene!
```
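Both workflows share the same two-stage shape: `demo_panogen.py` writes `panorama.png` into the output path, and `demo_scenegen.py` reads it back. If you want to script several cases, a small Python sketch such as the one below can build the two command lines. It uses only the flags shown in the commands above; the function names are illustrative, and the sketch assumes it is run from the repository root.

```python
import shlex
from pathlib import PurePosixPath


def panogen_command(output_path, prompt="", image_path=None):
    """Stage 1: panorama from a prompt, optionally extended from an input image."""
    cmd = ["python3", "demo_panogen.py", "--prompt", prompt]
    if image_path is not None:
        cmd += ["--image_path", str(image_path)]
    cmd += ["--output_path", str(output_path)]
    return cmd


def scenegen_command(output_path, classes="outdoor", labels_fg1=(), labels_fg2=()):
    """Stage 2: world scene from the panorama written by stage 1."""
    panorama = PurePosixPath(output_path) / "panorama.png"
    cmd = ["python3", "demo_scenegen.py", "--image_path", str(panorama)]
    if labels_fg1:
        cmd += ["--labels_fg1", *labels_fg1]
    if labels_fg2:
        cmd += ["--labels_fg2", *labels_fg2]
    cmd += ["--classes", classes, "--output_path", str(output_path)]
    return cmd


# Image-to-world example, mirroring the case2 commands above.
stages = [
    panogen_command("test_results/case2", image_path="examples/case2/input.png"),
    scenegen_command("test_results/case2", labels_fg1=["stones"], labels_fg2=["trees"]),
]
for cmd in stages:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`; to pin the GPU as in the shell commands, pass an `env` mapping that sets `CUDA_VISIBLE_DEVICES`.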
### Quick Start
More examples are provided in ```examples```. For a quick start, run:
```bash
bash scripts/test.sh
```
### 3D World Viewer
We provide a ModelViewer tool for quickly visualizing your generated 3D world in a web browser.
Open ```modelviewer.html``` in your browser, upload the generated 3D scene files, and explore the scene in real time.
<p align="left">
<img src="assets/quick_look.gif">
</p>
Due to hardware limitations, certain scenes may fail to load.
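If you prefer to launch the viewer from a script, Python's standard `webbrowser` module can open the page for you. This is a small convenience sketch, assuming `modelviewer.html` sits in the repository root; the function names are illustrative.

```python
import webbrowser
from pathlib import Path


def viewer_url(repo_root="."):
    """Return a file:// URL for modelviewer.html (assumed to be in the repository root)."""
    return (Path(repo_root) / "modelviewer.html").resolve().as_uri()


def open_viewer(repo_root="."):
    """Open the viewer in the default browser and return the URL that was used."""
    url = viewer_url(repo_root)
    webbrowser.open(url)
    return url


# Example: open_viewer("path/to/HunyuanWorld-1.0")
```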
## Open-Source Plan
- [x] Inference Code
- [x] Model Checkpoints
- [x] Technical Report
- [ ] TensorRT Version
- [ ] RGBD Video Diffusion
## BibTeX
```
@misc{hunyuanworld2025tencent,
title={HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels},
author={Tencent Hunyuan3D Team},
year={2025},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
## Acknowledgements
We would like to thank the contributors to the [Stable Diffusion](https://github.com/Stability-AI/stablediffusion), [FLUX](https://github.com/black-forest-labs/flux), [diffusers](https://github.com/huggingface/diffusers), [HuggingFace](https://huggingface.co), [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN), [ZIM](https://github.com/naver-ai/ZIM), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), [MoGe](https://github.com/microsoft/moge), [Worldsheet](https://worldsheet.github.io/), [WorldGen](https://github.com/ZiYang-xie/WorldGen) repositories, for their open research.