---
title: HunyuanWorld Demo
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: other
models:
  - black-forest-labs/FLUX.1-dev
  - tencent/HunyuanWorld-1
hardware: nvidia-t4-small
---
# HunyuanWorld-1.0 Demo Space

This is a Gradio demo for Tencent-Hunyuan/HunyuanWorld-1.0, a one-stop solution for text-driven 3D scene generation.
## How to Use

- Panorama Generation:
  - Text-to-Panorama: Enter a text prompt and generate a 360° panorama image.
  - Image-to-Panorama: Upload an image and provide a prompt to extend it into a panorama.
- Scene Generation:
  - After generating a panorama, click "Send to Scene Generation".
  - Provide labels for foreground objects to be separated into layers.
  - Click "Generate 3D Scene" to create a 3D mesh from the panorama.
## Technical Details

This Space combines two core functionalities of the HunyuanWorld-1.0 model:

- Panorama Generation: Creates immersive 360° images from text or existing images.
- 3D Scene Reconstruction: Decomposes a panorama into layers and reconstructs a 3D mesh.

This demo runs on an NVIDIA T4 GPU. Due to the size of the models, the initial startup may take a few minutes.
## Performance

We evaluated HunyuanWorld 1.0 against other open-source panorama generation and 3D world generation methods. The numerical results indicate that HunyuanWorld 1.0 surpasses the baselines in visual quality and geometric consistency.
### Text-to-panorama generation

| Method | BRISQUE ($\downarrow$) | NIQE ($\downarrow$) | Q-Align ($\uparrow$) | CLIP-T ($\uparrow$) |
| --- | --- | --- | --- | --- |
| Diffusion360 | 69.5 | 7.5 | 1.8 | 20.9 |
| MVDiffusion | 47.9 | 7.1 | 2.4 | 21.5 |
| PanFusion | 56.6 | 7.6 | 2.2 | 21.0 |
| LayerPano3D | 49.6 | 6.5 | 3.7 | 21.5 |
| HunyuanWorld 1.0 | 40.8 | 5.8 | 4.4 | 24.3 |
### Image-to-panorama generation

| Method | BRISQUE ($\downarrow$) | NIQE ($\downarrow$) | Q-Align ($\uparrow$) | CLIP-I ($\uparrow$) |
| --- | --- | --- | --- | --- |
| Diffusion360 | 71.4 | 7.8 | 1.9 | 73.9 |
| MVDiffusion | 47.7 | 7.0 | 2.7 | 80.8 |
| HunyuanWorld 1.0 | 45.2 | 5.8 | 4.3 | 85.1 |
### Text-to-world generation

| Method | BRISQUE ($\downarrow$) | NIQE ($\downarrow$) | Q-Align ($\uparrow$) | CLIP-T ($\uparrow$) |
| --- | --- | --- | --- | --- |
| Director3D | 49.8 | 7.5 | 3.2 | 23.5 |
| LayerPano3D | 35.3 | 4.8 | 3.9 | 22.0 |
| HunyuanWorld 1.0 | 34.6 | 4.3 | 4.2 | 24.0 |
### Image-to-world generation

| Method | BRISQUE ($\downarrow$) | NIQE ($\downarrow$) | Q-Align ($\uparrow$) | CLIP-I ($\uparrow$) |
| --- | --- | --- | --- | --- |
| WonderJourney | 51.8 | 7.3 | 3.2 | 81.5 |
| DimensionX | 45.2 | 6.3 | 3.5 | 83.3 |
| HunyuanWorld 1.0 | 36.2 | 4.6 | 3.9 | 84.5 |
360° immersive and explorable 3D worlds generated by HunyuanWorld 1.0.
## Models Zoo
The open-source version of HunyuanWorld 1.0 is built on Flux; the method can easily be adapted to other image generation models such as Hunyuan Image, Kontext, and Stable Diffusion.
| Model | Description | Date | Size | Huggingface |
| --- | --- | --- | --- | --- |
| HunyuanWorld-PanoDiT-Text | Text-to-Panorama model | 2025-07-26 | 478MB | Download |
| HunyuanWorld-PanoDiT-Image | Image-to-Panorama model | 2025-07-26 | 478MB | Download |
| HunyuanWorld-PanoInpaint-Scene | PanoInpaint model for scenes | 2025-07-26 | 478MB | Download |
| HunyuanWorld-PanoInpaint-Sky | PanoInpaint model for sky | 2025-07-26 | 120MB | Download |
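The checkpoints above live in the `tencent/HunyuanWorld-1` repository on Hugging Face. As a sketch (the subfolder layout and the `fetch` helper are assumptions, not a repo-documented API), `huggingface_hub`'s `snapshot_download` can pull a single model's folder rather than the whole repo:

```python
# Sketch of fetching one HunyuanWorld checkpoint at a time; the subfolder
# names mirror the table above, but the exact repo layout is an assumption,
# so adjust the patterns if it differs.
MODELS = {
    "pano-text": "HunyuanWorld-PanoDiT-Text",
    "pano-image": "HunyuanWorld-PanoDiT-Image",
    "inpaint-scene": "HunyuanWorld-PanoInpaint-Scene",
    "inpaint-sky": "HunyuanWorld-PanoInpaint-Sky",
}

def allow_pattern(key: str) -> str:
    """Glob pattern restricting the download to one model's subfolder."""
    return f"{MODELS[key]}/*"

def fetch(key: str, local_dir: str = "ckpts") -> str:
    # Lazy import: huggingface_hub is a heavy dependency of the repo env.
    from huggingface_hub import snapshot_download
    return snapshot_download(
        repo_id="tencent/HunyuanWorld-1",
        allow_patterns=[allow_pattern(key)],
        local_dir=local_dir,
    )
```

Downloading only the subfolder you need keeps the demo's startup footprint small on limited hardware.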
## 🤗 Get Started with HunyuanWorld 1.0

Follow the steps below to set up and run HunyuanWorld 1.0.
### Environment construction

We tested the model with Python 3.10 and PyTorch 2.5.0+cu124.

```bash
git clone https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0.git
cd HunyuanWorld-1.0
conda env create -f docker/HunyuanWorld.yaml

# Install Real-ESRGAN
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install basicsr-fixed
pip install facexlib
pip install gfpgan
pip install -r requirements.txt
python setup.py develop

# Install ZIM and download its checkpoint (see the ZIM project page)
cd ..
git clone https://github.com/naver-ai/ZIM.git
cd ZIM; pip install -e .
mkdir zim_vit_l_2092
cd zim_vit_l_2092
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/encoder.onnx
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/decoder.onnx

# To export to the Draco format, install Draco first
cd ../..
git clone https://github.com/google/draco.git
cd draco
mkdir build
cd build
cmake ..
make
sudo make install

# Log in to your Hugging Face account
cd ../..
huggingface-cli login --token $HUGGINGFACE_TOKEN
```
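Once the environment is built, a quick stdlib-only sanity check can confirm the interpreter and PyTorch build roughly match the tested setup (Python 3.10, torch 2.5.0+cu124). This helper is my own sketch, not part of the repo:

```python
# Sanity-check sketch: verify the Python version and report the installed
# torch build before running the demos (not an official repo script).
import sys

def python_ok(version_info=sys.version_info, required=(3, 10)) -> bool:
    """True if the interpreter is at least the tested Python version."""
    return tuple(version_info[:2]) >= required

def report() -> str:
    lines = [f"python {sys.version.split()[0]}: "
             f"{'ok' if python_ok() else 'older than tested (3.10)'}"]
    try:
        import torch  # optional here; required by the demos themselves
        lines.append(f"torch {torch.__version__}, "
                     f"cuda available: {torch.cuda.is_available()}")
    except ImportError:
        lines.append("torch not installed")
    return "\n".join(lines)

if __name__ == "__main__":
    print(report())
```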
### Code Usage

For Image-to-World generation:

```bash
# First, generate a panorama image from an input image.
python3 demo_panogen.py --prompt "" --image_path examples/case2/input.png --output_path test_results/case2
# Second, use this panorama to create a world scene with HunyuanWorld 1.0.
# You can specify the foreground object labels to separate into layers
# with the --labels_fg1 and --labels_fg2 parameters,
# e.g. --labels_fg1 sculptures flowers --labels_fg2 tree mountains
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/case2/panorama.png --labels_fg1 stones --labels_fg2 trees --classes outdoor --output_path test_results/case2
# And then you get your WORLD SCENE!
```
For Text-to-World generation:

```bash
# First, generate a panorama image from a prompt.
python3 demo_panogen.py --prompt "At the moment of glacier collapse, giant ice walls collapse and create waves, with no wildlife, captured in a disaster documentary" --output_path test_results/case7
# Second, use this panorama to create a world scene with HunyuanWorld 1.0.
# You can specify the foreground object labels to separate into layers
# with the --labels_fg1 and --labels_fg2 parameters,
# e.g. --labels_fg1 sculptures flowers --labels_fg2 tree mountains
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/case7/panorama.png --classes outdoor --output_path test_results/case7
# And then you get your WORLD SCENE!
```
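The two-stage recipes above can also be driven from Python. The script names and flags below mirror the repo's demo commands, while the wrapper functions themselves are hypothetical helpers; building the argv lists as plain data makes them easy to inspect or log before running:

```python
# Hypothetical wrapper around the two-stage HunyuanWorld pipeline; the CLI
# flags come from demo_panogen.py / demo_scenegen.py as shown above.
from __future__ import annotations
import subprocess

def panogen_cmd(out_dir: str, prompt: str = "",
                image_path: str | None = None) -> list[str]:
    """Stage 1: text (or image) -> panorama."""
    cmd = ["python3", "demo_panogen.py", "--prompt", prompt,
           "--output_path", out_dir]
    if image_path:  # image-to-world conditions the panorama on an input image
        cmd += ["--image_path", image_path]
    return cmd

def scenegen_cmd(out_dir: str, classes: str = "outdoor",
                 labels_fg1: tuple[str, ...] = (),
                 labels_fg2: tuple[str, ...] = ()) -> list[str]:
    """Stage 2: panorama -> layered 3D scene."""
    cmd = ["python3", "demo_scenegen.py",
           "--image_path", f"{out_dir}/panorama.png",
           "--classes", classes, "--output_path", out_dir]
    if labels_fg1:
        cmd += ["--labels_fg1", *labels_fg1]
    if labels_fg2:
        cmd += ["--labels_fg2", *labels_fg2]
    return cmd

def generate_world(out_dir: str, prompt: str = "",
                   image_path: str | None = None, **scene_kwargs) -> None:
    """Run both stages back to back, stopping on the first failure."""
    subprocess.run(panogen_cmd(out_dir, prompt, image_path), check=True)
    subprocess.run(scenegen_cmd(out_dir, **scene_kwargs), check=True)
```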
### Quick Start

We provide more examples in `examples/`; for a quick start, simply run:

```bash
bash scripts/test.sh
```
### 3D World Viewer

We provide a ModelViewer tool for quick visualization of your generated 3D worlds in a web browser. Open `modelviewer.html` in your browser, upload the generated 3D scene files, and explore the scene in real time. Due to hardware limitations, certain scenes may fail to load.
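One practical note: `modelviewer.html` fetches local mesh assets, and many browsers block such requests on `file://` URLs, so serving the repo root over HTTP is a common workaround (not a documented step of the repo). The stdlib sketch below is equivalent to running `python3 -m http.server 8000` in the repo root:

```python
# Minimal static server for viewing modelviewer.html locally; equivalent to
# `python3 -m http.server 8000` (a common workaround, not a repo feature).
import http.server
import socketserver

def make_server(port: int = 8000,
                bind: str = "127.0.0.1") -> socketserver.TCPServer:
    """Create (but do not start) a static file server for the current dir."""
    return socketserver.TCPServer(
        (bind, port), http.server.SimpleHTTPRequestHandler)

if __name__ == "__main__":
    with make_server() as httpd:
        print("open http://127.0.0.1:8000/modelviewer.html")
        httpd.serve_forever()  # blocks; Ctrl-C to stop
```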
## Open-Source Plan

- Inference Code
- Model Checkpoints
- Technical Report
- TensorRT Version
- RGBD Video Diffusion
## BibTeX

```bibtex
@misc{hunyuanworld2025tencent,
    title={HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels},
    author={Tencent Hunyuan3D Team},
    year={2025},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```
## Acknowledgements

We would like to thank the contributors to the Stable Diffusion, FLUX, diffusers, HuggingFace, Real-ESRGAN, ZIM, GroundingDINO, MoGe, Worldsheet, and WorldGen repositories for their open research.