---
library_name: hunyuan3d-2
license: other
license_name: tencent-hunyuan-community
license_link: https://github.com/Tencent-Hunyuan/Hunyuan3D-Omni/blob/main/LICENSE
language:
- en
- zh
tags:
- image-to-3d
- text-to-3d
pipeline_tag: image-to-3d
extra_gated_eu_disallowed: true
---
# Hunyuan3D-Omni

Hunyuan3D-Omni is a unified framework for controllable generation of 3D assets. It inherits the architecture of Hunyuan3D 2.1 and, on top of it, builds a unified control encoder that introduces additional control signals, including point clouds, voxels, skeletons, and bounding boxes.
## Multi-Modal Conditional Control

- **Bounding Box Control**: generate 3D models constrained by 3D bounding boxes
- **Pose Control**: create 3D human models with specific skeletal poses
- **Point Cloud Control**: generate 3D models guided by input point clouds
- **Voxel Control**: create 3D models from voxel representations
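The exact input formats for these signals are defined by the repository's data pipeline; as a rough illustration only, the four modalities are commonly represented as arrays like the following (all shapes and variable names here are assumptions, not this repo's API):

```python
import numpy as np

# Hypothetical shapes for illustration only -- consult the repo's
# data pipeline for the formats inference.py actually expects.

# Point cloud: N surface points, one XYZ coordinate per point.
point_cloud = np.random.rand(2048, 3).astype(np.float32)

# Voxel grid: binary occupancy on a fixed-resolution 3D lattice.
voxels = (np.random.rand(64, 64, 64) > 0.95).astype(np.float32)

# Bounding box: axis-aligned min/max corners in object space.
bbox = np.array([[-0.5, -0.5, -0.5],
                 [ 0.5,  0.5,  0.5]], dtype=np.float32)

# Skeleton pose: J joints, one XYZ position per joint (assumed layout).
skeleton = np.random.rand(24, 3).astype(np.float32)
```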
## Model Zoo

Generation takes about 10 GB of VRAM.
| Model | Description | Date | Size | Huggingface |
|---|---|---|---|---|
| Hunyuan3D-Omni | Image-to-shape model with multi-modal control | 2025-09-25 | 3.3B | Download |
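If you prefer to fetch the weights programmatically instead of using the download link, `huggingface_hub`'s `snapshot_download` works; the `repo_id` below is an assumption based on this model card's location and should be adjusted if it differs:

```python
from huggingface_hub import snapshot_download

# Assumed repo id; replace with the actual Hugging Face repo if it differs.
local_dir = snapshot_download(repo_id="tencent/Hunyuan3D-Omni")
print(f"Model weights downloaded to: {local_dir}")
```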
## Installation

### Requirements

We tested our model with Python 3.10.

```bash
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
```
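Since generation needs roughly 10 GB of VRAM, it is worth confirming that the CUDA build of PyTorch installed correctly and that your GPU has enough memory. A quick check:

```python
import torch

# Confirm the CUDA 12.4 wheel is active and inspect available VRAM.
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GiB")
```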
## Usage

### Inference

#### Multi-Modal Inference

```bash
python inference.py --control_type <control_type> [--use_ema] [--flashvdm]
```
The `control_type` parameter has four available options:

- `point`: use the point control type for inference.
- `voxel`: use the voxel control type for inference.
- `bbox`: use the bounding box control type for inference.
- `pose`: use the pose control type for inference.
The `--use_ema` flag enables the Exponential Moving Average (EMA) model weights for more stable inference.

The `--flashvdm` flag enables the FlashVDM optimization for faster inference.
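For context, an EMA model keeps a slowly moving average of the training weights, `ema = decay * ema + (1 - decay) * current`, which tends to sample more stably than the raw weights. The sketch below is a generic PyTorch illustration of that update, not this repository's implementation:

```python
import torch

@torch.no_grad()
def ema_update(ema_model: torch.nn.Module, model: torch.nn.Module,
               decay: float = 0.999) -> None:
    """Generic EMA step: ema = decay * ema + (1 - decay) * current."""
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)
```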
Please choose the appropriate `control_type` based on your requirements. For example, to use the `point` control type, you can run:

```bash
python inference.py --control_type point
python inference.py --control_type point --use_ema
python inference.py --control_type point --flashvdm
```
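To try every control modality in one pass, the CLI can also be driven from a small script. A sketch using only the flags documented above:

```python
import subprocess

# Run inference once per documented control type, with EMA weights enabled.
for control_type in ["point", "voxel", "bbox", "pose"]:
    cmd = ["python", "inference.py", "--control_type", control_type, "--use_ema"]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```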
## Acknowledgements

We would like to thank the contributors to the TripoSG, Trellis, DINOv2, Stable Diffusion, FLUX, diffusers, HuggingFace, CraftsMan3D, Michelangelo, Hunyuan-DiT, HunyuanVideo, HunyuanWorld-1.0, and HunyuanWorld-Voyager repositories for their open research and exploration.
## Citation

If you use this code in your research, please cite:

```bibtex
@misc{hunyuan3d2025hunyuan3d,
    title={Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material},
    author={Tencent Hunyuan3D Team},
    year={2025},
    eprint={2506.15442},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

@misc{hunyuan3d22025tencent,
    title={Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation},
    author={Tencent Hunyuan3D Team},
    year={2025},
    eprint={2501.12202},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

@misc{yang2024hunyuan3d,
    title={Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation},
    author={Tencent Hunyuan3D Team},
    year={2024},
    eprint={2411.02293},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```