Just tried tencent/HY-World-2.0 — a multimodal world model that takes in text or a single image and generates editable 3D scenes.
Unlike Google's Genie and HY-World 1.5, v2.0 generates engine-ready 3D content:
🎮 Direct import into Unreal Engine and Unity, no format wrangling
🧊 Supports multiple 3D asset formats: mesh, 3DGS, point cloud, etc.
✏️ Fully editable: not a baked video, but actual geometry you can modify
🤖 Also usable for embodied simulation environments
Basically: from "AI generates a world you can look at" → "AI generates a world you can ship."