Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12 • 201
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13 • 165
InstructX: Towards Unified Visual Editing with MLLM Guidance Paper • 2510.08485 • Published Oct 9 • 17
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models Paper • 2509.17627 • Published Sep 22 • 66
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability Paper • 2504.08003 • Published Apr 9 • 49
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published Mar 20 • 72
ROICtrl: Boosting Instance Control for Visual Generation Paper • 2411.17949 • Published Nov 27, 2024 • 87
PuLID: Pure and Lightning ID Customization via Contrastive Alignment Paper • 2404.16022 • Published Apr 24, 2024 • 25