vision
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
Paper • 2507.21809 • Published • 136
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion
Paper • 2507.06165 • Published • 58
Paper • 2508.10104 • Published • 291
Qwen-Image Technical Report
Paper • 2508.02324 • Published • 266
Visual-CoG: Stage-Aware Reinforcement Learning with Chain of Guidance for Text-to-Image Generation
Paper • 2508.18032 • Published • 42
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Paper • 2410.10629 • Published • 12
Masked Autoencoders Are Effective Tokenizers for Diffusion Models
Paper • 2502.03444 • Published
Seedream 3.0 Technical Report
Paper • 2504.11346 • Published • 70
DanceGRPO: Unleashing GRPO on Visual Generation
Paper • 2505.07818 • Published • 32
UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward
Paper • 2509.06818 • Published • 29
Instruct-Imagen: Image Generation with Multi-modal Instruction
Paper • 2401.01952 • Published • 32
Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing
Paper • 2510.08532 • Published • 5
Diffusion Transformers with Representation Autoencoders
Paper • 2510.11690 • Published • 165