Harmony: Harmonizing Audio and Video Generation through Cross-Task Synergy Paper • 2511.21579 • Published Nov 26, 2025 • 23
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation Paper • 2507.08441 • Published Jul 11, 2025 • 61