Spatia: Video Generation with Updatable Spatial Memory Paper • 2512.15716 • Published 10 days ago • 20
Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing Paper • 2512.17909 • Published 8 days ago • 36
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation Paper • 2512.16913 • Published 9 days ago • 33
StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors Paper • 2512.16915 • Published 9 days ago • 37
SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning Paper • 2512.13874 • Published 12 days ago • 16
End-to-End Training for Autoregressive Video Diffusion via Self-Resampling Paper • 2512.15702 • Published 10 days ago • 14
DEER: Draft with Diffusion, Verify with Autoregressive Models Paper • 2512.15176 • Published 11 days ago • 41
A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning Paper • 2512.14442 • Published 11 days ago • 10
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation Paper • 2512.08294 • Published 19 days ago • 17
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling Paper • 2512.04784 • Published 25 days ago • 24
4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer Paper • 2512.05060 • Published 23 days ago • 18
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation Paper • 2511.23127 • Published 29 days ago • 43
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression Paper • 2512.00891 • Published 27 days ago • 14
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published 26 days ago • 69
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens Paper • 2511.19418 • Published Nov 24 • 27
TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models Paper • 2511.13704 • Published Nov 17 • 42
PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs Paper • 2510.09507 • Published Oct 10 • 10
Go with Your Gut: Scaling Confidence for Autoregressive Image Generation Paper • 2509.26376 • Published Sep 30 • 9