LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint • Paper • 2502.16770 • Published Feb 24
Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1 • Paper • 2510.19600 • Published 4 days ago • 63
Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark • Paper • 2510.13759 • Published 11 days ago • 9
VBench: Comprehensive Benchmark Suite for Video Generative Models • Paper • 2311.17982 • Published Nov 29, 2023 • 9
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models • Paper • 2501.08453 • Published Jan 14 • 1
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models • Paper • 2506.21356 • Published Jun 26 • 22
Cut2Next: Generating Next Shot via In-Context Tuning • Paper • 2508.08244 • Published Aug 11 • 13
CineScale: Free Lunch in High-Resolution Cinematic Visual Generation • Paper • 2508.15774 • Published Aug 21 • 20
Stencil: Subject-Driven Generation with Context Guidance • Paper • 2509.17120 • Published Sep 21 • 5
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation • Paper • 2510.05094 • Published 20 days ago • 35
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources • Paper • 2509.21268 • Published Sep 25 • 100
RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation • Paper • 2509.15212 • Published Sep 18 • 21