Nested Browser-Use Learning for Agentic Information Seeking Paper • 2512.23647 • Published 4 days ago • 17
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation Paper • 2512.23705 • Published 4 days ago • 39
GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models Paper • 2512.15560 • Published 16 days ago • 24
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone Paper • 2512.22615 • Published 6 days ago • 38
TimeBill: Time-Budgeted Inference for Large Language Models Paper • 2512.21859 • Published 8 days ago • 18
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published 10 days ago • 59
Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models Paper • 2512.21337 • Published 9 days ago • 26
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 10 days ago • 48
LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published 10 days ago • 52
Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs Paper • 2512.17206 • Published 15 days ago • 19
Reinforcement Learning for Self-Improving Agent with Skill Library Paper • 2512.17102 • Published 15 days ago • 30
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion Paper • 2512.19535 • Published 11 days ago • 10
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published 11 days ago • 60
QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation Paper • 2512.19134 • Published 12 days ago • 31
WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion Paper • 2512.19678 • Published 11 days ago • 29
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published 15 days ago • 24
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published 15 days ago • 82