SIMA 2: A Generalist Embodied Agent for Virtual Worlds Paper • 2512.04797 • Published 7 days ago • 19
Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch Paper • 2512.02395 • Published 9 days ago • 46
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 9 days ago • 197
UniGame: Turning a Unified Multimodal Model Into Its Own Adversary Paper • 2511.19413 • Published 17 days ago • 20
GUI-360: A Comprehensive Dataset and Benchmark for Computer-Using Agents Paper • 2511.04307 • Published Nov 6 • 14
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27 • 96
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe Paper • 2509.18154 • Published Sep 16 • 51
Robix: A Unified Model for Robot Interaction, Reasoning and Planning Paper • 2509.01106 • Published Sep 1 • 49
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25 • 208
Beyond the Trade-off: Self-Supervised Reinforcement Learning for Reasoning Models' Instruction Following Paper • 2508.02150 • Published Aug 4 • 36