Submitted by csuhan 61 ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents · 7 authors 101 2
Submitted by JingweiZuo 44 Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance · 27 authors 48 4
Submitted by kenchan0226 29 VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning · 12 authors 1
Submitted by xiaofanghf 9 Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision · 8 authors 3 2
Submitted by HenghuiDing 6 Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation · 4 authors 1
Submitted by akhadangi 6 Efficient Differentially Private Fine-Tuning of LLMs via Reinforcement Learning · 5 authors 1 1
Submitted by eliebak 6 Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding · 199 authors 1
Submitted by tulvgengenr 5 MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE · 7 authors 62 1
Submitted by jahnsonblack - DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation · 7 authors 158 1