Submitted by xuchensong 58 Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning · 13 authors 2.93k 2
Submitted by hongyuw 47 BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs · 3 authors 2
Submitted by YunxinLi 23 VideoVista-CulturalLingo: 360° Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension · 7 authors 3 2
Submitted by HanleiZhang 18 Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark · 8 authors 15 2
Submitted by alemiaschi 17 Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation · 9 authors 2 1
Submitted by pnawrot 14 The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs · 6 authors 3
Submitted by Pclanglais 13 Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family · 9 authors 2
Submitted by carpedkm 12 Subject-driven Video Generation via Disentangled Identity and Motion · 7 authors 54 2
Submitted by amazingj 11 DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models · 7 authors 2
Submitted by zaplm 8 DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency · 7 authors 2