Submitted by MiniMax-AI 131 MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder · 20 authors 4
Submitted by ZacharyNovack 23 Fast Text-to-Audio Generation with Adversarial Post-Training · 11 authors 3.39k 2
Submitted by akhaliq 18 AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale · 8 authors 2
Submitted by akhaliq 12 Aya Vision: Advancing the Frontier of Multilingual Multimodality · 25 authors 2
Submitted by Junjie-Ye 11 A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models · 15 authors 14 2
Submitted by jinghan23 11 Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging · 8 authors 69 2
Submitted by Omartificial-Intelligence-Space 8 Advancing Arabic Reverse Dictionary Systems: A Transformer-Based Approach with Dataset Construction Guidelines · 7 authors 2
Submitted by taiwang 6 NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance · 9 authors 2
Submitted by DarshanDeshpande 6 TRAIL: Trace Reasoning and Agentic Issue Localization · 6 authors 9 2
Submitted by EdBianchi 5 SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation · 2 authors 2
Submitted by Omartificial-Intelligence-Space 4 Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency · 4 authors 2
Submitted by deleted 2 ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation · 4 authors 3 2
Submitted by onekq - Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation · 1 author 2