RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization Paper • 2502.10993 • Published Feb 16 • 1
OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment Paper • 2510.07743 • Published 10 days ago • 7
AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play Paper • 2509.24193 • Published 20 days ago • 6
OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment Paper • 2510.07743 • Published 10 days ago • 7
OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment Paper • 2510.07743 • Published 10 days ago • 7