Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model Paper • 2501.02790 • Published Jan 6 • 9
Who's Your Judge? On the Detectability of LLM-Generated Judgments Paper • 2509.25154 • Published 23 days ago • 28
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning Paper • 2509.25760 • Published 23 days ago • 52
The Personalization Trap: How User Memory Alters Emotional Reasoning in LLMs Paper • 2510.09905 • Published 12 days ago • 6
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use Paper • 2510.05592 • Published 16 days ago • 89
LightMem: Lightweight and Efficient Memory-Augmented Generation Paper • 2510.18866 • Published 1 day ago • 83
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published 3 days ago • 64