MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models Paper • 2601.11969 • Published 7 days ago • 26
PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion Paper • 2311.01767 • Published Nov 3, 2023 • 20