arxiv:2510.18455

ChronoPlay: A Framework for Modeling Dual Dynamics and Authenticity in Game RAG Benchmarks

Published on Oct 21
· Submitted by Liyang He on Oct 30

Abstract

ChronoPlay is a framework for generating dynamic RAG benchmarks in gaming, addressing the challenges of game content updates and player focus shifts with a dual-dynamic update mechanism and dual-source synthesis engine.

AI-generated summary

Retrieval Augmented Generation (RAG) systems are increasingly vital in dynamic domains like online gaming, yet the lack of a dedicated benchmark has impeded standardized evaluation in this area. The core difficulty lies in Dual Dynamics: the constant interplay between game content updates and the shifting focus of the player community. Furthermore, the necessity of automating such a benchmark introduces a critical requirement for player-centric authenticity to ensure generated questions are realistic. To address this integrated challenge, we introduce ChronoPlay, a novel framework for the automated and continuous generation of game RAG benchmarks. ChronoPlay utilizes a dual-dynamic update mechanism to track both forms of change, and a dual-source synthesis engine that draws from official sources and the player community to ensure both factual correctness and authentic query patterns. We instantiate our framework on three distinct games to create the first dynamic RAG benchmark for the gaming domain, offering new insights into model performance under these complex and realistic conditions. Code is available at: https://github.com/hly1998/ChronoPlay.
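
To make the idea of tracking two kinds of change concrete, here is a minimal Python sketch. It is not taken from the ChronoPlay code base: the function names, the item schema (`id`, `entities`), the use of Jensen-Shannon divergence as a drift signal, and the 0.1 threshold are all assumptions chosen for illustration. It flags benchmark items touched by a content update and separately measures how far the community's topic mix has shifted between two time windows.

```python
import math
from collections import Counter


def topic_distribution(topics, vocabulary):
    """Normalize topic counts over a fixed vocabulary into probabilities."""
    counts = Counter(topics)
    total = sum(counts[t] for t in vocabulary)
    if total == 0:
        # No observations in this window: fall back to a uniform distribution.
        return [1.0 / len(vocabulary)] * len(vocabulary)
    return [counts[t] / total for t in vocabulary]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint support)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def plan_update(benchmark, patched_entities, prev_topics, curr_topics,
                drift_threshold=0.1):
    """Hypothetical update planner covering both kinds of change.

    benchmark: list of dicts with "id" and "entities" keys (assumed schema).
    patched_entities: entities changed by the latest content update.
    prev_topics / curr_topics: topic labels of community questions in the
    previous and current time windows.
    """
    patched = set(patched_entities)
    # Knowledge side: items that mention a patched entity may be outdated.
    stale_ids = [item["id"] for item in benchmark
                 if patched & set(item["entities"])]

    # Interest side: compare topic mixes across the two windows.
    vocabulary = sorted(set(prev_topics) | set(curr_topics))
    if vocabulary:
        drift = js_divergence(topic_distribution(prev_topics, vocabulary),
                              topic_distribution(curr_topics, vocabulary))
    else:
        drift = 0.0

    return {
        "stale_ids": stale_ids,
        "interest_drift": drift,
        "resample_topics": drift > drift_threshold,
    }


# Toy example with made-up entities and topics.
benchmark = [
    {"id": "q1", "entities": ["sword"]},
    {"id": "q2", "entities": ["map"]},
]
plan = plan_update(benchmark, {"sword"},
                   prev_topics=["build", "build", "lore"],
                   curr_topics=["raid", "raid", "build"])
print(plan)
```

In this toy run, `q1` is flagged as stale because it mentions a patched entity, and the topic mix has moved far enough to trigger resampling. A real pipeline would need its own choice of topic labels, drift measure, and threshold.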

Community

Paper author Paper submitter

We are excited to introduce ChronoPlay, a novel framework for automatically generating dynamic RAG benchmarks in the gaming domain. Evaluating RAG in this area faces a unique challenge we term Dual Dynamics: the constant interplay between Knowledge Evolution, driven by game patches and updates, and User Interest Drift, representing the shifting focus of the player community. ChronoPlay addresses this through two core components: a dual-source synthesis engine that draws from official sources and player communities to ensure both factual correctness and authentic query patterns, and a dual-dynamic update mechanism that tracks changes in both game content and user interest. Our experiments on three distinct games demonstrate that RAG system performance is highly volatile over a game's lifecycle, and ChronoPlay successfully captures these real-world challenges driven by dual dynamics, offering a new paradigm for evaluating and building more adaptive RAG systems.

