Stick To Your Role! Leaderboard 🎭
Benchmarking LLMs on the stability of simulated populations