Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures • Paper • 2510.14616 • Published 8 days ago • 10
COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes • Paper • 2510.14763 • Published 8 days ago • 13
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny • Paper • 2507.16331 • Published Jul 22 • 19
WirelessMathLM: Teaching Mathematical Reasoning for LLMs in Wireless Communications with Reinforcement Learning • Paper • 2509.23219 • Published 27 days ago • 18
Local Success Does Not Compose: Benchmarking Large Language Models for Compositional Formal Verification • Paper • 2509.23061 • Published 28 days ago • 6
WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications • Paper • 2505.14354 • Published May 20 • 2