RefineBench

non-profit

AI & ML interests

None defined yet.

Recent Activity

seungone authored a paper about 16 hours ago

Reasoning over mathematical objects: on-policy reward modeling and test time aggregation

seungone authored a paper about 16 hours ago

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

BK-Lee authored a paper 2 months ago

Recursive Think-Answer Process for LLMs and VLMs

models 0

None public yet

datasets 1

RefineBench/RefineBench

Updated Dec 2, 2025 • 1k • 1.39k • 5