Deep Data Research Benchmark
weiliu
thinkwee
AI & ML interests
LLM reasoning, agents
Organizations
None yet
NOVEReason
General Reasoning datasets for training the NOVER model
- NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
  Paper • 2505.16022 • Published • 4
- thinkwee/NOVEReason_2k
  Viewer • Updated • 24.3k • 52 • 1
- thinkwee/NOVEReason_5k
  Viewer • Updated • 36.3k • 58 • 1
- thinkwee/NOVEReason_full
  Viewer • Updated • 1.7M • 74 • 1
DDRBench
Deep Data Research Benchmark
- Hunt Instead of Wait: Evaluating Deep Data Research on Large Language Models
  Paper • 2602.02039 • Published • 5
- DDR Bench
  Running • 🚀 3 • Deep Data Research Benchmark
- thinkwee/DDRBench_10K
  Viewer • Updated • 3.16M • 115
- thinkwee/DDRBench_10K_trajectory
  Viewer • Updated • 50.9k • 33
NOVER
NOVER-series models for general reasoning