- Open Data Synthesis For Deep Research
  Paper • 2509.00375 • Published • 68
- Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
  Paper • 2509.03403 • Published • 21
- LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
  Paper • 2509.03405 • Published • 23
- SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
  Paper • 2509.00930 • Published • 4
Akira
Filange
AI & ML interests
None yet
Recent Activity
updated a collection: HF Daily (6 days ago)
Organizations
None yet