Space for RePro: Training Language Models to Faithfully Recycle the Web for Pretraining
models 5
cx-cmu/repro-rephraser-4B
Text Generation • 196k • Updated
• 2.7k • 2
cx-cmu/repro-rephraser-1B
Text Generation • 1B • Updated
• 13
cx-cmu/AutoGEO_mini_Qwen1.7B_Ecommerce
Text Generation • 2B • Updated
• 7
cx-cmu/AutoGEO_mini_Qwen1.7B_GEOBench
Text Generation • 2B • Updated
• 13
cx-cmu/AutoGEO_mini_Qwen1.7B_ResearchyGEO
Text Generation • 2B • Updated
• 12
datasets 9
cx-cmu/deepresearchgym-agentic-search-logs
• Updated
• 14.3M • 120 • 12
cx-cmu/Researchy-GEO
• Updated
• 47k • 49 • 1
cx-cmu/GEO-Bench
• Updated
• 37.4k • 31
cx-cmu/E-commerce
• Updated
• 7.97k • 34
cx-cmu/ClueWeb-Reco
• Updated
• 87.2M • 52 • 1
cx-cmu/repro-organic-data-72B
• Updated
• 58.3M • 413
cx-cmu/repro-rl-data
• Updated
• 41k • 20
cx-cmu/repro-rephrased-data-72B
• Updated
• 39M • 552
cx-cmu/CLUE-LLM
• Updated
• 1.21k • 11