- Open Data Synthesis For Deep Research
  Paper • 2509.00375 • Published • 68
- Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
  Paper • 2509.03403 • Published • 21
- LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
  Paper • 2509.03405 • Published • 23
- SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
  Paper • 2509.00930 • Published • 4
Collections
Collections including paper arxiv:2510.02297

- M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
  Paper • 2411.04952 • Published • 30
- Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models
  Paper • 2411.05005 • Published • 13
- M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models
  Paper • 2411.04075 • Published • 17
- Self-Consistency Preference Optimization
  Paper • 2411.04109 • Published • 19

- facebook/w2v-bert-2.0
  Feature Extraction • 0.6B • Updated • 1.01M • 189
- facebook/metaclip-h14-fullcc2.5b
  Zero-Shot Image Classification • 1.0B • Updated • 16.9k • 44
- openai/clip-vit-large-patch14
  Zero-Shot Image Classification • 0.4B • Updated • 9.34M • 1.88k
- Salesforce/blip-image-captioning-large
  Image-to-Text • 0.5B • Updated • 1.14M • 1.42k

- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 235 • 96
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 36
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 97
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 88

- When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
  Paper • 2402.17193 • Published • 26
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
  Paper • 2410.23743 • Published • 63
- Direct Preference Optimization Using Sparse Feature-Level Constraints
  Paper • 2411.07618 • Published • 17
- Transformer^2: Self-adaptive LLMs
  Paper • 2501.06252 • Published • 54

- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 21
- Extending Context Window of Large Language Models via Semantic Compression
  Paper • 2312.09571 • Published • 16
- PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation
  Paper • 2312.17276 • Published • 16