Collections
Collections including paper arxiv:2501.09223
- STaR: Bootstrapping Reasoning With Reasoning
  Paper • 2203.14465 • Published • 8
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 11
- Training Large Language Models to Reason in a Continuous Latent Space
  Paper • 2412.06769 • Published • 90
- Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
  Paper • 2411.14405 • Published • 62

- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 373
- Qwen2.5-Coder Technical Report
  Paper • 2409.12186 • Published • 151
- Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
  Paper • 2409.12122 • Published • 4
- Qwen2.5-VL Technical Report
  Paper • 2502.13923 • Published • 200

- A Single Transformer for Scalable Vision-Language Modeling
  Paper • 2407.06438 • Published • 1
- Building and better understanding vision-language models: insights and future directions
  Paper • 2408.12637 • Published • 132
- A Survey of Context Engineering for Large Language Models
  Paper • 2507.13334 • Published • 234
- MemOS: A Memory OS for AI System
  Paper • 2507.03724 • Published • 146

- Language Models: A Guide for the Perplexed
  Paper • 2311.17301 • Published
- The Prompt Report: A Systematic Survey of Prompting Techniques
  Paper • 2406.06608 • Published • 66
- Reinforcement Learning: An Overview
  Paper • 2412.05265 • Published • 8
- A Primer on Large Language Models and their Limitations
  Paper • 2412.04503 • Published

- RoFormer: Enhanced Transformer with Rotary Position Embedding
  Paper • 2104.09864 • Published • 14
- Attention Is All You Need
  Paper • 1706.03762 • Published • 77
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 62
- Zero-Shot Tokenizer Transfer
  Paper • 2405.07883 • Published • 5