TheOneTrueNiz's Collections Papers
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs
via Bi-Mode Annealing and Reinforce Learning
Paper
• 2508.21113
• Published
• 110
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement
Learning for General LLM Reasoning
Paper
• 2508.16949
• Published
• 24
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for
General Robot Control
Paper
• 2508.21112
• Published
• 77
UItron: Foundational GUI Agent with Advanced Perception and Planning
Paper
• 2508.21767
• Published
• 12
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn
Tool-Integrated Reasoning
Paper
• 2509.02479
• Published
• 84
K2-Think: A Parameter-Efficient Reasoning System
Paper
• 2509.07604
• Published
• 14
THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical
Reasoning
Paper
• 2509.13761
• Published
• 16
FlowRL: Matching Reward Distributions for LLM Reasoning
Paper
• 2509.15207
• Published
• 116
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic
Data and Scalable Reinforcement Learning
Paper
• 2509.13305
• Published
• 91
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid
Vision Tokenizer
Paper
• 2509.16197
• Published
• 58
ReSum: Unlocking Long-Horizon Search Intelligence via Context
Summarization
Paper
• 2509.13313
• Published
• 80
Understanding the Thinking Process of Reasoning Models: A Perspective
from Schoenfeld's Episode Theory
Paper
• 2509.14662
• Published
• 13
Meta-R1: Empowering Large Reasoning Models with Metacognition
Paper
• 2508.17291
• Published
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
Paper
• 2509.25760
• Published
• 55
Diffusion Transformers with Representation Autoencoders
Paper
• 2510.11690
• Published
• 166