papers - a estudanteBr Collection

estudanteBr's Collections

papers

updated 8 days ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published 14 days ago • 406
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

Paper • 2510.07499 • Published 11 days ago • 44
Improving Context Fidelity via Native Retrieval-Augmented Reasoning

Paper • 2509.13683 • Published Sep 17 • 8
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering

Paper • 2509.00798 • Published Aug 31
Retrieval Feedback Memory Enhancement Large Model Retrieval Generation Method

Paper • 2508.17862 • Published Aug 25
Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction

Paper • 2509.03540 • Published Aug 31
Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation

Paper • 2508.09755 • Published Aug 13
MIRAGE: Scaling Test-Time Inference with Parallel Graph-Retrieval-Augmented Reasoning Chains

Paper • 2508.18260 • Published Aug 25
From Ranking to Selection: A Simple but Efficient Dynamic Passage Selector for Retrieval Augmented Generation

Paper • 2508.09497 • Published Aug 13
MemMamba: Rethinking Memory Patterns in State Space Model

Paper • 2510.03279 • Published 22 days ago • 68
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6, 2024 • 63
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning

Paper • 2502.15425 • Published Feb 21 • 9
Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 84
Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26 • 166
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

Paper • 2406.06469 • Published Jun 10, 2024 • 29
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots

Paper • 2409.10277 • Published Sep 16, 2024 • 1
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

Paper • 2504.17432 • Published Apr 24 • 39
ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26 • 45
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR

Paper • 2509.23808 • Published 22 days ago • 47
Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models

Paper • 2510.03561 • Published 16 days ago • 23
JULI: Jailbreak Large Language Models by Self-Introspection

Paper • 2505.11790 • Published May 17
Reward Reasoning Model

Paper • 2505.14674 • Published May 20 • 37
CodeContests+: High-Quality Test Case Generation for Competitive Programming

Paper • 2506.05817 • Published Jun 6 • 9
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 236
Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot

Paper • 2506.14641 • Published Jun 17
The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs

Paper • 2507.07562 • Published Jul 10 • 1
Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation

Paper • 2506.17088 • Published Jun 20
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 141
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 135
Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21 • 88
TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23 • 81
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 185
MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4 • 153
A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17 • 257
Fast-dLLM v2: Efficient Block-Diffusion LLM

Paper • 2509.26328 • Published 20 days ago • 47
CoDA: Coding LM via Diffusion Adaptation

Paper • 2510.03270 • Published 23 days ago • 41
Drax: Speech Recognition with Discrete Flow Matching

Paper • 2510.04162 • Published 15 days ago • 25
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning

Paper • 2510.04081 • Published 15 days ago • 19