Motoki Wu

tokestermw

https://motoki.co

AI & ML interests

None yet

Recent Activity

liked a model about 15 hours ago

MiniMaxAI/MiniMax-M2.1

liked a model about 15 hours ago

zai-org/GLM-4.7

Organizations

upvoted a paper 15 days ago

Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published 22 days ago • 32

upvoted an article 24 days ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

25 days ago • 104

upvoted an article 29 days ago

Article

Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance

about 1 month ago

upvoted an article 30 days ago

Article

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

Dec 8, 2025

upvoted an article about 2 months ago

Article

Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks

Nov 21, 2025

upvoted a collection 2 months ago

PromptMII

Collection

Prompt-MII: Meta-Learning Instruction Induction for LLMs. Link to paper: https://arxiv.org/abs/2510.16932 • 4 items • Updated Oct 21, 2025 • 2

upvoted a paper 3 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

upvoted an article 3 months ago

Article

mem-agent: Equipping LLM Agents with Memory Using RL

Oct 9, 2025

upvoted a paper 3 months ago

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6, 2025 • 128

upvoted a collection 4 months ago

Qwen3-Omni

Collection

6 items • Updated 9 days ago • 179

upvoted 3 papers 4 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4, 2025 • 195

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 228

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28, 2025 • 110

upvoted 2 papers 5 months ago

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published Aug 23, 2025 • 23

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22, 2025 • 160

upvoted a collection 5 months ago

NVIDIA Nemotron V2

Collection

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 9 items • Updated 17 days ago • 100

upvoted 4 papers 5 months ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

Paper • 2508.10751 • Published Aug 14, 2025 • 28

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 180

Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments

Paper • 2508.08791 • Published Aug 12, 2025 • 16
